Healthcare Claims Fraud Detection
Use the ScamVerify™ API to detect fraudulent healthcare claims through document analysis, provider phone verification, and multi-channel cross-referencing.
Healthcare fraud costs the U.S. healthcare system an estimated $100 billion annually. In fiscal year 2025, the Department of Justice recovered $5.7 billion through False Claims Act cases, a record, with 83% originating from healthcare. A single genetic testing fraud scheme used fake enrollment documents to bill $2.7 million. AI-generated invoices and provider letters are becoming increasingly difficult for human reviewers to catch. The ScamVerify™ API provides multi-channel verification that detects fraudulent claims at intake, before payment is issued.
Fraud Patterns in Healthcare Claims
Document-Based Fraud
Fraudulent claims often include fabricated supporting documents:
- Fake provider invoices with inflated charges, non-existent procedure codes, or AI-generated letterheads
- Forged enrollment forms with fabricated patient signatures and provider credentials
- Counterfeit provider letters claiming medical necessity for expensive procedures
- Fictitious facility documents listing addresses that are commercial mailboxes or vacant lots
Provider Infrastructure Fraud
Fraudulent providers set up disposable infrastructure that shares common characteristics:
- VoIP phone numbers on high-risk carriers, making it cheap to create and abandon provider phone lines
- Newly registered domains for provider websites, often created days before claims are submitted
- CMRA addresses (commercial mail receiving agencies like UPS Stores) listed as the provider facility address
- Mismatched provider information where the phone carrier, domain registrar, and address do not align with a legitimate medical practice
The Multi-Channel Signal
No single signal is conclusive on its own. A VoIP phone number could belong to a legitimate telehealth provider. A new domain could be a recently opened practice. But when a claim includes a provider with a VoIP number on a high-risk carrier, a website registered 2 days ago, and a facility address at a commercial mailbox, the combined signal is strong.
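The reasoning above can be sketched as a simple corroboration count. This is an illustrative sketch only, not part of the ScamVerify™ API: the `ProviderSignals` class, its field names, and the strength labels are hypothetical, chosen to show why one signal stays weak while three together are strong.

```python
from dataclasses import dataclass


@dataclass
class ProviderSignals:
    """Hypothetical container for the three infrastructure signals."""
    high_risk_voip: bool = False
    new_domain: bool = False
    cmra_address: bool = False


def corroboration_strength(signals: ProviderSignals) -> str:
    """Label the combined signal by how many independent signals agree."""
    count = sum([signals.high_risk_voip, signals.new_domain, signals.cmra_address])
    # Assumed labels: a single signal is never treated as conclusive.
    return {0: "none", 1: "weak", 2: "moderate", 3: "strong"}[count]


# A telehealth provider on VoIP only: one inconclusive signal.
print(corroboration_strength(ProviderSignals(high_risk_voip=True)))
# VoIP line, new domain, and mailbox address together.
print(corroboration_strength(ProviderSignals(True, True, True)))
```

A production rule would likely weight each signal rather than count them equally; the count is used here only to keep the idea visible.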
Claims Intake Processor
The highest-value integration point is at claims intake, before an adjuster spends time reviewing the claim and before any payment is authorized.
import os
from dataclasses import dataclass, field
from typing import Optional

import requests

SCAMVERIFY_API_KEY = os.environ["SCAMVERIFY_API_KEY"]
BASE_URL = "https://scamverify.ai/api/v1"
REQUEST_TIMEOUT = 30  # seconds, so a stalled API call cannot hang intake


@dataclass
class ClaimVerification:
    claim_id: str
    risk_level: str = "low"
    risk_score: int = 0
    flags: list = field(default_factory=list)
    document_result: dict = field(default_factory=dict)
    phone_result: dict = field(default_factory=dict)
    url_result: dict = field(default_factory=dict)
    recommended_action: str = "approve"


def failure_flag(channel: str, error: requests.RequestException) -> str:
    """Build a *_CHECK_FAILED flag; network errors carry no HTTP status."""
    status = error.response.status_code if error.response is not None else "NO_RESPONSE"
    return f"{channel}_CHECK_FAILED:{status}"


def verify_claim(claim_id: str, document_path: Optional[str] = None,
                 provider_phone: Optional[str] = None,
                 provider_url: Optional[str] = None) -> ClaimVerification:
    """
    Run multi-channel verification on a healthcare claim.
    Checks submitted documents, provider phone numbers, and provider websites.
    """
    verification = ClaimVerification(claim_id=claim_id)
    scores = []

    # Channel 1: Document analysis (invoices, enrollment forms, provider letters)
    if document_path:
        try:
            with open(document_path, "rb") as f:
                doc_response = requests.post(
                    f"{BASE_URL}/document/analyze",
                    headers={"Authorization": f"Bearer {SCAMVERIFY_API_KEY}"},
                    files={"file": (os.path.basename(document_path), f)},
                    timeout=REQUEST_TIMEOUT,
                )
            doc_response.raise_for_status()
            doc_result = doc_response.json()
            verification.document_result = {
                "document_type": doc_result["document_type"],
                "risk_score": doc_result["risk_score"],
                "verdict": doc_result["verdict"],
                "red_flags": doc_result["red_flags"],
                "entity_verifications": doc_result["entity_verifications"],
            }
            scores.append(doc_result["risk_score"])

            # Check for CMRA (commercial mailbox) address
            for addr in doc_result.get("entity_verifications", {}).get("addresses", []):
                if addr.get("is_cmra"):
                    verification.flags.append("CMRA_PROVIDER_ADDRESS")
                if not addr.get("institution_matches"):
                    verification.flags.append("INSTITUTION_MISMATCH")

            # Check for unverified officials
            for official in doc_result.get("entity_verifications", {}).get("officials", []):
                if official.get("match_confidence") == "none":
                    verification.flags.append(f"UNVERIFIED_OFFICIAL:{official['name']}")
        except requests.RequestException as e:
            verification.flags.append(failure_flag("DOCUMENT", e))

    # Channel 2: Provider phone verification
    if provider_phone:
        try:
            phone_response = requests.post(
                f"{BASE_URL}/phone/lookup",
                headers={
                    "Authorization": f"Bearer {SCAMVERIFY_API_KEY}",
                    "Content-Type": "application/json",
                },
                json={"phone_number": provider_phone},
                timeout=REQUEST_TIMEOUT,
            )
            phone_response.raise_for_status()
            phone_result = phone_response.json()
            signals = phone_result["signals"]
            verification.phone_result = {
                "risk_score": phone_result["risk_score"],
                "verdict": phone_result["verdict"],
                "line_type": signals["line_type"],
                "carrier": signals["carrier"],
                "high_risk_carrier": signals["high_risk_carrier"],
                "ftc_complaints": signals["ftc_complaints"],
            }
            scores.append(phone_result["risk_score"])

            if signals["high_risk_carrier"]:
                verification.flags.append("HIGH_RISK_VOIP_CARRIER")
            if signals["ftc_complaints"] > 0:
                verification.flags.append("FTC_COMPLAINTS_ON_PROVIDER_PHONE")
            if signals["robocall_flagged"]:
                verification.flags.append("ROBOCALL_ASSOCIATION")
        except requests.RequestException as e:
            verification.flags.append(failure_flag("PHONE", e))

    # Channel 3: Provider website verification
    if provider_url:
        try:
            url_response = requests.post(
                f"{BASE_URL}/url/lookup",
                headers={
                    "Authorization": f"Bearer {SCAMVERIFY_API_KEY}",
                    "Content-Type": "application/json",
                },
                json={"url": provider_url},
                timeout=REQUEST_TIMEOUT,
            )
            url_response.raise_for_status()
            url_result = url_response.json()
            # brand_impersonation may be missing or null, so fall back to {}
            brand = url_result["signals"].get("brand_impersonation") or {}
            verification.url_result = {
                "risk_score": url_result["risk_score"],
                "verdict": url_result["verdict"],
                "domain_age_days": url_result["signals"]["domain_age_days"],
                "brand_impersonation": url_result["signals"].get("brand_impersonation"),
            }
            scores.append(url_result["risk_score"])

            domain_age = url_result["signals"]["domain_age_days"]
            if domain_age is not None and domain_age < 90:
                verification.flags.append(f"NEW_PROVIDER_DOMAIN:{domain_age}_DAYS")
            if brand.get("detected"):
                verification.flags.append("BRAND_IMPERSONATION_ON_PROVIDER_SITE")
        except requests.RequestException as e:
            verification.flags.append(failure_flag("URL", e))

    # Calculate combined risk: the highest channel score drives the result.
    # A failed check also adds a flag, so the claim is reviewed, not auto-approved.
    if scores:
        verification.risk_score = max(scores)

    if verification.risk_score >= 70 or len(verification.flags) >= 3:
        verification.risk_level = "high"
        verification.recommended_action = "deny_pending_investigation"
    elif verification.risk_score >= 40 or len(verification.flags) >= 1:
        verification.risk_level = "medium"
        verification.recommended_action = "escalate_to_siu"
    else:
        verification.risk_level = "low"
        verification.recommended_action = "approve"

    return verification


# Usage: process a claim with all available data
result = verify_claim(
    claim_id="CLM-2026-00847",
    document_path="/claims/intake/invoice-847.pdf",
    provider_phone="+12125559876",
    provider_url="https://advancedgenomics-lab.com",
)
print(f"Claim {result.claim_id}: {result.risk_level.upper()}")
print(f"Risk score: {result.risk_score}")
print(f"Action: {result.recommended_action}")
for flag in result.flags:
    print(f"  Flag: {flag}")

Multi-Channel Cross-Referencing
The strongest fraud detection comes from following the chain of entities across channels. A claim document contains a provider phone number and website. Each of those can be independently verified, and the results reinforce each other.
Claim document uploaded
        |
        v
Document analysis extracts:
  - Provider: "Advanced Genomics Laboratory"
  - Address:  "8400 Medical Pkwy, Suite 2200, Dallas, TX"
  - Phone:    "(214) 555-0341"
  - Website:  "advancedgenomics-lab.com"
  - NPI:      1234567890
        |
        v
Cross-channel verification:
  ├── Document: Address is CMRA (mailbox center)              ← RED FLAG
  ├── Document: No lab found at this address                  ← RED FLAG
  ├── Phone:    VoIP on Bandwidth.com (high-risk carrier)     ← RED FLAG
  ├── Phone:    0 FTC complaints (clean)
  ├── URL:      Domain registered 8 days ago                  ← RED FLAG
  ├── URL:      No brand impersonation (clean)
  └── Combined: 4 flags across 3 channels
        |
        v
Result:    risk_level "high"
Action:    "deny_pending_investigation"
Rationale: Provider facility is a commercial mailbox, phone
           is a disposable VoIP line, and website was created last week.
           Pattern is consistent with a shell provider operation.

Risk Scoring for Healthcare Claims
| Signal | Weight | Description |
|---|---|---|
| CMRA provider address | High | Provider's facility address is a commercial mailbox (UPS Store, Mailboxes Etc.). Legitimate medical practices do not operate from mailbox stores. |
| High-risk VoIP carrier | High | Provider phone number is on a carrier disproportionately used in fraud operations |
| Provider domain under 90 days old | High | Medical practices typically have established web presences. A domain created in the last 3 months is unusual. |
| FTC complaints on provider phone | Critical | The provider's listed phone number has been reported to the FTC for scam activity |
| Unverified official on documents | Medium | Named physician, administrator, or director not found in available databases |
| Mismatched institution | High | Claimed facility does not exist at the stated address per Google Places verification |
| Spelling errors in official documents | Medium | Misspelled medical terms, procedure names, or facility names on invoices and letters |
| Brand impersonation on provider site | Critical | Provider website mimics a known healthcare brand or insurance portal |
| Multiple channels flagged | Critical | When all three independent channels (document, phone, URL) each produce flags, the probability of fraud increases significantly |
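The table can be turned into a small lookup that reports the most severe weight among a claim's flags and which channels produced them. This is a sketch under stated assumptions: the flag names are the ones emitted by `verify_claim` above, while `FLAG_TABLE`, `SEVERITY_ORDER`, `summarize_flags`, and the numeric ordering of the weights are hypothetical and not defined by the ScamVerify™ API.

```python
from typing import Dict, List, Optional, Set, Tuple

# Flag name -> (channel, weight), with weights taken from the table above.
FLAG_TABLE: Dict[str, Tuple[str, str]] = {
    "CMRA_PROVIDER_ADDRESS": ("document", "high"),
    "INSTITUTION_MISMATCH": ("document", "high"),
    "UNVERIFIED_OFFICIAL": ("document", "medium"),
    "HIGH_RISK_VOIP_CARRIER": ("phone", "high"),
    "FTC_COMPLAINTS_ON_PROVIDER_PHONE": ("phone", "critical"),
    "NEW_PROVIDER_DOMAIN": ("url", "high"),
    "BRAND_IMPERSONATION_ON_PROVIDER_SITE": ("url", "critical"),
}

# Assumption: a simple ordering so that weights can be compared.
SEVERITY_ORDER = {"medium": 1, "high": 2, "critical": 3}


def summarize_flags(flags: List[str]) -> Tuple[Optional[str], Set[str]]:
    """Return (most severe weight, set of channels that produced a flag)."""
    best: Optional[str] = None
    channels: Set[str] = set()
    for flag in flags:
        # Flags such as "NEW_PROVIDER_DOMAIN:8_DAYS" carry a detail suffix.
        name = flag.split(":", 1)[0]
        if name not in FLAG_TABLE:
            continue  # e.g. *_CHECK_FAILED flags are not fraud signals
        channel, weight = FLAG_TABLE[name]
        channels.add(channel)
        if best is None or SEVERITY_ORDER[weight] > SEVERITY_ORDER[best]:
            best = weight
    # "Multiple channels flagged" row: all three channels is treated as critical.
    if len(channels) >= 3:
        best = "critical"
    return best, channels


flags = ["CMRA_PROVIDER_ADDRESS", "HIGH_RISK_VOIP_CARRIER", "NEW_PROVIDER_DOMAIN:8_DAYS"]
weight, channels = summarize_flags(flags)
print(weight, sorted(channels))
```

Each flag in the example is only "high" on its own; the result is raised to "critical" because all three channels contributed a flag.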
Healthcare claims data is protected under HIPAA. Document images submitted for analysis may contain Protected Health Information (PHI) including patient names, dates of birth, diagnosis codes, and insurance identifiers. Organizations using the ScamVerify™ API for claims verification must ensure their implementation complies with HIPAA requirements for data transmission, storage, and access controls. ScamVerify™ does not store original document images after analysis. A Business Associate Agreement (BAA) is available for Enterprise plan customers.
Best Practices for Healthcare Claims Fraud Detection
- Integrate at claims intake, not post-payment. Every dollar paid on a fraudulent claim is a dollar that must be recovered through litigation. Catching fraud before payment is orders of magnitude more cost-effective than pay-and-chase recovery.
- Automate triage with risk thresholds. Set automated routing rules: claims with a combined risk score below 30 proceed normally, claims scoring 30 to 70 are flagged for enhanced review, and claims above 70 are held for Special Investigations Unit (SIU) review.
- Verify the full provider chain. Do not check the phone number in isolation. When a claim includes a provider phone, website, and facility address, verify all three. The correlation across channels is where the strongest fraud signals emerge.
- Maintain an audit trail. Store every ScamVerify™ API response alongside the claim record. The structured verification results (risk scores, verdicts, entity verifications, flags) provide documented evidence for claim denial decisions and regulatory compliance.
- Track patterns across claims. When multiple claims share the same provider phone number, website domain, or facility address, and any of those entities are flagged, investigate the entire cluster. Fraud rings often submit claims through multiple patient identities but reuse provider infrastructure.
- Re-verify on appeal. When a denied claim is appealed, run the verification again. Fraudsters sometimes update their infrastructure (register a new domain, switch phone numbers) between the initial submission and the appeal.
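Two of the practices above, threshold-based triage and cross-claim pattern tracking, can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: `route_claim`, `shared_infrastructure`, the route labels, and the claim dictionary keys (`provider_phone`, `provider_domain`, `provider_address`) are hypothetical. The cutoffs follow the triage bullet (below 30, 30 to 70, above 70) rather than the `verify_claim` example, which uses 40 and 70.

```python
from collections import defaultdict
from typing import Dict, List


def route_claim(risk_score: int) -> str:
    """Route a claim using the thresholds from the triage practice above."""
    if risk_score < 30:
        return "proceed"
    if risk_score <= 70:
        return "enhanced_review"
    return "siu_hold"


def shared_infrastructure(claims: List[dict]) -> Dict[str, List[str]]:
    """Map each provider phone, domain, or address used by 2+ claims to their IDs."""
    index: Dict[str, List[str]] = defaultdict(list)
    for claim in claims:
        for key in ("provider_phone", "provider_domain", "provider_address"):
            value = claim.get(key)
            if value:
                index[f"{key}:{value}"].append(claim["claim_id"])
    # Keep only entities that appear on more than one claim.
    return {entity: ids for entity, ids in index.items() if len(ids) > 1}


# Made-up claims for illustration only.
claims = [
    {"claim_id": "CLM-1", "provider_phone": "+12145550341", "provider_domain": "example-lab.test"},
    {"claim_id": "CLM-2", "provider_phone": "+12145550341", "provider_domain": "other-lab.test"},
    {"claim_id": "CLM-3", "provider_phone": "+12145550999"},
]
print(route_claim(25), route_claim(70), route_claim(71))
print(shared_infrastructure(claims))
```

In practice the clustering would run over stored verification records, so that a flag on one shared phone number or domain can trigger review of every claim in the cluster.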
Related
- Document Analysis API Reference for document scanning request and response schemas
- Phone Lookup API Reference for provider phone verification
- URL Verification API Reference for provider website analysis
- Insurance Claims Fraud Detection for general insurance fraud patterns including email and batch analysis