Data Sources

This page lists every threat intelligence and reputation data source used by ScamVerify™, including what each provides, how often it updates, and what it covers.

ScamVerify™ aggregates data from 12+ independent sources to produce risk assessments. No single source is trusted in isolation. The scoring engine cross-references multiple signals to reduce false positives and catch threats that any individual source might miss.

FTC Do Not Call Complaints

What it provides: Consumer complaint records filed with the Federal Trade Commission against phone numbers reported for calling in violation of the National Do Not Call Registry. Each record includes the complaint date, the subject category (robocall, live caller, etc.), and a flag indicating whether the consumer reported a robocall.

Coverage: 2.79 million+ complaint records covering U.S. phone numbers.

Update frequency: Hourly sync via automated pipeline. New complaints are typically available within 1 to 2 hours of FTC publication.

Used by: Phone channel. Complaint count, recency, and robocall percentage are primary inputs to the phone scoring engine.
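The three inputs named above (complaint count, recency, and robocall percentage) can be derived from raw complaint records roughly as follows. This is a minimal sketch: the `Complaint` record shape and the `complaint_features` helper are hypothetical names for illustration, not part of the ScamVerify™ API.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Complaint:
    # Hypothetical record shape: one FTC complaint against a number.
    filed_on: date
    is_robocall: bool


def complaint_features(complaints: List[Complaint], today: date) -> dict:
    """Summarize complaints into count, recency, and robocall share."""
    if not complaints:
        return {"count": 0, "days_since_latest": None, "robocall_pct": 0.0}
    latest = max(c.filed_on for c in complaints)
    robocalls = sum(1 for c in complaints if c.is_robocall)
    return {
        "count": len(complaints),
        "days_since_latest": (today - latest).days,
        "robocall_pct": 100.0 * robocalls / len(complaints),
    }
```

How these features are weighted inside the scoring engine is not documented here, so the sketch stops at feature extraction.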

FCC Consumer Complaints

What it provides: Telecom-specific consumer complaints filed with the Federal Communications Commission. These cover a broader range of issues than FTC data, including unwanted calls, cramming, slamming, and accessibility violations.

Coverage: U.S. telecom complaints. Overlaps with FTC data but provides independent corroboration and catches complaints not filed with the FTC.

Update frequency: Synced via automated pipeline on a regular schedule.

Used by: Phone channel. FCC complaint counts contribute to the base score independently of FTC data.

Twilio Lookup

What it provides: Real-time telecom infrastructure data for phone numbers, including:

  • Carrier name (e.g., T-Mobile, Bandwidth.com)
  • Line type (mobile, landline, VoIP, non-fixed VoIP, toll-free)
  • CNAM (Caller Name, the registered caller ID)
  • Number validity (whether the number exists in the telecom network)

Coverage: U.S. phone numbers. International coverage varies by region.

Update frequency: Real-time lookup on each request (cached after first lookup).

Used by: Phone channel. Line type and carrier information are critical scoring signals. VoIP numbers, especially non-fixed VoIP, carry higher baseline risk. Invalid numbers are automatically scored at 100.

Robocall Detection Database

What it provides: Real-time identification of phone numbers associated with robocall activity. Numbers are flagged based on call pattern analysis across carrier networks.

Coverage: U.S. phone numbers with active robocall patterns.

Update frequency: Real-time on each lookup.

Used by: Phone channel. A robocall flag enforces a score floor of 65, so flagged numbers are never rated below high_risk.
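A score floor raises a low score to the minimum but never lowers a score that is already higher. The sketch below illustrates that behavior with the floor of 65 stated above; the function name and the integer score type are assumptions.

```python
ROBOCALL_FLOOR = 65  # floor value stated in the text above


def apply_robocall_floor(base_score: int, robocall_flagged: bool) -> int:
    """Raise the score to the floor when flagged; never lower it."""
    if robocall_flagged:
        return max(base_score, ROBOCALL_FLOOR)
    return base_score
```

For example, a flagged number with a base score of 20 ends up at 65, while a flagged number already at 80 stays at 80.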

IPQS (IPQualityScore)

What it provides: Reputation scores for both phone numbers and URLs. The phone reputation score evaluates fraud risk based on patterns across financial services, e-commerce, and telecom. The URL reputation score evaluates domain risk based on hosting infrastructure, traffic patterns, and known associations with malicious activity.

Coverage: Global. Covers phone numbers and URLs/domains.

Update frequency: Real-time on each lookup.

Used by: Phone channel (phone reputation) and URL channel (URL/domain reputation). IPQS scores feed into the rules engine as one of many signals. The IPQS score is included in the signals object as ipqs_risk_score for URL lookups.

URLhaus (abuse.ch)

What it provides: A database of URLs and domains associated with malware distribution. Maintained by abuse.ch, a Swiss security research project. Entries are contributed by security researchers worldwide and verified before inclusion.

Coverage: 2,374+ malware-distributing domains. Global coverage with a focus on actively distributing threats.

Update frequency: Automated sync via scheduled pipeline. New entries are typically available within hours of publication.

Used by: URL channel. A URLhaus listing is a high-confidence signal that the domain is distributing malware. Indicated in the signals object as urlhaus_listed: true.

ThreatFox (abuse.ch)

What it provides: An Indicator of Compromise (IOC) database covering domains, IPs, and URLs associated with malware command-and-control servers, phishing infrastructure, and botnet activity. Broader than URLhaus, covering threat infrastructure beyond just malware distribution URLs.

Coverage: 54,377+ domains. Global coverage across multiple threat categories.

Update frequency: Automated sync via scheduled pipeline.

Used by: URL channel. A ThreatFox listing indicates the domain is associated with malicious infrastructure. Indicated in the signals object as threatfox_listed: true.

Google Web Risk

What it provides: Google's classification of URLs into threat categories: malware, social engineering (which includes phishing), and unwanted software. This is the same technology that powers Safe Browsing warnings in Chrome.

Coverage: Global. Covers billions of URLs based on Google's web crawling infrastructure.

Update frequency: Real-time on each lookup.

Used by: URL channel. A Google Web Risk flag is a high-confidence signal. The specific threat type (e.g., SOCIAL_ENGINEERING, MALWARE) is included in the signals object as google_web_risk.
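A client can read the blocklist-style fields of the signals object (urlhaus_listed, threatfox_listed, google_web_risk) to see which sources flagged a URL. The sketch below assumes the signals object arrives as a plain dictionary and that google_web_risk holds a single threat-type string when set; neither detail is confirmed by this page, and `blocklist_hits` is a hypothetical helper.

```python
def blocklist_hits(signals: dict) -> list:
    """List which blocklist-style URL sources flagged the input."""
    hits = []
    if signals.get("urlhaus_listed") is True:
        hits.append("urlhaus")
    if signals.get("threatfox_listed") is True:
        hits.append("threatfox")
    # Assumption: a missing or empty value means no Web Risk match.
    threat = signals.get("google_web_risk")
    if threat:
        hits.append(f"google_web_risk:{threat}")
    return hits


example_signals = {
    "urlhaus_listed": True,
    "threatfox_listed": False,
    "google_web_risk": "SOCIAL_ENGINEERING",
    "ipqs_risk_score": 88,
}
print(blocklist_hits(example_signals))
```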

WHOIS / RDAP

What it provides: Domain registration data, including:

  • Registration date (used to calculate domain age)
  • Expiration date
  • Registrar name (e.g., GoDaddy, Namecheap)
  • Registrant information (when not redacted by privacy services)

Coverage: All registered domains with public WHOIS/RDAP records.

Update frequency: Real-time on each lookup (cached after first lookup).

Used by: URL channel and email channel (sender domain analysis). Domain age is a significant scoring factor. Domains registered in the last 30 days are inherently riskier. The signals object includes domain_age_days and registrar.
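Domain age is the difference between the lookup time and the registration date. The sketch below shows one way to compute a value like domain_age_days from an RFC 3339 timestamp (the format RDAP uses for event dates) and to apply the 30-day threshold mentioned above. The function names and the inclusive boundary are assumptions, and the timestamp is assumed to carry a timezone.

```python
from datetime import datetime, timezone

NEW_DOMAIN_DAYS = 30  # threshold from the text above


def domain_age_days(registered_at: str, now: datetime) -> int:
    """Whole days between a registration timestamp and `now`."""
    # Normalize a trailing "Z" so datetime.fromisoformat accepts it
    # on older Python versions as well.
    registered = datetime.fromisoformat(registered_at.replace("Z", "+00:00"))
    return (now - registered).days


def is_newly_registered(age_days: int) -> bool:
    # Assumption: "in the last 30 days" includes day 30 itself.
    return age_days <= NEW_DOMAIN_DAYS
```

In live use, `now` would be `datetime.now(timezone.utc)`; passing it in explicitly keeps the calculation testable.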

SSL Certificate Analysis

What it provides: SSL/TLS certificate details for the target domain, including:

  • Certificate issuer (e.g., Let's Encrypt, DigiCert)
  • Validation type (DV, OV, EV)
  • Certificate age (when it was issued)
  • Expiration status

Coverage: Any domain serving HTTPS traffic.

Update frequency: Real-time on each lookup.

Used by: URL channel. Recently issued certificates on new domains are a risk signal. Missing or expired certificates contribute to the score. The signals object includes ssl_issuer and ssl_age_days.
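Values like ssl_issuer and ssl_age_days can be derived from a certificate's issuer and validity fields. The sketch below works on the dictionary format returned by Python's `SSLSocket.getpeercert()` and uses the standard library's `ssl.cert_time_to_seconds`. It only illustrates the calculation, not how ScamVerify™ collects certificates, and the helper names are hypothetical.

```python
import ssl


def cert_age_days(cert: dict, now_epoch: float) -> int:
    """Days since the certificate's notBefore (issuance) time."""
    issued_epoch = ssl.cert_time_to_seconds(cert["notBefore"])
    return int((now_epoch - issued_epoch) // 86400)


def is_expired(cert: dict, now_epoch: float) -> bool:
    """True when `now_epoch` is past the certificate's notAfter time."""
    return now_epoch > ssl.cert_time_to_seconds(cert["notAfter"])


def issuer_org(cert: dict):
    """Return the issuer's organizationName, or None if absent."""
    # getpeercert() represents the issuer as a tuple of RDNs, each a
    # tuple of (key, value) pairs.
    for rdn in cert.get("issuer", ()):
        for key, value in rdn:
            if key == "organizationName":
                return value
    return None
```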

Brand Impersonation Detection

What it provides: Detection of domains that mimic well-known brands. The system checks for typosquatting (e.g., arnazon.com), lookalike domains, and keyword-based impersonation targeting banks, tech companies, government agencies, and other commonly impersonated entities.

Coverage: Proprietary brand database covering major financial institutions, tech companies, government agencies, and popular consumer brands.

Update frequency: Brand database is maintained and updated regularly.

Used by: URL channel and email channel (sender domain analysis). Detected impersonation is included in the signals object with the matched brand name and confidence level.
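One common way to catch typosquats such as arnazon.com is to replace visually confusable character sequences and then compare the result against known brand names. The sketch below illustrates that general technique only. The brand list, the substitution table, and the 0.85 threshold are all hypothetical, and the proprietary detection logic may work quite differently.

```python
from difflib import SequenceMatcher

# Hypothetical, tiny brand list; the real database is proprietary.
BRANDS = ["amazon", "paypal", "google"]

# A few visually confusable substitutions (illustrative, not exhaustive).
CONFUSABLES = [("rn", "m"), ("vv", "w"), ("0", "o"), ("1", "l")]


def normalize(label: str) -> str:
    """Map lookalike character sequences to the characters they imitate."""
    label = label.lower()
    for fake, real in CONFUSABLES:
        label = label.replace(fake, real)
    return label


def match_brand(domain: str, threshold: float = 0.85):
    """Return (brand, similarity) for a lookalike domain, else None.

    Assumes `domain` is a registrable domain without subdomains.
    """
    label = domain.lower().split(".")[0]
    if label in BRANDS:
        return None  # exact brand label; not a lookalike by this heuristic
    candidate = normalize(label)
    best = max(BRANDS, key=lambda b: SequenceMatcher(None, candidate, b).ratio())
    score = SequenceMatcher(None, candidate, best).ratio()
    return (best, round(score, 2)) if score >= threshold else None
```

A naive substitution table like this one also produces false positives (for example, any legitimate name containing "rn"), which is one reason a real system would pair it with a confidence level and other signals.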

Community Reports

What it provides: User-submitted reports indicating whether a phone number or URL is a scam or legitimate. Reports include a classification (scam, legitimate, robocall, telemarketer, debt collector, wrong number) and optional comments.

Coverage: Grows over time as users contribute. Coverage is strongest for numbers and URLs that receive high search volume.

Update frequency: Real-time. New reports immediately influence scoring.

Used by: Phone and URL channels. Community consensus can shift verdicts in either direction. After 3+ consistent reports from different users, community data begins to significantly influence the final score. After 10+ consistent reports, community consensus can override AI analysis.
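The two thresholds above (3+ and 10+ consistent reports from different users) can be sketched as a small classifier. The text does not define "consistent", so this example assumes it means every distinct user chose the same classification. The function name and the returned labels are hypothetical.

```python
from collections import Counter


def community_influence(reports) -> str:
    """Classify how much weight community reports carry.

    `reports` is a list of (user_id, classification) pairs.
    """
    # Keep one vote per user so repeat submissions do not inflate consensus.
    latest_by_user = dict(reports)
    votes = Counter(latest_by_user.values())
    if len(votes) != 1:
        return "none"  # reports disagree, or there are no reports
    count = next(iter(votes.values()))
    if count >= 10:
        return "override"      # may override AI analysis
    if count >= 3:
        return "significant"   # begins to significantly influence the score
    return "minor"
```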

High-Risk VoIP Carriers

What it provides: A proprietary list of 18 VoIP carriers that are disproportionately associated with scam calls. These carriers provide cheap, disposable phone numbers that are favored by fraudsters.

Coverage: 18 identified carriers. The list is maintained based on FTC complaint patterns, carrier abuse reports, and industry intelligence.

Update frequency: Updated as new high-risk carriers are identified.

Used by: Phone channel. Numbers on these carriers receive a score boost. When such a number also has no caller ID, a score floor is enforced.

Source Availability in API Responses

Every API response includes a sources_checked array that lists the data sources that contributed to the assessment. Use it to gauge the depth of analysis and to handle cases where certain sources were unavailable.

For phone lookups:

{
  "sources_checked": [
    "ftc",
    "fcc",
    "twilio",
    "nomorobo",
    "community_reports",
    "ai_analysis"
  ]
}

For URL lookups:

{
  "sources_checked": [
    "google_web_risk",
    "rdap",
    "ssl",
    "redirects",
    "brand_detection",
    "urlhaus",
    "threatfox",
    "ipqs",
    "ai_analysis"
  ]
}

If a source is missing from sources_checked, either it returned no data for that specific input or it was temporarily unavailable. The scoring engine handles missing sources gracefully by relying on the remaining signals.
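A client can compare sources_checked against the sources it expects for a channel to detect a partial assessment. In the sketch below, the expected sets are copied from the two sample responses above and may not be exhaustive; `missing_sources` is a hypothetical helper.

```python
# Expected sources per channel, taken from the sample responses above.
EXPECTED_SOURCES = {
    "phone": {"ftc", "fcc", "twilio", "nomorobo", "community_reports", "ai_analysis"},
    "url": {
        "google_web_risk", "rdap", "ssl", "redirects", "brand_detection",
        "urlhaus", "threatfox", "ipqs", "ai_analysis",
    },
}


def missing_sources(response: dict, channel: str) -> list:
    """Return expected sources absent from the response, sorted by name."""
    checked = set(response.get("sources_checked", []))
    return sorted(EXPECTED_SOURCES[channel] - checked)


partial = {"sources_checked": ["ftc", "twilio", "community_reports", "ai_analysis"]}
print(missing_sources(partial, "phone"))
```

A caller might log the missing names or retry later, since an absent source can mean either no data or a temporary outage.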
