Methodology
Where every signal comes from.
GeoQ doesn't own a secret dataset. We derive, refresh and classify public and open-source inputs on a stated cadence, attach an evidence label to each one, and publish the exact risk weights below. This page is the whole method — sources, cadence, weights — so you can check our work, not take it on faith.
A signal is one input, not a verdict. Cadence is a refresh schedule, not a claim that data is current to the second.
Datasets, sources & refresh cadence
| Dataset | Fields set | Source | Cadence | Evidence |
|---|---|---|---|---|
| IP geolocation. Geolocation is an estimate, not a GPS fix. Accuracy is highest at country level and drops toward city level. | geo.country, geo.region, geo.city, geo.latitude, geo.longitude, geo.timezone | DB-IP (CC BY 4.0) | Monthly (DB-IP free release) | inferred |
| Datacenter / cloud ranges. Providers publish their own ranges. We classify the IP against them; we never guess residential or mobile. | connection_type === "datacenter", datacenter_provider | AWS, Google Cloud, Microsoft Azure published IP ranges | Daily | |
| Satellite ranges. Satellite is a connection_type value, not a boolean. Satellite ASNs can carry mixed traffic; read the limit on the detect page. | connection_type === "satellite" | Operator-published / BGP-derived ASN ranges (e.g. Starlink) | Daily | |
| Tor exit list. The exit list is published by the Tor Project. Membership is a fact, not an inference. | is_tor | Tor Project public exit list | Hourly | |
| VPN ranges. Commercial-VPN ranges shift. We treat this as an inference, not ground truth. | is_vpn | Self-maintained list of known commercial-VPN ranges | Daily | inferred |
| Proxy ranges (beta). Residential-proxy detection is beta, so weight it accordingly. Spur leads this category; we do not claim parity. | is_proxy | Open / anonymising-proxy lists; residential-proxy detection is beta | Daily | beta |
| Relay ranges. Apple publishes these so operators can recognise relay traffic. A benign network kind: it caps the score (see weights). | is_relay, relay_provider: "icloud" | Apple's published iCloud Private Relay egress ranges | Daily | |
| Public-resolver ranges. A benign network kind. Recognising it stops public DNS resolvers scoring as fraud. | is_public_resolver | Published resolver ranges (e.g. 8.8.8.8, 1.1.1.1) | Daily | |
| Spamhaus DROP. We retain only the current published ranges and refresh on update; we do not redistribute the lists. | is_drop_listed | The Spamhaus Project DROP lists | Daily (on each DROP update) | |
| Routing health (BGP). Derived from public BGP tables. is_announced means a covering prefix is visible in the global routing table. | is_announced, is_bogon | RouteViews + RIPE NCC RIS public BGP data; bogon constants | Daily (BGP); monthly (bogon constants) | |
| RPKI validation. We run our own validator against the published trust anchors. Only the invalid state scores. | rpki ("valid" \| "invalid" \| "unknown") | RPKI repositories, validated with a self-run Routinator | Daily | |
| RIR allocation. Derived from the RIRs' published delegated-statistics files. | allocation_date, allocation_age_days, registration_country | ARIN, RIPE NCC, APNIC, LACNIC, AFRINIC delegated statistics | Weekly | |
| Recent abuse (beta). Beta and demand-gated: it scores zero today. Surfaced as a signal you can read, not yet a contributor. | recent_abuse | Emerging Threats open lists; CINS Army list | Daily | beta |
| Verified crawlers. Identifies a good crawler you must not block. It carries zero risk weight; it is not bad-bot detection. | is_verified_bot, verified_bot_name | Operator-published crawler ranges (Googlebot, Bingbot) | Daily | |
Full licences and credits are on the attributions page. We use free and open-source inputs only — no paid data licences.
What the evidence labels mean
| Label | Meaning |
|---|---|
| | Sourced from a list the network or registry publishes about itself (Apple relay ranges, RIR allocations, the Tor exit list). Membership is a fact. |
| inferred | Derived from lists that shift over time (commercial-VPN ranges, geolocation). Treat as an estimate, not ground truth. |
| beta | Surfaced so you can read it, but not yet trusted enough to score. Weight it yourself; we don't. |
Every response carries an evidence object with one of these
labels per signal, so you can decide how much to trust each one. See the
response schema, and the
glossary for the terms behind each signal.
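As a rough illustration of acting on those labels, the sketch below groups signal names by their evidence label so that "inferred" and "beta" signals can be handled separately. The `Evidence` shape (a flat map from signal name to label) and the function name `groupByLabel` are assumptions made for this example; the response schema is the authority on the real layout.

```typescript
// Assumed shape: signal name -> evidence label (e.g. "inferred", "beta").
// The real evidence object may be structured differently; check the response schema.
type Evidence = Record<string, string>;

// Group signal names by label, with names sorted alphabetically inside each group.
function groupByLabel(evidence: Evidence): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const name of Object.keys(evidence).sort()) {
    const label = evidence[name];
    if (label === undefined) continue;
    const bucket = groups[label] ?? [];
    bucket.push(name);
    groups[label] = bucket;
  }
  return groups;
}

// Hypothetical evidence object, using labels from the datasets table above.
const exampleGroups = groupByLabel({
  is_vpn: "inferred",
  is_proxy: "beta",
  recent_abuse: "beta",
});
// exampleGroups.inferred is ["is_vpn"]
// exampleGroups.beta is ["is_proxy", "recent_abuse"]
```

A caller might treat everything under "beta" as informational only and apply its own threshold to "inferred" signals.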
Published risk weights
The risk score is min(100, Σ weights), summed over the signals that fired.
There is no machine-learning black box. After the sum, a benign network kind
(relay, satellite or public resolver) caps the score at 20. Full worked
examples and the reproducible code are on the
risk-score methodology page.
| Signal | Weight | Why |
|---|---|---|
| is_tor | +45 | Tor exit node: strong anonymisation |
| is_proxy | +40 | Open/anonymising proxy (residential-proxy detection in beta) |
| is_drop_listed | +40 | IP is on the Spamhaus DROP list (do-not-route, known hostile) |
| connection_type === "datacenter" | +35 | Hosting / cloud range, not a residential ISP |
| is_bogon | +30 | Bogon: unallocated or reserved space that should never source traffic |
| is_vpn | +30 | Known commercial VPN range |
| rpki === "invalid" | +20 | Route origin fails RPKI validation (only "invalid" scores) |
Suppressor: if is_relay or is_public_resolver is true, or
connection_type === "satellite", the score is capped at 20 and
benign_network_kind is added to reasons[]. It's a cap, not a negative weight.
recent_abuse is beta and weighted zero today;
is_verified_bot identifies a good crawler and carries no weight.
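The weights and suppressor above can be sketched in a few lines. This is a minimal illustration written from the published table, not the code on the risk-score methodology page: the `Signals` and `RiskResult` shapes and the reason strings other than benign_network_kind are assumptions, and it assumes benign_network_kind is added whenever the suppressor applies, even if the sum was already below 20.

```typescript
// Field names come from this page; the interface shapes are assumed for the sketch.
interface Signals {
  is_tor?: boolean;
  is_proxy?: boolean;
  is_drop_listed?: boolean;
  is_bogon?: boolean;
  is_vpn?: boolean;
  is_relay?: boolean;
  is_public_resolver?: boolean;
  connection_type?: string;
  rpki?: "valid" | "invalid" | "unknown";
}

interface RiskResult {
  score: number;
  reasons: string[];
}

const BENIGN_CAP = 20;

// [reason name (hypothetical), weight from the table, predicate that says the signal fired]
const WEIGHTS: ReadonlyArray<readonly [string, number, (s: Signals) => boolean]> = [
  ["is_tor", 45, (s) => s.is_tor === true],
  ["is_proxy", 40, (s) => s.is_proxy === true],
  ["is_drop_listed", 40, (s) => s.is_drop_listed === true],
  ["datacenter", 35, (s) => s.connection_type === "datacenter"],
  ["is_bogon", 30, (s) => s.is_bogon === true],
  ["is_vpn", 30, (s) => s.is_vpn === true],
  ["rpki_invalid", 20, (s) => s.rpki === "invalid"],
];

function riskScore(s: Signals): RiskResult {
  const reasons: string[] = [];
  let sum = 0;
  for (const [name, weight, fired] of WEIGHTS) {
    if (fired(s)) {
      sum += weight;
      reasons.push(name);
    }
  }
  let score = Math.min(100, sum);

  // Suppressor: a cap applied after the sum, not a negative weight.
  const benign =
    s.is_relay === true ||
    s.is_public_resolver === true ||
    s.connection_type === "satellite";
  if (benign) {
    score = Math.min(score, BENIGN_CAP);
    reasons.push("benign_network_kind");
  }
  return { score, reasons };
}

const torInDatacenter = riskScore({ is_tor: true, connection_type: "datacenter" }); // 45 + 35 = 80
const vpnOverRelay = riskScore({ is_vpn: true, is_relay: true });                   // 30, capped to 20
const stacked = riskScore({ is_tor: true, is_proxy: true, is_drop_listed: true });  // 125, clamped to 100
```

recent_abuse and is_verified_bot are left out of the weight list because both carry zero weight today.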
How we keep it honest
Refreshed on a cadence you can see
Each dataset above has a stated refresh schedule. We don't say "real-time" or "always up to date" — we tell you how often we pull, and you can plan around it.
Derived in the open
Every score ships with reasons[] and per-signal evidence. The inputs are public; the weights are published here; the formula is reproducible.
Built to reduce false positives
Relay, satellite and public-resolver IPs are recognised and capped at 20 — so Apple and Starlink users don't get scored like a hostile datacenter. See the false-positive guide.
Fails closed, never empty
Hand-curated lists carry a coverage canary and a staleness check. A degraded or empty dataset fails the build rather than silently shipping bad data.
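A build-time gate of this kind could look roughly like the sketch below. It is an illustration under stated assumptions, not GeoQ's build code: the `DatasetSnapshot` shape, the idea of a canary as a single entry that must always be present, and the `maxAgeHours` tolerance are all hypothetical names and choices made for the example.

```typescript
// Assumed snapshot shape for one hand-curated list.
interface DatasetSnapshot {
  name: string;
  fetchedAt: Date;     // when the list was last refreshed
  maxAgeHours: number; // assumed staleness tolerance, chosen from the dataset's cadence
  entries: string[];   // the list contents
  canary: string;      // an entry that must always be present if coverage is intact
}

const MS_PER_HOUR = 60 * 60 * 1000;

// Return a description of every problem found; an empty array means healthy.
function findProblems(d: DatasetSnapshot, now: Date): string[] {
  const problems: string[] = [];
  if (d.entries.length === 0) {
    problems.push(`${d.name}: dataset is empty`);
  } else if (d.entries.indexOf(d.canary) === -1) {
    problems.push(`${d.name}: coverage canary ${d.canary} is missing`);
  }
  const ageHours = (now.getTime() - d.fetchedAt.getTime()) / MS_PER_HOUR;
  if (ageHours > d.maxAgeHours) {
    problems.push(`${d.name}: stale, last refreshed ${Math.floor(ageHours)}h ago`);
  }
  return problems;
}

// Fail closed: throw (and so fail the build) if any dataset is degraded.
function assertDatasetsHealthy(datasets: DatasetSnapshot[], now: Date): void {
  const problems: string[] = [];
  for (const d of datasets) {
    problems.push(...findProblems(d, now));
  }
  if (problems.length > 0) {
    throw new Error("dataset check failed:\n" + problems.join("\n"));
  }
}
```

Throwing rather than logging is the design point: a degraded list stops the release instead of shipping with partial coverage.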
Start with the free tier. No card.
5,000 lookups a day, every signal, the same transparent risk score. Upgrade only when you outgrow it.