Bot Protection & WAF Rules — Country, ASN, and VPN Signals
Cut malicious traffic 60–90% with three IP signals you probably aren’t using yet.
Most bot traffic is noisy, cheap, and distinguishable at the network layer before it hits your app. Rate-limiting is downstream; IP-layer filtering is upstream. Combine country, ASN (hosting-provider), and VPN/Tor signals into a single WAF rule and your app tier gets materially quieter.
The business problem
Bots cost money three ways:
- Compute — credential-stuffing, scraping, and LLM-training crawlers drown legit traffic.
- Noise — fake signups, form spam, and content flooding clog analytics + moderation queues.
- Abuse — promo-code abuse, inventory-hoarding, review manipulation, SMS-pump fraud.
Meanwhile, almost all of this traffic shares a pattern: it originates from hosting ASNs (AWS, GCP, Azure, DigitalOcean, Hetzner) or from commercial VPN exits. Real humans almost never shop, sign up, or browse your docs from i-0abc123.ec2.internal.
Implementation
Cloudflare Workers (edge)
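export default {
  async fetch(req, env) {
    const url = new URL(req.url);
    const ip = req.headers.get("CF-Connecting-IP");
    if (!ip) return fetch(req); // no client IP (e.g. local dev): skip the lookup

    let geo;
    try {
      const res = await fetch(`https://api.ipgeo.10b.app/v1/lookup/${ip}`, {
        headers: { Authorization: `Bearer ${env.IPGEO_KEY}` },
      });
      if (!res.ok) return fetch(req); // fail open if the lookup returns an error
      geo = await res.json();
    } catch {
      return fetch(req); // fail open on network or JSON errors
    }

    // Block if: data-center origin on a non-API path (match the path, not the query string)
    if (geo.is_hosting && !url.pathname.startsWith("/api/")) {
      return new Response("Blocked", { status: 403 });
    }
    // CAPTCHA if: VPN or Tor + sensitive path (signup, login, checkout)
    if ((geo.is_vpn || geo.is_tor) && /\/(signup|login|checkout)/.test(url.pathname)) {
      // Response.redirect() needs an absolute URL, so resolve against the request URL
      return Response.redirect(new URL("/challenge?reason=vpn", url).toString(), 302);
    }
    return fetch(req);
  }
}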
NGINX + Lua (self-hosted WAF)
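access_by_lua_block {
    -- nginx.conf also needs: `env IPGEO_KEY;` (main context), a `resolver`,
    -- and `lua_ssl_trusted_certificate` so ssl_verify can validate the API cert.
    local cjson = require "cjson.safe"
    local resty_http = require "resty.http"

    local ip = ngx.var.remote_addr
    local httpc = resty_http.new()
    httpc:set_timeout(500)  -- ms; keep a slow lookup from stalling requests
    local res, err = httpc:request_uri("https://api.ipgeo.10b.app/v1/lookup/" .. ip, {
        method = "GET",
        headers = { ["Authorization"] = "Bearer " .. (os.getenv("IPGEO_KEY") or "") },
        ssl_verify = true,
    })
    -- Fail open: a lookup error must not take the site down
    if not res or res.status ~= 200 then
        ngx.log(ngx.WARN, "ipgeo lookup failed: ", err or res.status)
        return
    end
    local geo = cjson.decode(res.body)  -- cjson.safe returns nil on malformed JSON
    if not geo then return end

    -- Block Tor and data-center traffic, except Cloudflare (AS13335)
    if geo.is_tor or (geo.is_hosting and geo.asn ~= "AS13335") then
        return ngx.exit(ngx.HTTP_FORBIDDEN)
    end
    ngx.ctx.geo = geo  -- expose the result to later phases
}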
Rule library (WAF patterns worth stealing)
| Rule | Action | Rationale |
|---|---|---|
| `is_tor == true` | Block or CAPTCHA | Tor exit nodes are public and low-trust for transactional flows |
| `is_hosting == true AND path != /api/*` | Block | No human browses marketing pages from EC2 |
| `country_code IN [sanctioned]` | 451 response | See ./geoblocking-compliance.md |
| `asn IN [known-abuser list]` | Challenge | Maintain a small list (e.g. AS14061 DigitalOcean residential abuse) |
| `is_vpn == true AND sensitive_path` | CAPTCHA | Allow privacy-conscious users but add friction on signup/pay |
| > 50 requests/min from same /24 subnet | Rate-limit | Covers proxy-pool botnets |
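The stateless rows of the table can be collapsed into one decision function. What follows is a minimal sketch, not a reference implementation: `decide`, the `lists` argument, and the returned `{ action, status }` shape are hypothetical names chosen for illustration, and the field names are assumed to match the ones used in the rules above. It picks "block" for Tor where the table allows either option, and checks the sanctioned-country rule first on the assumption that a legal block should win over everything else. The per-/24 rate limit needs shared state and is left out.

```javascript
// Hypothetical helper: maps a lookup result plus request path to a WAF action.
// List contents are supplied by the caller; nothing here is a recommended list.
const SENSITIVE_PATH = /\/(signup|login|checkout)/;

function decide(geo, path, lists = {}) {
  const sanctioned = lists.sanctionedCountries ?? new Set();
  const abusiveAsns = lists.abusiveAsns ?? new Set();

  // Assumption: compliance blocks take priority over every other rule.
  if (sanctioned.has(geo.country_code)) return { action: "block", status: 451 };
  // The table allows "Block or CAPTCHA" for Tor; this sketch blocks.
  if (geo.is_tor) return { action: "block", status: 403 };
  // Data-center origin is only tolerated on API paths.
  if (geo.is_hosting && !path.startsWith("/api/")) return { action: "block", status: 403 };
  // Known-abuser ASNs get a challenge rather than a hard block.
  if (abusiveAsns.has(geo.asn)) return { action: "challenge" };
  // VPN traffic is challenged only on sensitive paths, never blocked outright.
  if (geo.is_vpn && SENSITIVE_PATH.test(path)) return { action: "challenge" };
  return { action: "allow" };
}

console.log(decide({ is_vpn: true }, "/login")); // { action: 'challenge' }
console.log(decide({ is_vpn: true }, "/docs"));  // { action: 'allow' }
```

Keeping the function pure (no network calls, no clock) makes the rule order easy to unit-test and lets the same logic run at the edge or in a log-replay job.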
Why IP Geo API for this use case
- Three threat-intel fields in the base response (`is_vpn`, `is_proxy`, `is_tor`), with no add-on SKU.
- Hosting / data-center flag (`is_hosting`): distinguishes cloud IPs from residential ISPs with ASN-level accuracy.
- ASN + organisation name: lets you maintain short “always-block” and “always-allow” lists (e.g. Googlebot → `AS15169` → allow; an abuse-heavy VPS provider → block).
- Bulk lookup: for log-analysis pipelines, one call returns up to 100 IPs.
- Edge-friendly latency: median ≤ 40 ms EU, ≤ 80 ms US. Fits inside Cloudflare Workers, Fastly Compute, Vercel Edge.
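For the bulk-lookup case, a log pipeline first has to deduplicate IPs and split them into groups of at most 100. The sketch below covers only that preparation step. The bulk endpoint's URL and request format are not shown on this page, so the code stops at producing batches; `toLookupBatches` is a hypothetical helper name.

```javascript
// Hypothetical helper: dedupe IPs from a log and split them into batches.
// batchSize defaults to 100, the per-call limit stated in the list above.
function toLookupBatches(ips, batchSize = 100) {
  // Trim whitespace, drop empty entries, and keep the first occurrence of each IP.
  const unique = [...new Set(ips.map((ip) => ip.trim()).filter(Boolean))];
  const batches = [];
  for (let i = 0; i < unique.length; i += batchSize) {
    batches.push(unique.slice(i, i + batchSize));
  }
  return batches;
}

// Two unique addresses end up in a single batch.
const batches = toLookupBatches(["203.0.113.7", "203.0.113.7", "198.51.100.4"]);
console.log(batches.length, batches[0].length); // 1 2
```

Deduplicating before batching matters for cost as well as speed, since pricing is counted per lookup.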
Pricing math
Most WAFs cache lookups for 5–60 minutes (IP → decision). A site with 10 M page-views/mo typically does 50–200 K unique-IP lookups per month.
| Unique IPs/mo | Tier | Cost |
|---|---|---|
| < 30 K | Free | € 0 |
| < 1 M | Starter | € 29 |
| < 10 M | Business | € 99 |
The 10 M page-view site above lands in Starter. At € 29/mo, that is about € 2.90 per million page-views protected, roughly one cloud-VM-hour. It pays for itself the first time it blocks a single credential-stuffing run.
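The lookup counts in this section depend on caching the IP → decision mapping for 5–60 minutes. A minimal in-memory sketch of such a cache is below. `DecisionCache` and its 15-minute default are illustrative assumptions, not part of the API. The clock is passed in as an optional argument so expiry can be tested deterministically.

```javascript
// Hypothetical in-memory TTL cache for IP -> decision.
class DecisionCache {
  constructor(ttlMs = 15 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // ip -> { decision, expiresAt }
  }

  // Returns the cached decision, or undefined if it is missing or expired.
  get(ip, now = Date.now()) {
    const hit = this.entries.get(ip);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(ip); // drop stale entries lazily on read
      return undefined;
    }
    return hit.decision;
  }

  // Stores a decision that stays valid for ttlMs from `now`.
  set(ip, decision, now = Date.now()) {
    this.entries.set(ip, { decision, expiresAt: now + this.ttlMs });
  }
}

const cache = new DecisionCache(60_000);
cache.set("203.0.113.7", "allow", 0);
console.log(cache.get("203.0.113.7", 30_000)); // allow
console.log(cache.get("203.0.113.7", 60_000)); // undefined
```

This version never evicts entries that are not read again, so a production cache would also need a size cap. On edge runtimes, in-memory state is per instance, so a shared store would be needed to get the hit rates assumed above.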
Honest trade-offs
- Residential proxies evade `is_proxy`. Determined attackers rent residential proxy pools ($50–500/mo). These show up as normal ISP IPs with normal ASNs. If you’re targeted (not just drive-by scraped), add device fingerprinting or behavioral signals.
- Corporate VPNs look like VPNs. Your B2B customers on a work VPN will trip `is_vpn`. Never block on `is_vpn` alone: combine it with sensitive-path logic, or use it to challenge rather than block.
- Googlebot and friends are on hosting ASNs. Maintain an allowlist for `AS15169` (Google), `AS8075` (Microsoft), `AS14618` (Amazon bots), `AS13238` (Yandex), `AS32934` (Facebook). Our API exposes the ASN; you decide the rule.
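The crawler allowlist in the last bullet can be applied as an exception to the hosting rule. The sketch below assumes the `asn` field is a string such as `"AS15169"`, which is how the NGINX example compares it; `normalizeAsn` and `hostingVerdict` are hypothetical helper names. An ASN match alone does not prove a request comes from a real crawler (Google, for one, documents reverse-DNS verification for Googlebot), so treat this as a first filter.

```javascript
// ASNs named in the trade-offs list above.
const CRAWLER_ASNS = new Set(["AS15169", "AS8075", "AS14618", "AS13238", "AS32934"]);

// Accepts "AS15169", "as15169", or 15169; returns "AS15169", or null if unparseable.
function normalizeAsn(asn) {
  const m = /^(?:AS)?(\d+)$/i.exec(String(asn ?? "").trim());
  return m ? `AS${Number(m[1])}` : null;
}

// Hosting traffic is blocked unless its ASN is on the crawler allowlist.
function hostingVerdict(geo) {
  if (!geo.is_hosting) return "allow";
  return CRAWLER_ASNS.has(normalizeAsn(geo.asn)) ? "allow" : "block";
}

console.log(hostingVerdict({ is_hosting: true, asn: "AS15169" })); // allow
console.log(hostingVerdict({ is_hosting: true, asn: "AS64500" })); // block
```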
Related use cases
- Fraud detection: `./fraud-detection.md`
- Geoblocking / compliance: `./geoblocking-compliance.md`
- Visitor analytics: `./visitor-analytics.md`
Get started
Free tier: 1 000 lookups / day. Sign up at https://ipgeo.10b.app/pricing.
Get early access — 50% off for 12 months
First 100 signups lock in 50% off any paid plan for the first year. No credit card required — we’ll email you at launch.