Bot Protection & WAF Rules — Country, ASN, and VPN Signals

Cut malicious traffic 60–90% with three IP signals you probably aren’t using yet.

Most bot traffic is noisy, cheap, and distinguishable at the network layer before it hits your app. Rate-limiting is downstream; IP-layer filtering is upstream. Combine country, ASN (hosting-provider), and VPN/Tor signals into a single WAF rule and your app tier gets materially quieter.

The business problem

Bots cost money three ways:

Compute — credential-stuffing, scraping, and LLM-training crawlers drown legit traffic.
Noise — fake signups, form spam, and content flooding clog analytics + moderation queues.
Abuse — promo-code abuse, inventory-hoarding, review manipulation, SMS-pump fraud.

Meanwhile, almost all of this traffic shares a pattern: it originates from hosting ASNs (AWS, GCP, Azure, DigitalOcean, Hetzner) or from commercial VPN exits. Real humans almost never shop, sign up, or browse your docs from an address that reverse-resolves to something like ec2-203-0-113-25.compute-1.amazonaws.com.

Implementation

Cloudflare Workers (edge)

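export default {
  async fetch(req, env) {
    const url = new URL(req.url);
    const ip = req.headers.get("CF-Connecting-IP");

    // Fail open: if the lookup errors or returns a non-2xx status, serve the
    // request instead of turning a lookup outage into a site outage.
    let geo = {};
    try {
      const r = await fetch(`https://api.ipgeo.10b.app/v1/lookup/${ip}`, {
        headers: { Authorization: `Bearer ${env.IPGEO_KEY}` }
      });
      if (r.ok) geo = await r.json();
    } catch (_) {}

    // Block if: data-center origin on a non-API path.
    // Match on the pathname so "/api/" in a query string cannot bypass the rule.
    if (geo.is_hosting && !url.pathname.startsWith("/api/")) {
      return new Response("Blocked", { status: 403 });
    }

    // CAPTCHA if: VPN or Tor + sensitive path (signup, login, checkout)
    if ((geo.is_vpn || geo.is_tor) && /\/(signup|login|checkout)/.test(url.pathname)) {
      // Build an absolute URL; a relative one has no base to resolve against in a Worker.
      return Response.redirect(new URL("/challenge?reason=vpn", url).toString(), 302);
    }

    return fetch(req);
  }
}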

NGINX + Lua (self-hosted WAF)

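access_by_lua_block {
  -- Prerequisites: lua-resty-http, a `resolver` directive, `lua_ssl_trusted_certificate`
  -- (needed for ssl_verify), and `env IPGEO_KEY;` in the main nginx.conf context,
  -- because NGINX clears the worker environment otherwise.
  local cjson = require "cjson.safe"   -- decode returns nil instead of throwing
  local resty_http = require "resty.http"

  local key = os.getenv("IPGEO_KEY")
  if not key then return end           -- fail open when the key is not configured

  local ip = ngx.var.remote_addr
  local httpc = resty_http.new()
  local res, err = httpc:request_uri("https://api.ipgeo.10b.app/v1/lookup/" .. ip, {
    method = "GET",
    headers = { ["Authorization"] = "Bearer " .. key },
    ssl_verify = true,
  })
  -- Fail open on network errors, non-200 responses, and malformed JSON
  if not res or res.status ~= 200 then return end
  local geo = cjson.decode(res.body)
  if type(geo) ~= "table" then return end

  -- AS13335 (Cloudflare) is exempted as an example allowlist entry
  if geo.is_tor or (geo.is_hosting and geo.asn ~= "AS13335") then
    return ngx.exit(ngx.HTTP_FORBIDDEN)
  end
  ngx.ctx.geo = geo
}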
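
Both snippets above decide inline, which makes the rules hard to test away from the edge runtime. One option is to pull the decision into a pure function. The sketch below covers the rules from the rule library that follows; the evaluation order, the action names, the placeholder lists (`XX`, `AS64500`), and the choice of CAPTCHA for Tor are all assumptions for illustration, not part of the API. The per-/24 request count is passed in by the caller rather than tracked here.

```javascript
// Placeholder lists: replace with your own compliance and abuse data.
const SANCTIONED = new Set(["XX"]);          // hypothetical country codes
const ABUSIVE_ASNS = new Set(["AS64500"]);   // hypothetical (documentation-range ASN)
const SENSITIVE = /^\/(signup|login|checkout)(\/|$)/;

// Key for counting requests per /24 (IPv4 only in this sketch).
function subnet24(ip) {
  return ip.split(".").slice(0, 3).join(".") + ".0/24";
}

// Returns the action of the first matching rule, checked in a fixed order.
// `geo` uses the field names from this article; `requestsLastMinute` is the
// caller's count for subnet24(ip).
function decide(geo, path, requestsLastMinute = 0) {
  if (SANCTIONED.has(geo.country_code)) return "451";
  if (geo.is_tor) return "captcha";
  if (geo.is_hosting && !path.startsWith("/api/")) return "block";
  if (ABUSIVE_ASNS.has(geo.asn)) return "challenge";
  if (geo.is_vpn && SENSITIVE.test(path)) return "captcha";
  if (requestsLastMinute > 50) return "rate-limit";
  return "allow";
}
```

With this in place, `decide({ is_vpn: true }, "/login")` returns `"captcha"` while the same visitor on `/docs` gets `"allow"`, which matches the advice to add friction for VPN users only on sensitive paths.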

Rule library (WAF patterns worth stealing)

Rule	Action	Rationale
`is_tor == true`	Block or CAPTCHA	Tor exit-nodes are public + low-trust for transactional flows
`is_hosting == true AND path != /api/*`	Block	No human browses marketing pages from EC2; exempt allowlisted crawler ASNs first
`country_code IN [sanctioned]`	451 response	See `./geoblocking-compliance.md`
`asn IN [known-abuser list]`	Challenge	Maintain a small list of ASNs that show up in your own abuse logs (e.g. AS14061, DigitalOcean)
`is_vpn == true AND sensitive_path`	CAPTCHA	Allow privacy-conscious users but add friction on signup/pay
> 50 requests/min from same /24 subnet	Rate-limit	Covers proxy-pool botnets

Why IP Geo API for this use case

Three threat-intel fields in the base response (is_vpn, is_proxy, is_tor) — no add-on SKU.
Hosting / data-center flag (is_hosting) — distinguishes cloud IPs from residential ISPs with ASN-level accuracy.
ASN + organisation name — lets you maintain short “always-block” and “always-allow” lists (e.g. Googlebot → AS15169 → allow; an abuse-heavy VPS provider → block).
Bulk lookup — for log-analysis pipelines, one call returns up to 100 IPs.
Edge-friendly latency — median ≤ 40 ms EU, ≤ 80 ms US. Fits inside Cloudflare Workers, Fastly Compute, Vercel Edge.
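
For the bulk-lookup point, a log pipeline would typically dedupe IPs and split them into batches of at most 100 before calling the API. The sketch below shows only that batching step. The bulk endpoint's path and payload are not given in this article, so the network call is left to a caller-supplied `lookupBatch` function, which is hypothetical.

```javascript
// Dedupe IPs and split them into batches no larger than the bulk limit.
function batchIps(ips, size = 100) {
  const unique = [...new Set(ips)];
  const batches = [];
  for (let i = 0; i < unique.length; i += size) {
    batches.push(unique.slice(i, i + size));
  }
  return batches;
}

// `lookupBatch` (hypothetical) sends one batch to the bulk endpoint and
// resolves to an array of result objects. Batches run sequentially here.
async function lookupAll(ips, lookupBatch) {
  const results = [];
  for (const batch of batchIps(ips)) {
    results.push(...(await lookupBatch(batch)));
  }
  return results;
}
```

Deduping first matters for cost: access logs repeat the same IPs heavily, and only unique addresses need a lookup.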

Pricing math

Most WAF integrations cache the lookup result (IP → decision) for 5–60 minutes, so billable lookups track unique IPs rather than requests. A site with 10 M page-views/mo typically does 50–200 K unique-IP lookups per month.

Unique IPs/mo	Tier	Cost
< 30 K	Free	€ 0
< 1 M	Starter	€ 29
< 10 M	Business	€ 99

At € 29/mo for the 10 M page-view site above, that works out to about € 2.90 per million page views protected. It pays for itself the first time it blocks a single credential-stuffing run.
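
The 5–60 minute decision cache mentioned above is what keeps lookup volume close to the unique-IP count. A minimal in-memory sketch follows, assuming a 15-minute TTL; the class name and the injectable clock are illustrative. In a Cloudflare Worker, module-level state lives per isolate and can be evicted at any time, so this is best-effort and a shared store would be needed for stronger guarantees.

```javascript
// Minimal TTL cache for IP -> decision. `now` is injectable so expiry can be
// tested without waiting. Entries are evicted lazily on read.
class DecisionCache {
  constructor(ttlMs = 15 * 60 * 1000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Returns the cached decision, or undefined if missing or expired.
  get(ip) {
    const hit = this.entries.get(ip);
    if (!hit) return undefined;
    if (this.now() >= hit.expires) {
      this.entries.delete(ip);
      return undefined;
    }
    return hit.decision;
  }

  set(ip, decision) {
    this.entries.set(ip, { decision, expires: this.now() + this.ttlMs });
  }
}
```

A handler would call `cache.get(ip)` first and only hit the lookup API on `undefined`, then store the resulting decision with `cache.set(ip, decision)`.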

Honest trade-offs

Residential proxies evade is_proxy. Determined attackers rent residential proxy pools ($50–500/mo). These show up as normal ISP IPs with normal ASNs. If you’re targeted (not just drive-by scraped), add device fingerprinting or behavioral signals.
Corporate VPNs look like VPNs. Your B2B customers on a work VPN will trip is_vpn. Never block on is_vpn alone — only combine with sensitive-path logic or use it for challenge, not block.
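Googlebot and friends are on hosting ASNs. Maintain an allowlist for AS15169 (Google), AS8075 (Microsoft), AS14618 (Amazon bots), AS13238 (Yandex), AS32934 (Facebook), and evaluate it before any is_hosting block. ASN allowlists are coarse: AS8075 and AS14618 also carry Azure and EC2 customer traffic, so confirm crawlers by reverse DNS where it matters. Our API exposes the ASN; you decide the rule.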
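
Because an ASN allowlist admits everything in that network, a stricter check for search crawlers is forward-confirmed reverse DNS: resolve the IP to a hostname, check the hostname's suffix, then resolve the hostname back and confirm it returns the same IP. This is a different technique from the ASN lookup, not something the API does. The sketch below assumes Node.js (it will not run in a Worker as written), handles IPv4 only, and lists just two Googlebot suffixes; take the suffixes for other crawlers from each operator's own documentation.

```javascript
// Hostname suffixes accepted as crawler-owned (illustrative subset).
const CRAWLER_SUFFIXES = [".googlebot.com", ".google.com"];

// `resolver` must provide reverse(ip) and resolve4(hostname), each returning
// a promise of a string array. Defaults to Node's dns/promises module.
async function isVerifiedCrawler(ip, resolver) {
  const dns = resolver ?? (await import("node:dns/promises"));
  let hostnames;
  try {
    hostnames = await dns.reverse(ip);
  } catch {
    return false; // no PTR record: treat as unverified
  }
  for (const host of hostnames) {
    if (!CRAWLER_SUFFIXES.some((s) => host.endsWith(s))) continue;
    try {
      // Forward-confirm: the claimed hostname must resolve back to this IP.
      const addrs = await dns.resolve4(host);
      if (addrs.includes(ip)) return true;
    } catch {
      // hostname did not resolve: try the next PTR entry
    }
  }
  return false;
}
```

The two DNS round trips are slow relative to an edge lookup, so the result is worth caching per IP alongside the WAF decision.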

Related use cases

Fraud detection — ./fraud-detection.md
Geoblocking / compliance — ./geoblocking-compliance.md
Visitor analytics — ./visitor-analytics.md

Get started

Free tier: 1 000 lookups / day. Sign up at https://ipgeo.10b.app/pricing.

Get early access — 50% off for 12 months

First 100 signups lock in 50% off any paid plan for the first year. No credit card required — we’ll email you at launch.