Thoth IP Reputation Service
Thoth is the reputation database for Anubis. Thoth feeds information to Anubis so that it can make better decisions about which traffic is innocuous and which traffic is suspicious.
Thoth is hosted by Techaro and is a paid service. Thoth is opt-in and requires manual intervention (including payment) to use. The code that powers Thoth is currently closed source.
To get access to Thoth, please subscribe on GitHub Sponsors and email Xe. This will be self-service soon.
What is Thoth?
Anubis instances are normally isolated. Each Anubis instance has its own configuration and exists in roughly its own world without any long-term memory between requests. As threats, workarounds, and AI scraper toolchains evolve, administrators need a way to get more up-to-date information faster than Anubis’ release cycle. Thoth solves this problem by providing:
- Real-time threat intelligence: Get up-to-date information about malicious actors faster than Anubis’ release cycle
- Shared reputation data: Benefit from collective threat intelligence across Anubis deployments
- ASN and GeoIP filtering: Make decisions based on autonomous system numbers and geographic locations
- Informative, not authoritative: Thoth influences request weight but doesn’t arbitrarily block traffic
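To make the last two points concrete, here is a minimal Python sketch of how ASN and GeoIP data could adjust a request's weight without blocking anything outright. It is an illustration only, not Anubis' implementation or its policy file format: `WeighRule`, `request_weight`, and `RULES` are hypothetical names, and the AS numbers (13335 for Cloudflare, 45102 for Alibaba) and country codes are sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeighRule:
    """Hypothetical rule: add `adjust` weight points when a request matches."""
    name: str
    adjust: int
    asns: frozenset = frozenset()       # autonomous system numbers to match
    countries: frozenset = frozenset()  # ISO 3166-1 alpha-2 country codes to match

    def matches(self, asn: int, country: str) -> bool:
        # A rule matches on either the ASN or the country of the client IP.
        return asn in self.asns or country.upper() in self.countries


def request_weight(asn: int, country: str, rules: list, base: int = 0) -> int:
    """Sum the adjustments of every matching rule; nothing here blocks traffic."""
    return base + sum(rule.adjust for rule in rules if rule.matches(asn, country))


# Sample rules (assumed values, for illustration only).
RULES = [
    WeighRule("cloud-asns", adjust=10, asns=frozenset({13335, 45102})),
    WeighRule("flagged-countries", adjust=10, countries=frozenset({"BR", "CN"})),
]
```

Under these assumptions, `request_weight(13335, "US", RULES)` yields 10, while a request from AS45102 geolocated to `CN` matches both rules and yields 20. The weight only informs which challenge, if any, is issued later.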
Implementation
Thoth is a web service that listens over gRPC. Thoth’s API is documented in protocol buffer definitions in the GitHub repo TecharoHQ/thoth-proto.
Thoth is designed to be informative, not authoritative. Thoth cannot and will not arbitrarily block requests, origins, or other traffic. Thoth is there to inform Anubis and influence the weight of requests so that upstream resources can be protected.
Additionally, Anubis aggressively caches data from Thoth such that over time Anubis will not need to request data very often. This makes the fast path for repeat visitors even faster and reduces the amount of data that Thoth is exposed to.
Configuration
To enable Thoth integration, configure the following environment variables:
THOTH_URL
The URL for your Thoth instance.
THOTH_TOKEN
Your API token for authenticating with Thoth.
Example Configuration
When Thoth is configured, Anubis:
- Uses TLS for secure communication (unless THOTH_INSECURE is set)
- Sets a 500ms timeout for Thoth requests
- Includes the Anubis version in the User-Agent header
- Provides Prometheus metrics for monitoring gRPC performance
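The behaviour in the list above can be summarised as a small settings object. The Python sketch below is an assumption-laden illustration, not Anubis' actual client code: `ThothSettings` and `load_thoth_settings` are hypothetical names, the way `THOTH_INSECURE` is parsed is a guess, and the `Anubis/<version>` User-Agent format is assumed. Only the variable names and the 500ms timeout come from this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

THOTH_TIMEOUT_SECONDS = 0.5  # the 500ms request timeout described above


@dataclass(frozen=True)
class ThothSettings:
    url: str
    token: str
    use_tls: bool
    timeout_seconds: float
    user_agent: str


def load_thoth_settings(env: Mapping[str, str] = os.environ,
                        anubis_version: str = "devel") -> Optional[ThothSettings]:
    """Return settings, or None when Thoth is not configured (it is opt-in)."""
    url = env.get("THOTH_URL", "")
    token = env.get("THOTH_TOKEN", "")
    if not url or not token:
        return None
    # Assumption: any value other than empty/0/false/no turns TLS off.
    insecure = env.get("THOTH_INSECURE", "").strip().lower() not in ("", "0", "false", "no")
    return ThothSettings(
        url=url,
        token=token,
        use_tls=not insecure,
        timeout_seconds=THOTH_TIMEOUT_SECONDS,
        user_agent=f"Anubis/{anubis_version}",  # assumed header format
    )
```

Passing the environment in as a mapping keeps the sketch easy to exercise, for example `load_thoth_settings({"THOTH_URL": "thoth.example.com:443", "THOTH_TOKEN": "placeholder"})`, where both values are placeholders.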
Features
ASN-based Filtering
When companies link their backbone infrastructure to the Internet, they do so via a BGP Autonomous System, identified by a number (the Autonomous System Number, or ASN). Every IP address on the Internet belongs to an Autonomous System, and the address-to-ASN mapping does not change very frequently. Anubis uses Thoth to match IP addresses to BGP Autonomous Systems so that you can either issue arbitrary challenges to individual internet service providers (such as Cloudflare or Huawei Cloud) or, at the administrator’s explicit instruction, block them altogether.
For example, a rule can add 10 weight points to requests from Cloudflare, Huawei Cloud, and Alibaba Cloud.
GeoIP-based Filtering
In extreme cases, an administrator may have to take action against an entire country. This is not an ideal circumstance, but sometimes reality forces their hand and administrators just want to sleep at night. Anubis uses Thoth to look up the geographic location registered to an IP address. This lookup is imperfect and will get better with time, but you ship what you can so you can make it better next time.
For example, a rule can add 10 weight points to requests from Brazil and China.
Work-in-Progress Features
The following features are planned for future releases:
- Private rulesets: Advanced patterns, current known exploits, and other recognition tactics that need to be kept confidential for operational security reasons
- Private challenge implementations: Advanced browser detection logic via WebAssembly
- Reputation querying: Arbitrarily influence request weight based on net aggregate pass rate so common browsers can get through with no challenge
- Abuse reporting APIs: Allow trusted administrators to report abusive request fingerprints for faster threat response
- Pass rate reporting: Periodic reporting of pass rates per ASN and other fingerprints for methodology improvement
Benefits
- Faster threat response: React to new threats faster than Anubis’ release cycle
- Reduced false positives: Better intelligence leads to more accurate traffic classification
- Performance optimization: Aggressive caching means minimal latency impact
- Collective intelligence: Benefit from threat data across multiple Anubis deployments
- Fine-grained control: Use ASN and GeoIP data to create sophisticated filtering rules
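The aggressive caching mentioned under Implementation, and listed above as a performance benefit, can be pictured as a time-to-live cache in front of the remote lookup. This Python sketch is only a model of the idea: `TTLCache` is a hypothetical name, the one-hour default is an assumed value, and `lookup` stands in for whatever call actually reaches Thoth.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Remember lookup results for `ttl_seconds` so repeat visitors skip the remote call."""

    def __init__(self, lookup: Callable[[Hashable], Any],
                 ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # fast path: entry is still fresh, no remote request
        value = self._lookup(key)  # slow path: ask the remote service
        self._entries[key] = (now + self._ttl, value)
        return value
```

With a cache like this, only the first request from a given IP address within the TTL window results in a remote lookup, which is also why the remote service sees less data over time.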