Policies in Anubis define how to identify and handle different types of traffic. They consist of bot rules (pattern-based detection) and thresholds (weight-based triggers).

Policy Structure

Anubis policies are configured in YAML and loaded at startup:
bots:
  - name: verified-googlebot
    remote_addresses:
      - "66.249.64.0/19"
    action: ALLOW

  - name: suspicious-user-agents
    user_agent_regex: "(curl|wget|scrapy)"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 3

thresholds:
  - name: high-suspicion
    expression: "weight >= 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 4

Bot Rules

Bot rules are evaluated sequentially. The first matching rule with a terminal action (ALLOW, DENY, CHALLENGE, DEBUG_BENCHMARK) determines the request’s fate.

Rule Definition

// From lib/config/config.go:58
type BotConfig struct {
    UserAgentRegex *string           `json:"user_agent_regex,omitempty"`
    PathRegex      *string           `json:"path_regex,omitempty"`
    HeadersRegex   map[string]string `json:"headers_regex,omitempty"`
    Expression     *ExpressionOrList `json:"expression,omitempty"`
    Challenge      *ChallengeRules   `json:"challenge,omitempty"`
    Weight         *Weight           `json:"weight,omitempty"`
    GeoIP          *GeoIP            `json:"geoip,omitempty"`
    ASNs           *ASNs             `json:"asns,omitempty"`
    Name           string            `json:"name"`
    Action         Rule              `json:"action"`
    RemoteAddr     []string          `json:"remote_addresses,omitempty"`
}

Matching Conditions

Rules can match requests using multiple conditions (AND logic):
Match against the User-Agent header:
- name: block-python-scrapers
  user_agent_regex: "python-requests|httpx|aiohttp"
  action: DENY
Implementation: lib/policy/checker.go:NewUserAgentChecker()
Match against the request path:
- name: protect-admin
  path_regex: "^/admin/.*"
  action: CHALLENGE
  challenge:
    difficulty: 5
Implementation: lib/policy/checker.go:NewPathChecker()
Match multiple headers with different patterns:
- name: api-key-check
  headers_regex:
    X-API-Key: "^key-[a-f0-9]{32}$"
    Accept: "application/json"
  action: ALLOW
Use .* to check if a header exists:
headers_regex:
  X-Custom-Header: ".*"  # Just check presence
Implementation: lib/policy/checker.go:NewHeadersChecker()
Match against client IP addresses:
- name: internal-network
  remote_addresses:
    - "10.0.0.0/8"
    - "172.16.0.0/12"
    - "192.168.0.0/16"
  action: ALLOW
Implementation: lib/policy/checker.go:NewRemoteAddrChecker() using gaissmai/bart prefix table
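As a rough illustration of what a remote_addresses check does, the sketch below tests a client IP against a list of CIDR prefixes using only Go's standard net/netip package. It is not the Anubis implementation (which, as noted above, uses a gaissmai/bart prefix table); the helper name matchesAny is hypothetical and the lookup is a plain linear scan.

```go
package main

import (
	"fmt"
	"net/netip"
)

// matchesAny reports whether addr falls inside any of the given CIDR prefixes.
// Hypothetical helper for illustration: a linear scan, not a prefix table.
func matchesAny(addr string, cidrs []string) (bool, error) {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return false, err
	}
	for _, c := range cidrs {
		prefix, err := netip.ParsePrefix(c)
		if err != nil {
			return false, err
		}
		if prefix.Contains(ip) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	internal := []string{"10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"}
	for _, addr := range []string{"192.168.1.20", "203.0.113.7"} {
		ok, err := matchesAny(addr, internal)
		fmt.Println(addr, ok, err)
	}
}
```

A prefix table becomes worthwhile when the list of ranges is long, since a linear scan costs one comparison per configured range on every request.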
Advanced matching with Common Expression Language:
- name: rate-limit-trigger
  expression:
    - "req.headers['x-forwarded-for'].size() > 0"
    - "req.path.startsWith('/api/')"
    - "req.method in ['POST', 'PUT', 'DELETE']"
  action: WEIGH
  weight:
    adjust: 5
Available variables:
  • req.method (string)
  • req.path (string)
  • req.headers (map)
  • req.query (map)
  • DNS lookups (via expressions)
Implementation: lib/policy/celchecker.go:NewCELChecker()
Match by country code (requires Thoth):
- name: block-regions
  geoip:
    countries:
      - CN
      - RU
  action: DENY
Requires: Thoth service configured via ANUBIS_THOTH_URL
Implementation: lib/thoth/geoipchecker.go
Match by Autonomous System Number:
- name: cloud-providers
  asns:
    match:
      - 16509  # Amazon AWS
      - 15169  # Google Cloud
      - 8075   # Microsoft Azure
  action: CHALLENGE
  challenge:
    difficulty: 2
Requires: Thoth service
Implementation: lib/thoth/asnchecker.go
All conditions within a single bot rule are AND-ed together. If you specify both user_agent_regex and path_regex, the request must match both.
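The AND semantics can be pictured as a list of small predicates that must all pass. The following is a minimal sketch under simplifying assumptions: the request, check, and allOf names are hypothetical and stand in for whatever Anubis builds internally from a rule's conditions.

```go
package main

import (
	"fmt"
	"regexp"
)

// request is a pared-down stand-in for an HTTP request.
type request struct {
	UserAgent string
	Path      string
}

// check is a hypothetical stand-in for one matching condition.
type check func(r request) bool

// allOf returns a check that passes only when every condition passes.
func allOf(checks ...check) check {
	return func(r request) bool {
		for _, c := range checks {
			if !c(r) {
				return false
			}
		}
		return true
	}
}

// userAgentMatches builds a condition from a user_agent_regex-style pattern.
func userAgentMatches(pattern string) check {
	re := regexp.MustCompile(pattern)
	return func(r request) bool { return re.MatchString(r.UserAgent) }
}

// pathMatches builds a condition from a path_regex-style pattern.
func pathMatches(pattern string) check {
	re := regexp.MustCompile(pattern)
	return func(r request) bool { return re.MatchString(r.Path) }
}

func main() {
	rule := allOf(
		userAgentMatches(`curl|wget`),
		pathMatches(`^/admin/.*`),
	)

	// Both conditions hold, so the rule matches.
	fmt.Println(rule(request{UserAgent: "curl/8.5.0", Path: "/admin/users"}))
	// The user agent matches but the path does not, so the rule does not match.
	fmt.Println(rule(request{UserAgent: "curl/8.5.0", Path: "/public"}))
}
```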

Rule Validation

Rules are validated on startup:
// From lib/config/config.go:95
func (b *BotConfig) Valid() error {
    var errs []error
    
    if b.Name == "" {
        errs = append(errs, ErrBotMustHaveName)
    }
    
    // Must have at least one matching condition
    allFieldsEmpty := b.UserAgentRegex == nil &&
        b.PathRegex == nil &&
        len(b.RemoteAddr) == 0 &&
        len(b.HeadersRegex) == 0 &&
        b.ASNs == nil &&
        b.GeoIP == nil
    
    if allFieldsEmpty && b.Expression == nil {
        errs = append(errs, ErrBotMustHaveUserAgentOrPath)
    }
    
    // Validate regexes compile
    if b.UserAgentRegex != nil {
        if _, err := regexp.Compile(*b.UserAgentRegex); err != nil {
            errs = append(errs, ErrInvalidUserAgentRegex, err)
        }
    }
    
    return errors.Join(errs...)
}

Actions

Bot rules can specify five different actions:
ALLOW
Immediately proxy the request to upstream without challenge.
- name: verified-bot
  remote_addresses:
    - "66.249.64.0/19"
  action: ALLOW
DENY
Block the request with a 403 Forbidden response.
- name: blacklisted-ips
  remote_addresses:
    - "203.0.113.0/24"
  action: DENY
CHALLENGE
Issue a proof-of-work challenge. Requires challenge configuration.
- name: suspicious-bot
  user_agent_regex: "bot|crawler|spider"
  action: CHALLENGE
  challenge:
    algorithm: fast
    difficulty: 3
WEIGH
Adjust the request’s suspicion weight and continue evaluation.
- name: missing-common-headers
  expression:
    - "!('accept-language' in req.headers)"
  action: WEIGH
  weight:
    adjust: 5
Default weight adjustment is 5 if not specified.
DEBUG_BENCHMARK
Render a benchmark page for testing challenge performance.
- name: benchmark-endpoint
  path_regex: "^/__benchmark$"
  action: DEBUG_BENCHMARK

Action Flow

// From lib/anubis.go:609
for _, b := range s.policy.Bots {
    match, err := b.Rules.Check(r)
    // ... handling of err is elided in this excerpt ...
    if match {
        switch b.Action {
        case config.RuleDeny, config.RuleAllow, 
             config.RuleBenchmark, config.RuleChallenge:
            // Terminal action - return immediately
            return cr("bot/"+b.Name, b.Action, weight), &b, nil
        case config.RuleWeigh:
            // Non-terminal - accumulate weight and continue
            weight += b.Weight.Adjust
        }
    }
}
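Order matters! A terminal action ends evaluation, so any WEIGH rule listed after a matching terminal rule never runs. Place ALLOW rules for verified bots first, then WEIGH rules to accumulate suspicion, and finally DENY/CHALLENGE rules.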
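To see the effect of ordering concretely, here is a small simulation of the loop above. It is a sketch with simplified, hypothetical types (rule, action, evaluate) and matches on the user agent only. It shows the same two rules producing different weights depending on which comes first.

```go
package main

import (
	"fmt"
	"strings"
)

type action string

const (
	allow     action = "ALLOW"
	challenge action = "CHALLENGE"
	weigh     action = "WEIGH"
)

// rule is a simplified, hypothetical stand-in for a compiled bot rule.
type rule struct {
	name    string
	matches func(userAgent string) bool
	action  action
	adjust  int // only meaningful for WEIGH rules
}

// evaluate walks rules in order. WEIGH rules add to the weight and continue;
// any other action ends evaluation immediately.
func evaluate(rules []rule, userAgent string) (string, action, int) {
	weight := 0
	for _, r := range rules {
		if !r.matches(userAgent) {
			continue
		}
		if r.action == weigh {
			weight += r.adjust
			continue
		}
		return "bot/" + r.name, r.action, weight
	}
	return "default/allow", allow, weight
}

func main() {
	isCurl := func(ua string) bool { return strings.Contains(ua, "curl") }
	weighFirst := []rule{
		{name: "scraper-ua", matches: isCurl, action: weigh, adjust: 10},
		{name: "challenge-curl", matches: isCurl, action: challenge},
	}
	challengeFirst := []rule{weighFirst[1], weighFirst[0]}

	// The WEIGH rule runs before the terminal rule, so weight is accumulated.
	fmt.Println(evaluate(weighFirst, "curl/8.5.0"))
	// The terminal rule returns first; the WEIGH rule is never reached.
	fmt.Println(evaluate(challengeFirst, "curl/8.5.0"))
}
```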

Thresholds

Thresholds evaluate accumulated weight from WEIGH actions using CEL expressions:
thresholds:
  - name: low-suspicion
    expression: "weight >= 5 && weight < 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 2

  - name: high-suspicion
    expression: "weight >= 10 && weight < 20"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 5

  - name: extreme-suspicion
    expression: "weight >= 20"
    action: DENY

Threshold Evaluation

// From lib/anubis.go:627
for _, t := range s.policy.Thresholds {
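    result, _, err := t.Program.ContextEval(
        r.Context(),
        &policy.ThresholdRequest{Weight: weight},
    )
    if err != nil {
        // A failed evaluation cannot match; move on to the next threshold.
        continue
    }

    // The CEL result is a generic value; only a boolean true counts as a match.
    // (types is github.com/google/cel-go/common/types.)
    matches := false
    if val, ok := result.(types.Bool); ok {
        matches = bool(val)
    }

    if matches {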
        return cr("threshold/"+t.Name, t.Action, weight), &policy.Bot{
            Challenge: t.Challenge,
            Rules:     &checker.List{},
        }, nil
    }
}
Thresholds are evaluated in order. The first matching threshold determines the action.
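Because the first match wins, an unbounded expression such as weight >= 10 listed before weight >= 20 would shadow the later threshold. The sketch below illustrates first-match selection with plain Go predicates standing in for the compiled CEL programs; the threshold type and firstMatch function are hypothetical, and each range is bounded so every threshold stays reachable.

```go
package main

import "fmt"

// threshold is a simplified stand-in: a Go predicate replaces the CEL expression.
type threshold struct {
	name    string
	matches func(weight int) bool
	action  string
}

// firstMatch returns the first threshold whose predicate accepts weight.
func firstMatch(ts []threshold, weight int) (threshold, bool) {
	for _, t := range ts {
		if t.matches(weight) {
			return t, true
		}
	}
	return threshold{}, false
}

func main() {
	ts := []threshold{
		{"low-suspicion", func(w int) bool { return w >= 5 && w < 10 }, "CHALLENGE"},
		{"high-suspicion", func(w int) bool { return w >= 10 && w < 20 }, "CHALLENGE"},
		{"extreme-suspicion", func(w int) bool { return w >= 20 }, "DENY"},
	}
	for _, w := range []int{3, 7, 12, 25} {
		if t, ok := firstMatch(ts, w); ok {
			fmt.Printf("weight %d -> threshold/%s (%s)\n", w, t.name, t.action)
		} else {
			fmt.Printf("weight %d -> no threshold matched\n", w)
		}
	}
}
```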

Threshold Definition

// From lib/config/threshold.go:32
type Threshold struct {
    Expression *ExpressionOrList `json:"expression"`
    Challenge  *ChallengeRules   `json:"challenge"`
    Name       string            `json:"name"`
    Action     Rule              `json:"action"`
}
Thresholds cannot use the WEIGH action. Configuring one produces a validation error at config load time:
if t.Action == RuleWeigh {
    errs = append(errs, ErrThresholdCannotHaveWeighAction)
}

Rule Hashing

Each bot rule is hashed to detect policy changes:
// From lib/policy/bot.go:19
func (b Bot) Hash() string {
    return internal.FastHash(fmt.Sprintf("%s::%s", b.Name, b.Rules.Hash()))
}
This hash is embedded in JWTs. When you update your policy:
  1. Rule hash changes
  2. Existing JWTs with old hash fail validation
  3. Clients must re-solve challenges
This prevents bypassing updated security rules with old tokens.
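The invalidation steps can be sketched as follows. This is an illustration only: SHA-256 is an assumption and is not necessarily what internal.FastHash uses, and ruleHash and tokenStillValid are hypothetical helpers rather than Anubis functions. The rule contents are represented by an arbitrary string.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ruleHash mirrors the "name::rulesHash" shape shown above.
// SHA-256 is an assumption made for this sketch.
func ruleHash(name, rulesHash string) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%s::%s", name, rulesHash)))
	return hex.EncodeToString(sum[:])
}

// tokenStillValid is a hypothetical check: a token is accepted only if the
// hash it carries equals the hash of the rule as currently configured.
func tokenStillValid(tokenHash, currentHash string) bool {
	return tokenHash == currentHash
}

func main() {
	// Hash recorded in a token when the challenge was solved.
	issued := ruleHash("suspicious-user-agents", "ua:(curl|wget|scrapy)")
	// The operator later edits the rule's regex, which changes the hash.
	current := ruleHash("suspicious-user-agents", "ua:(curl|wget|scrapy|httpx)")

	fmt.Println(tokenStillValid(issued, issued))  // policy unchanged
	fmt.Println(tokenStillValid(issued, current)) // policy updated
}
```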

Check Result

Policy evaluation returns a CheckResult:
// From lib/policy/checkresult.go:9
type CheckResult struct {
    Name   string       // e.g., "bot/suspicious-crawler" or "threshold/high-suspicion"
    Rule   config.Rule  // ALLOW, DENY, CHALLENGE, etc.
    Weight int          // Accumulated weight
}
This is logged and exposed in Prometheus metrics:
anubis_policy_results{rule="bot/verified-googlebot",action="ALLOW"} 1234
anubis_policy_results{rule="threshold/high-suspicion",action="CHALLENGE"} 567

Import Statements

Reuse bot rules across multiple configs:
# main-policy.yaml
bots:
  - import: "(data)/verified-bots.yaml"  # Built-in
  - import: "/etc/anubis/custom-rules.yaml"  # External
  
  - name: site-specific-rule
    path_regex: "^/protected/"
    action: CHALLENGE
# verified-bots.yaml
- name: googlebot
  remote_addresses:
    - "66.249.64.0/19"
  action: ALLOW

- name: bingbot
  remote_addresses:
    - "40.77.167.0/24"
  action: ALLOW
Use the (data)/ prefix to import built-in bot policies shipped with Anubis. These are embedded at compile time.

CEL Expressions

Anubis supports Common Expression Language for advanced matching:

Available Functions

req.path.startsWith('/api/')
req.headers['user-agent'].contains('Mobile')
req.method.matches('^(POST|PUT|DELETE)$')

Environment Variables

Expressions have access to:
// From lib/policy/expressions/
- req.method (string)
- req.path (string)
- req.headers (map<string, string>)
- req.query (map<string, string>)
- loadavg() (float, Linux only)
- dns.forward(hostname) ([]string)
- dns.reverse(ip) ([]string)

Example Policies

# Escalate difficulty based on behavior
bots:
  # Known good bots
  - import: "(data)/verified-bots.yaml"
  
  # Add suspicion for missing headers
  - name: missing-language
    expression:
      - "!('accept-language' in req.headers)"
    action: WEIGH
    weight:
      adjust: 3
  
  - name: missing-encoding
    expression:
      - "!('accept-encoding' in req.headers)"
    action: WEIGH
    weight:
      adjust: 3
  
  # Add suspicion for scraper user agents
  - name: scraper-ua
    user_agent_regex: "(curl|wget|python|scrapy)"
    action: WEIGH
    weight:
      adjust: 10

thresholds:
  # Light challenge for moderate suspicion
  - name: moderate
    expression: "weight >= 5 && weight < 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 2
  
  # Heavy challenge for high suspicion
  - name: high
    expression: "weight >= 10"
    action: CHALLENGE
    challenge:
      algorithm: fast
      difficulty: 4

Default Behavior

If no bot rules or thresholds match, Anubis allows the request:
// From lib/anubis.go:648
return cr("default/allow", config.RuleAllow, weight), &policy.Bot{
    Challenge: &config.ChallengeRules{
        Difficulty: s.policy.DefaultDifficulty,
        Algorithm:  config.DefaultAlgorithm,
    },
    Rules: &checker.List{},
}, nil
This “default allow” behavior means Anubis is not a firewall by itself. It only challenges/blocks traffic that matches your rules. Combine it with proper network security.

Metrics and Monitoring

Policy decisions are tracked:
# Rule application counts
anubis_policy_results{rule="bot/verified-googlebot",action="ALLOW"}
anubis_policy_results{rule="bot/suspicious-crawler",action="CHALLENGE"}
anubis_policy_results{rule="threshold/high-suspicion",action="CHALLENGE"}
anubis_policy_results{rule="bot/blocklist",action="DENY"}
Request headers include policy metadata:
X-Anubis-Rule: bot/suspicious-crawler
X-Anubis-Action: CHALLENGE
X-Anubis-Status: PASS
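An application behind Anubis could read these headers for logging or debugging. The sketch below assumes the three headers arrive on the proxied request exactly as listed above; policyInfo and simulate are hypothetical helpers, and the request is built locally with net/http/httptest rather than coming from a real Anubis instance.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// policyInfo reads the policy metadata headers from an incoming request.
func policyInfo(r *http.Request) string {
	return fmt.Sprintf("rule=%s action=%s status=%s",
		r.Header.Get("X-Anubis-Rule"),
		r.Header.Get("X-Anubis-Action"),
		r.Header.Get("X-Anubis-Status"))
}

// simulate builds a request carrying the three headers, as a stand-in for
// what the upstream would receive, and returns the handler's response body.
func simulate(rule, action, status string) string {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, policyInfo(r))
	})

	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("X-Anubis-Rule", rule)
	req.Header.Set("X-Anubis-Action", action)
	req.Header.Set("X-Anubis-Status", status)

	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {
	fmt.Println(simulate("bot/suspicious-crawler", "CHALLENGE", "PASS"))
}
```

Treat these values as informational only: unless the upstream is reachable exclusively through Anubis, a client could send the same headers itself.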

Best Practices

Order Rules Carefully

Place ALLOW rules first, then WEIGH, then terminal actions.

Use Imports

Reuse verified bot lists with import: "(data)/verified-bots.yaml".

Start Conservative

Begin with WEIGH actions and observe metrics before adding DENY rules.

Test Expressions

Use the DEBUG_BENCHMARK action on a test endpoint to verify that a CEL expression matches.

Next Steps

Challenges

Configure challenge difficulty and algorithms

Architecture

Understand how policies integrate with the proxy
