Overview
Manifest uses a multi-dimensional scoring algorithm to analyze incoming LLM requests and route them to the most appropriate model tier. The system evaluates 23 distinct dimensions spanning keyword analysis, structural patterns, and contextual signals.
Tier System
Requests are classified into four tiers based on complexity:
Simple
Score: < -0.10
Basic questions, greetings, acknowledgments. Best for fast, cheap models.
Standard
Score: -0.10 to 0.08
Code generation, tool use, standard queries. Mid-tier models.
Complex
Score: 0.08 to 0.35
Large context, nested logic, multi-step workflows. Advanced models.
Reasoning
Score: > 0.35
Formal proofs, deep analysis, complex problem-solving. Frontier models.
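The tier boundaries above can be sketched as a small mapping function. This is a minimal illustration, not Manifest's actual code: the names `Tier`, `TIER_BOUNDARIES`, and `scoreToTier` are hypothetical, and the handling of a score exactly equal to 0.08 is an assumption, since the ranges above do not say which side of that boundary is inclusive.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

// Boundaries copied from the tier list above.
const TIER_BOUNDARIES = {
  simpleBelow: -0.10,    // Simple:    score < -0.10
  standardBelow: 0.08,   // Standard:  -0.10 to 0.08
  reasoningAbove: 0.35,  // Reasoning: score > 0.35
};

function scoreToTier(score: number): Tier {
  if (score < TIER_BOUNDARIES.simpleBelow) return "simple";
  // Assumption: a score of exactly 0.08 falls into Complex.
  if (score < TIER_BOUNDARIES.standardBelow) return "standard";
  if (score <= TIER_BOUNDARIES.reasoningAbove) return "complex";
  return "reasoning";
}
```

Because Simple is defined with a strict `<` and Reasoning with a strict `>`, scores of exactly -0.10 and 0.35 map to Standard and Complex respectively in this sketch.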
Scoring Algorithm
23 Dimensions
The algorithm evaluates these dimensions (source: packages/backend/src/routing/scorer/config.ts:112-137):
Keyword Dimensions (14)
These dimensions scan message content for specific keyword patterns:

| Dimension | Weight | Direction | Purpose |
|---|---|---|---|
| formalLogic | 0.07 | ↑ | Detects formal proofs, theorems, mathematical reasoning |
| analyticalReasoning | 0.06 | ↑ | Identifies comparison, evaluation, trade-off analysis |
| codeGeneration | 0.06 | ↑ | Recognizes code creation requests |
| codeReview | 0.05 | ↑ | Finds debugging, error analysis, optimization |
| technicalTerms | 0.07 | ↑ | Counts technical vocabulary (kubernetes, GraphQL, etc.) |
| simpleIndicators | 0.08 | ↓ | Detects basic queries (“what is”, “hello”, “thanks”) |
| multiStep | 0.07 | ↑ | Identifies sequential workflows (“first”, “then”, “step 1”) |
| creative | 0.03 | ↑ | Finds creative tasks (story, poem, brainstorm) |
| questionComplexity | 0.03 | ↑ | Detects complex question structures |
| imperativeVerbs | 0.02 | ↑ | Counts action verbs (build, deploy, configure) |
| outputFormat | 0.02 | ↑ | Recognizes format requests (JSON, YAML, table) |
| domainSpecificity | 0.05 | ↑ | Finds domain-specific terms (HIPAA, regression, genome) |
| agenticTasks | 0.03 | ↑ | Identifies agent-like operations (triage, orchestrate) |
| relay | 0.02 | ↓ | Detects simple forwarding (“just say”, “notify”) |
Direction: ↑ increases the complexity score, ↓ decreases it. The simpleIndicators dimension is the strongest downward signal (weight 0.08).
Structural Dimensions (5)
These dimensions analyze message structure without keyword matching:

| Dimension | Weight | Purpose |
|---|---|---|
| tokenCount | 0.05 | Longer messages suggest complexity (scoring.ts:12-19) |
| nestedListDepth | 0.03 | Multi-level lists indicate structured requirements (scoring.ts:21-35) |
| conditionalLogic | 0.03 | “if/then”, “unless”, “depending on” patterns (scoring.ts:37-63) |
| codeToProse | 0.02 | Ratio of code blocks to prose text (scoring.ts:65-89) |
| constraintDensity | 0.03 | Frequency of constraints (“at least”, “must be”) (scoring.ts:91-124) |
Contextual Dimensions (4)
These dimensions use request metadata:

| Dimension | Weight | Purpose |
|---|---|---|
| expectedOutputLength | 0.04 | Signals like “comprehensive”, high max_tokens (contextual.ts:15-36) |
| repetitionRequests | 0.02 | Requests for multiple variations (“10 examples”) (contextual.ts:38-50) |
| toolCount | 0.04 | Number of available tools (contextual.ts:52-77) |
| conversationDepth | 0.03 | Multi-turn conversation length (contextual.ts:79-85) |
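One plausible way to combine the dimensions is a signed weighted sum. The sketch below assumes that each dimension produces a raw signal in the range 0 to 1 and that the final score is the sum of `direction × weight × signal`; the source does not state the exact formula, and `WEIGHTS` and `combineScores` are hypothetical names. Only a subset of the weights from the tables above is included.

```typescript
type Direction = 1 | -1; // 1 = raises the score (↑), -1 = lowers it (↓)

interface DimensionWeight {
  weight: number;
  direction: Direction;
}

// Subset of the weights listed in the tables above.
const WEIGHTS: Record<string, DimensionWeight> = {
  formalLogic: { weight: 0.07, direction: 1 },
  codeGeneration: { weight: 0.06, direction: 1 },
  simpleIndicators: { weight: 0.08, direction: -1 },
  relay: { weight: 0.02, direction: -1 },
  tokenCount: { weight: 0.05, direction: 1 },
  toolCount: { weight: 0.04, direction: 1 },
};

// Assumption: raw signals are normalized to [0, 1] before weighting.
function combineScores(signals: Record<string, number>): number {
  let total = 0;
  for (const [name, raw] of Object.entries(signals)) {
    const dim = WEIGHTS[name];
    if (!dim) continue; // ignore dimensions not present in the table
    const clamped = Math.min(1, Math.max(0, raw));
    total += dim.direction * dim.weight * clamped;
  }
  return total;
}
```

Under these assumptions, a message that fully triggers only simpleIndicators scores about -0.08, while one that fully triggers formalLogic and codeGeneration scores about 0.13.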
Scoring Process
Message Filtering
Strip messages with the system and developer roles (they are excluded from scoring to prevent system prompt inflation). Take only the last 10 user messages (source: proxy.service.ts:15-16,87-89).
Dimension Scoring
- Keyword dimensions: Count matches, apply density bonuses, weight by message position
- Structural dimensions: Parse text for patterns (lists, conditionals, code blocks)
- Contextual dimensions: Analyze metadata (tools, max_tokens, conversation length)
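The message filtering step that precedes dimension scoring can be sketched as follows. The `ChatMessage` shape and the function name are hypothetical. The sketch also assumes that every role other than system and developer is kept before the last 10 messages are taken; the source may restrict this to the user role only.

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

const EXCLUDED_ROLES = new Set(["system", "developer"]);
const MAX_SCORED_MESSAGES = 10;

function selectMessagesForScoring(messages: ChatMessage[]): ChatMessage[] {
  // Drop system/developer prompts so they cannot inflate the score.
  const scorable = messages.filter((m) => !EXCLUDED_ROLES.has(m.role));
  // Keep only the most recent messages.
  return scorable.slice(-MAX_SCORED_MESSAGES);
}
```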
Override Rules
Apply tier floors:
- Tool use (when tool_choice ≠ 'none'): min tier = Standard
- Large context (>50K tokens): min tier = Complex
- Formal logic keywords: force tier = Reasoning
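A minimal sketch of how these override rules might be applied after scoring. The `OverrideContext` fields and `applyOverrides` are hypothetical names, and the sketch assumes the tool-use floor applies only when the request actually carries tools.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

const TIER_ORDER: Tier[] = ["simple", "standard", "complex", "reasoning"];

interface OverrideContext {
  hasTools: boolean;               // assumption: request includes tool definitions
  toolChoice?: string;             // value of tool_choice, if present
  estimatedTokens: number;         // estimated context size
  hasFormalLogicKeywords: boolean; // formalLogic keywords were matched
}

// Returns whichever of the two tiers is higher.
function maxTier(a: Tier, b: Tier): Tier {
  return TIER_ORDER.indexOf(a) >= TIER_ORDER.indexOf(b) ? a : b;
}

function applyOverrides(scored: Tier, ctx: OverrideContext): Tier {
  // Formal logic forces the tier rather than setting a floor.
  if (ctx.hasFormalLogicKeywords) return "reasoning";
  let tier = scored;
  if (ctx.hasTools && ctx.toolChoice !== "none") tier = maxTier(tier, "standard");
  if (ctx.estimatedTokens > 50_000) tier = maxTier(tier, "complex");
  return tier;
}
```

Floors only ever raise a tier: a request already scored as Reasoning stays there even if it also has tools or a large context.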
Example Scoring
Momentum System
Momentum tracks recent tier assignments per session to provide continuity (source: routing.ts:13-19,84-129):
- Storage: In-memory map keyed by session ID
- Capacity: Last 5 tier assignments
- TTL: 30 minutes of inactivity
- Effect: Biases scoring toward recent tiers to maintain context
How It Works
- Short Message Bypass: Messages under 50 chars without tools normally default to Simple tier. With momentum, the score is adjusted upward if recent tiers suggest ongoing complex work.
- Stability: Prevents rapid tier switching in multi-turn conversations.
- Cleanup: Background timer purges stale entries every 5 minutes.
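The storage behaviour described above (in-memory map, last 5 tiers, 30-minute TTL, periodic cleanup) can be sketched like this. The class and method names are hypothetical, and the sketch covers only storage and expiry; how momentum actually biases the score is not specified here and is left out.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

const MAX_RECENT = 5;                      // last 5 tier assignments
const TTL_MS = 30 * 60 * 1000;             // 30 minutes of inactivity
const CLEANUP_INTERVAL_MS = 5 * 60 * 1000; // purge every 5 minutes

interface MomentumEntry {
  tiers: Tier[];
  lastSeen: number; // epoch milliseconds
}

class MomentumStore {
  private entries = new Map<string, MomentumEntry>();

  record(sessionId: string, tier: Tier, now: number = Date.now()): void {
    const entry = this.entries.get(sessionId) ?? { tiers: [], lastSeen: now };
    entry.tiers.push(tier);
    if (entry.tiers.length > MAX_RECENT) entry.tiers.shift(); // drop the oldest
    entry.lastSeen = now;
    this.entries.set(sessionId, entry);
  }

  recent(sessionId: string, now: number = Date.now()): Tier[] {
    const entry = this.entries.get(sessionId);
    if (!entry || now - entry.lastSeen > TTL_MS) return [];
    return [...entry.tiers];
  }

  // Removes expired sessions and returns how many were purged.
  purgeStale(now: number = Date.now()): number {
    let purged = 0;
    for (const [id, entry] of this.entries) {
      if (now - entry.lastSeen > TTL_MS) {
        this.entries.delete(id);
        purged++;
      }
    }
    return purged;
  }
}
```

A background cleanup could then be scheduled with `setInterval(() => store.purgeStale(), CLEANUP_INTERVAL_MS)`; it is omitted from the block so the sketch has no side effects.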
Model Resolution
Once a tier is determined, Manifest selects the actual model:
Tier Assignment Flow
Lookup Tier Config
Query the tier_assignments table for the agent’s tier configuration (source: resolve.service.ts:31-32).
Choose Model
- Override Model: User-specified model for the tier (takes precedence)
- Auto-assigned Model: Model selected by tier auto-assignment service
- Fallback: If neither exists, return null (request fails)
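The precedence in the list above can be sketched in a few lines. The `TierAssignment` shape is a hypothetical illustration: `autoAssignedModel` mirrors the `auto_assigned_model` column mentioned in this document, while `overrideModel` is an assumed field name.

```typescript
interface TierAssignment {
  tier: string;
  overrideModel: string | null;     // assumed name for the user-specified model
  autoAssignedModel: string | null; // mirrors tier_assignments.auto_assigned_model
}

function chooseModel(assignment: TierAssignment | undefined): string | null {
  if (!assignment) return null;
  // The override takes precedence; fall back to the auto-assigned model.
  return assignment.overrideModel ?? assignment.autoAssignedModel ?? null;
}
```

A `null` result corresponds to the fallback case, where the request fails because no model is configured for the tier.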
Tier Auto-Assignment
The TierAutoAssignService maintains optimal model selections:
- Data Source: model_pricing entity with input/output costs
- Selection Criteria: Balance cost and capability (lowest cost per tier)
- Update Frequency: Runs on pricing sync
- Storage: tier_assignments.auto_assigned_model
Proxy Mode
Manifest can act as an OpenAI-compatible proxy for transparent routing:
Endpoint
Flow
- Gateway sends request to Manifest with model: "auto"
- Manifest scores messages using the 23-dimension algorithm
- Resolves actual model (e.g., claude-3-5-sonnet-20241022)
- Forwards to real provider with provider’s API key
- Streams response back to gateway
- Records momentum for session continuity
Heartbeat Detection
OpenClaw gateways send periodic heartbeat requests to keep connections alive. These contain the sentinel string "HEARTBEAT_OK" and are automatically routed to Simple tier without scoring (source: proxy.service.ts:91-106).
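A heartbeat check of this kind might look like the sketch below. It assumes the sentinel can appear anywhere in the content of a user message; the actual matching rule in proxy.service.ts may be stricter, and `isHeartbeat` and `ChatMessage` are hypothetical names.

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

const HEARTBEAT_SENTINEL = "HEARTBEAT_OK";

// Assumption: a request counts as a heartbeat if any user message
// contains the sentinel string.
function isHeartbeat(messages: ChatMessage[]): boolean {
  return messages.some(
    (m) => m.role === "user" && m.content.includes(HEARTBEAT_SENTINEL),
  );
}
```

When the check returns true, the request would skip scoring and be routed directly to the Simple tier.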
Rate Limiting & Alerts
Notification Rules
Set thresholds for automatic alerts:
- Metrics: Token count, cost, message count, error rate
- Periods: Hourly, daily, weekly, monthly
- Channels: Email (Mailgun, Resend, SMTP)
- Enforcement: Proxy checks limits before routing (returns 429 if exceeded)
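The enforcement rule in the list above can be sketched as a pre-routing check. Every type and name here is hypothetical, and treating a limit as hit once usage reaches the threshold (rather than strictly exceeding it) is an assumption.

```typescript
type Metric = "tokens" | "cost" | "messages" | "errorRate";
type Period = "hour" | "day" | "week" | "month";

interface LimitRule {
  metric: Metric;
  period: Period;
  threshold: number;
}

// Hypothetical lookup returning current usage for a metric and period.
type UsageLookup = (metric: Metric, period: Period) => number;

type LimitDecision =
  | { allowed: true }
  | { allowed: false; status: 429; violated: LimitRule };

function checkLimits(rules: LimitRule[], usage: UsageLookup): LimitDecision {
  for (const rule of rules) {
    // Assumption: reaching the threshold already blocks the next request.
    if (usage(rule.metric, rule.period) >= rule.threshold) {
      return { allowed: false, status: 429, violated: rule };
    }
  }
  return { allowed: true };
}
```

Running the check before scoring means a blocked request never reaches a provider.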
Limit Check
When a request arrives at the proxy, its limits are checked before routing; if a limit is exceeded, the proxy returns 429.
Performance
- Keyword Matching: O(n) using trie data structure
- Scoring Latency: Less than 10ms for typical requests
- Momentum Lookup: O(1) in-memory map access
- Database Queries: Cached tier assignments (5-minute TTL)
- Resolve Timeout: 3 seconds (plugin-side, source: routing.ts:20)
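To illustrate the trie-based keyword matching mentioned in the list above, here is a minimal sketch. It is not Manifest's matcher: `KeywordTrie` is a hypothetical class, matching is case-insensitive, and word boundaries are ignored for brevity. Each starting position walks at most as many nodes as the longest keyword, so the scan is linear in the text length when keyword length is bounded.

```typescript
class TrieNode {
  children = new Map<string, TrieNode>();
  isKeyword = false;
}

class KeywordTrie {
  private root = new TrieNode();

  insert(keyword: string): void {
    const lower = keyword.toLowerCase();
    let node = this.root;
    for (let i = 0; i < lower.length; i++) {
      const ch = lower.charAt(i);
      let next = node.children.get(ch);
      if (!next) {
        next = new TrieNode();
        node.children.set(ch, next);
      }
      node = next;
    }
    node.isKeyword = true;
  }

  // Counts keyword occurrences starting at any position in the text.
  countMatches(text: string): number {
    const lower = text.toLowerCase();
    let count = 0;
    for (let start = 0; start < lower.length; start++) {
      let node: TrieNode | undefined = this.root;
      for (let i = start; i < lower.length; i++) {
        node = node.children.get(lower.charAt(i));
        if (!node) break; // no keyword continues along this path
        if (node.isKeyword) count++;
      }
    }
    return count;
  }
}

// Usage with the multiStep examples from the keyword table.
const multiStep = new KeywordTrie();
["first", "then", "step 1"].forEach((k) => multiStep.insert(k));
const matches = multiStep.countMatches("First install it, then run step 1");
```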
Configuration
The scoring algorithm is configurable via the ScorerConfig interface.
Adjusting Thresholds
To customize routing behavior, modify:
- Dimension weights: Increase/decrease influence of specific signals
- Tier boundaries: Adjust score ranges for tiers
- Confidence threshold: Change ambiguity handling
Next Steps
Configure Routing
Set up tier assignments and provider connections
Monitor Costs
Track routing decisions and cost savings