Overview

Manifest uses a multi-dimensional scoring algorithm to analyze incoming LLM requests and route them to the most appropriate model tier. The system evaluates 23 distinct dimensions spanning keyword analysis, structural patterns, and contextual signals.

Tier System

Requests are classified into four tiers based on complexity:

Simple

Score: < -0.10
Basic questions, greetings, acknowledgments. Best for fast, cheap models.

Standard

Score: -0.10 to 0.08
Code generation, tool use, standard queries. Mid-tier models.

Complex

Score: 0.08 to 0.35
Large context, nested logic, multi-step workflows. Advanced models.

Reasoning

Score: > 0.35
Formal proofs, deep analysis, complex problem-solving. Frontier models.
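
The tier boundaries above can be sketched as a small lookup. This is an illustration rather than the actual implementation: `tierForScore` is a hypothetical name, and because the ranges share the endpoint 0.08, treating exactly 0.08 as Complex is an assumption.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

// Boundary values as documented in the tier list above.
const boundaries = { simpleMax: -0.10, standardMax: 0.08, complexMax: 0.35 };

// Hypothetical helper: maps a final score to a tier.
function tierForScore(score: number): Tier {
  if (score < boundaries.simpleMax) return "simple"; // Simple: score < -0.10
  if (score < boundaries.standardMax) return "standard"; // assumption: 0.08 itself is Complex
  if (score <= boundaries.complexMax) return "complex"; // Reasoning requires score > 0.35
  return "reasoning";
}
```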

Scoring Algorithm

23 Dimensions

The algorithm evaluates these dimensions (source: packages/backend/src/routing/scorer/config.ts:112-137):

Keyword Dimensions (14)

These dimensions scan message content for specific keyword patterns:
| Dimension | Weight | Direction | Purpose |
| --- | --- | --- | --- |
| formalLogic | 0.07 | | Detects formal proofs, theorems, mathematical reasoning |
| analyticalReasoning | 0.06 | | Identifies comparison, evaluation, trade-off analysis |
| codeGeneration | 0.06 | | Recognizes code creation requests |
| codeReview | 0.05 | | Finds debugging, error analysis, optimization |
| technicalTerms | 0.07 | | Counts technical vocabulary (kubernetes, GraphQL, etc.) |
| simpleIndicators | 0.08 | ↓ | Detects basic queries (“what is”, “hello”, “thanks”) |
| multiStep | 0.07 | | Identifies sequential workflows (“first”, “then”, “step 1”) |
| creative | 0.03 | | Finds creative tasks (story, poem, brainstorm) |
| questionComplexity | 0.03 | | Detects complex question structures |
| imperativeVerbs | 0.02 | | Counts action verbs (build, deploy, configure) |
| outputFormat | 0.02 | | Recognizes format requests (JSON, YAML, table) |
| domainSpecificity | 0.05 | | Finds domain-specific terms (HIPAA, regression, genome) |
| agenticTasks | 0.03 | | Identifies agent-like operations (triage, orchestrate) |
| relay | 0.02 | | Detects simple forwarding (“just say”, “notify”) |

Direction: ↑ increases complexity score, ↓ decreases it. The simpleIndicators dimension is the strongest downward signal (weight 0.08).

Structural Dimensions (5)

Analyze message structure without keyword matching:
| Dimension | Weight | Purpose | Source |
| --- | --- | --- | --- |
| tokenCount | 0.05 | Longer messages suggest complexity | scoring.ts:12-19 |
| nestedListDepth | 0.03 | Multi-level lists indicate structured requirements | scoring.ts:21-35 |
| conditionalLogic | 0.03 | “if/then”, “unless”, “depending on” patterns | scoring.ts:37-63 |
| codeToProse | 0.02 | Ratio of code blocks to prose text | scoring.ts:65-89 |
| constraintDensity | 0.03 | Frequency of constraints (“at least”, “must be”) | scoring.ts:91-124 |

Contextual Dimensions (4)

Leverage request metadata:
| Dimension | Weight | Purpose | Source |
| --- | --- | --- | --- |
| expectedOutputLength | 0.04 | Signals like “comprehensive”, high max_tokens | contextual.ts:15-36 |
| repetitionRequests | 0.02 | Requests for multiple variations (“10 examples”) | contextual.ts:38-50 |
| toolCount | 0.04 | Number of available tools | contextual.ts:52-77 |
| conversationDepth | 0.03 | Multi-turn conversation length | contextual.ts:79-85 |

Scoring Process

1. Message Filtering

Strip system and developer roles (excluded from scoring to prevent system prompt inflation). Take only the last 10 user messages (source: proxy.service.ts:15-16,87-89).
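
A minimal sketch of this filtering step. `selectScoringMessages` and the constant names are hypothetical. The text does not say whether assistant turns count toward the limit of 10, so this version assumes every message that is not system or developer is kept.

```typescript
interface ChatMessage {
  role: string;
  content: unknown;
}

// Roles excluded from scoring, per the step above.
const EXCLUDED_ROLES = new Set(["system", "developer"]);
const MAX_SCORING_MESSAGES = 10;

// Hypothetical helper: drop excluded roles, then keep the most recent 10.
function selectScoringMessages(messages: ChatMessage[]): ChatMessage[] {
  return messages
    .filter((m) => !EXCLUDED_ROLES.has(m.role))
    .slice(-MAX_SCORING_MESSAGES);
}
```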

2. Text Extraction

Combine message content with position weighting (recent messages matter more).

3. Keyword Scanning

Use a trie data structure to efficiently match all keywords in O(n) time.
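
A minimal sketch of trie-based keyword counting, assuming case-insensitive substring matching. `buildTrie` and `countMatches` are hypothetical names, and the keywords in the usage note come from the dimension table. This simple version restarts from every character position, so it costs O(n × L) for text length n and longest keyword length L. An Aho-Corasick automaton is the standard way to make the scan strictly linear.

```typescript
interface TrieNode {
  children: Map<string, TrieNode>;
  // Names of dimensions that have a keyword ending at this node
  dimensions: string[];
}

function createNode(): TrieNode {
  return { children: new Map<string, TrieNode>(), dimensions: [] };
}

// Insert every keyword of every dimension into one shared trie.
function buildTrie(keywords: Record<string, string[]>): TrieNode {
  const root = createNode();
  for (const dimension of Object.keys(keywords)) {
    for (const word of keywords[dimension]) {
      const lower = word.toLowerCase();
      let node = root;
      for (let i = 0; i < lower.length; i++) {
        const ch = lower.charAt(i);
        let next = node.children.get(ch);
        if (!next) {
          next = createNode();
          node.children.set(ch, next);
        }
        node = next;
      }
      node.dimensions.push(dimension);
    }
  }
  return root;
}

// Walk the trie from each start position and count keyword hits per dimension.
function countMatches(root: TrieNode, text: string): Map<string, number> {
  const counts = new Map<string, number>();
  const lower = text.toLowerCase();
  for (let start = 0; start < lower.length; start++) {
    let node: TrieNode | undefined = root;
    for (let i = start; i < lower.length; i++) {
      node = node.children.get(lower.charAt(i));
      if (!node) break;
      for (const dim of node.dimensions) {
        counts.set(dim, (counts.get(dim) ?? 0) + 1);
      }
    }
  }
  return counts;
}
```

For example, with `multiStep: ["first", "then", "step 1"]`, the text "First do X, then do Y. Then stop." yields three `multiStep` matches. A production matcher would likely also enforce word boundaries, which this sketch does not.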

4. Dimension Scoring

  • Keyword dimensions: Count matches, apply density bonuses, weight by message position
  • Structural dimensions: Parse text for patterns (lists, conditionals, code blocks)
  • Contextual dimensions: Analyze metadata (tools, max_tokens, conversation length)

5. Weighted Aggregation

rawScore = Σ(dimension.rawScore × dimension.weight) across all 23 dimensions.
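
The weighted sum can be sketched as a single reduction. `DimensionScore` and `aggregate` are hypothetical names, and the sketch assumes each per-dimension raw score is signed, so that downward signals such as simpleIndicators contribute negative values.

```typescript
interface DimensionScore {
  name: string;
  rawScore: number; // assumption: signed, negative for downward signals
  weight: number;
}

// Sum of rawScore × weight across all supplied dimensions.
function aggregate(dimensions: DimensionScore[]): number {
  return dimensions.reduce((sum, d) => sum + d.rawScore * d.weight, 0);
}

// Illustrative inputs only; the raw scores are made up for the example.
const exampleScore = aggregate([
  { name: "codeGeneration", rawScore: 1.0, weight: 0.06 },
  { name: "technicalTerms", rawScore: 0.5, weight: 0.07 },
  { name: "simpleIndicators", rawScore: -1.0, weight: 0.08 },
]);
// 0.06 + 0.035 - 0.08 = 0.015 (up to floating-point rounding)
```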

6. Momentum Application

Adjust score based on recent tier history (see Momentum section below).

7. Override Rules

Apply tier floors:
  • Tool use (when tool_choice ≠ 'none'): min tier = Standard
  • Large context (>50K tokens): min tier = Complex
  • Formal logic keywords: force tier = Reasoning
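
A sketch of how these floors might be applied after scoring. `OverrideSignals`, `atLeast`, and `applyOverrides` are hypothetical names, and the inputs are assumed to have been computed earlier in the pipeline.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

const TIER_ORDER: Tier[] = ["simple", "standard", "complex", "reasoning"];

interface OverrideSignals {
  usesTools: boolean; // tools supplied and tool_choice is not 'none'
  contextTokens: number; // estimated size of the request context
  hasFormalLogicKeywords: boolean;
}

// Raise a tier to the given floor, never lower it.
function atLeast(tier: Tier, floor: Tier): Tier {
  return TIER_ORDER.indexOf(tier) >= TIER_ORDER.indexOf(floor) ? tier : floor;
}

function applyOverrides(tier: Tier, signals: OverrideSignals): Tier {
  if (signals.hasFormalLogicKeywords) return "reasoning"; // forced tier
  let result = tier;
  if (signals.usesTools) result = atLeast(result, "standard");
  if (signals.contextTokens > 50000) result = atLeast(result, "complex");
  return result;
}
```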

8. Confidence Check

If confidence < 0.45, default to Standard tier with reason = “ambiguous”.
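
The fallback itself is simple to sketch. How confidence is computed is not described here, so the sketch takes it as an input. `Decision` and `applyConfidenceCheck` are hypothetical names.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

interface Decision {
  tier: Tier;
  confidence: number;
  reason: string;
}

const CONFIDENCE_THRESHOLD = 0.45;

// Low-confidence decisions fall back to Standard with reason "ambiguous".
function applyConfidenceCheck(decision: Decision): Decision {
  if (decision.confidence < CONFIDENCE_THRESHOLD) {
    return { ...decision, tier: "standard", reason: "ambiguous" };
  }
  return decision;
}
```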

Example Scoring

// Simple greeting
Input: { messages: [{ role: "user", content: "Hello!" }] }
Score: -0.30 | Tier: simple | Confidence: 0.90 | Reason: short_message

// Code generation
Input: { 
  messages: [{ 
    role: "user", 
    content: "Write a TypeScript function to parse CSV files" 
  }]
}
Score: 0.05 | Tier: standard | Confidence: 0.78 | Reason: scored
Matched: codeGeneration(0.06), technicalTerms(0.04), imperativeVerbs(0.02)

// Complex analysis
Input: {
  messages: [{ 
    role: "user",
    content: "Compare the trade-offs between microservices and monolithic architectures. Analyze latency, scalability, and operational complexity."
  }]
}
Score: 0.22 | Tier: complex | Confidence: 0.85 | Reason: scored
Matched: analyticalReasoning(0.06), technicalTerms(0.07), multiStep(0.04)

// Formal proof
Input: {
  messages: [{ 
    role: "user",
    content: "Prove by induction that the sum of the first n integers is n(n+1)/2"
  }]
}
Score: 0.50 | Tier: reasoning | Confidence: 0.95 | Reason: formal_logic_override

Momentum System

Momentum tracks recent tier assignments per session to provide continuity (source: routing.ts:13-19,84-129):
  • Storage: In-memory map keyed by session ID
  • Capacity: Last 5 tier assignments
  • TTL: 30 minutes of inactivity
  • Effect: Biases scoring toward recent tiers to maintain context

How It Works

  1. Short Message Bypass: Messages under 50 chars without tools normally default to Simple tier. With momentum, the score is adjusted upward if recent tiers suggest ongoing complex work.
  2. Stability: Prevents rapid tier switching in multi-turn conversations.
  3. Cleanup: Background timer purges stale entries every 5 minutes.
// Abridged from routing.ts:53-146. Setup, error handling, and the definitions of
// baseUrl, momentum, MOMENTUM_TTL_MS, and MOMENTUM_MAX are not shown.
export async function resolveRouting(
  config: ManifestConfig,
  messages: unknown[],
  sessionKey: string,
  logger: PluginLogger,
): Promise<{ tier: string; model: string; provider: string; reason: string } | null> {
  // Get recent tiers from the momentum map, ignoring expired entries
  const entry = momentum.get(sessionKey);
  const recentTiers = entry && Date.now() - entry.lastUpdated < MOMENTUM_TTL_MS
    ? entry.tiers
    : undefined;

  // Score with momentum
  const response = await fetch(`${baseUrl}/api/v1/routing/resolve`, {
    method: 'POST',
    body: JSON.stringify({ messages, recentTiers })
  });
  const data = await response.json();

  // Update momentum with the new tier (newest first, capped at MOMENTUM_MAX)
  const existing = momentum.get(sessionKey);
  if (existing) {
    existing.tiers = [data.tier, ...existing.tiers].slice(0, MOMENTUM_MAX);
    existing.lastUpdated = Date.now();
  }

  return data;
}
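
The storage rules listed above (last 5 tiers, 30-minute TTL, periodic cleanup) can be sketched as a small in-memory store. The names `momentum`, `MOMENTUM_MAX`, and `MOMENTUM_TTL_MS` mirror the excerpt. The helper functions are hypothetical, and creating an entry on first use is an assumption.

```typescript
type Tier = "simple" | "standard" | "complex" | "reasoning";

interface MomentumEntry {
  tiers: Tier[]; // newest first
  lastUpdated: number; // epoch milliseconds
}

const MOMENTUM_MAX = 5;
const MOMENTUM_TTL_MS = 30 * 60 * 1000;
const momentum = new Map<string, MomentumEntry>();

// Record a tier for a session, keeping only the most recent MOMENTUM_MAX.
function recordTier(sessionKey: string, tier: Tier, now: number = Date.now()): void {
  const existing = momentum.get(sessionKey);
  if (existing) {
    existing.tiers = [tier, ...existing.tiers].slice(0, MOMENTUM_MAX);
    existing.lastUpdated = now;
  } else {
    momentum.set(sessionKey, { tiers: [tier], lastUpdated: now });
  }
}

// Return recent tiers only while the entry is within its TTL.
function getRecentTiers(sessionKey: string, now: number = Date.now()): Tier[] | undefined {
  const entry = momentum.get(sessionKey);
  return entry && now - entry.lastUpdated < MOMENTUM_TTL_MS ? entry.tiers : undefined;
}

// Remove expired entries and report how many were purged.
function purgeStale(now: number = Date.now()): number {
  let removed = 0;
  momentum.forEach((entry, key) => {
    if (now - entry.lastUpdated >= MOMENTUM_TTL_MS) {
      momentum.delete(key);
      removed++;
    }
  });
  return removed;
}
```

A background cleanup like the one described could call `purgeStale` from `setInterval` every 5 minutes. The explicit `now` parameter is there only to make the sketch easy to test.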

Model Resolution

Once a tier is determined, Manifest selects the actual model:

Tier Assignment Flow

1. Lookup Tier Config

Query tier_assignments table for the agent’s tier configuration (source: resolve.service.ts:31-32).

2. Choose Model

  • Override Model: User-specified model for the tier (takes precedence)
  • Auto-assigned Model: Model selected by tier auto-assignment service
  • Fallback: If neither exists, return null (request fails)

3. Provider Lookup

Resolve provider name from model_pricing cache (source: resolve.service.ts:66-72).

4. API Key Retrieval

Fetch encrypted provider API key from user_providers table.
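
The precedence in step 2 of the flow above (override, then auto-assigned, then null) can be sketched as follows. The `auto_assigned_model` field mirrors the column named in this document. `override_model`, `TierAssignment`, and `chooseModel` are hypothetical names.

```typescript
interface TierAssignment {
  tier: string;
  override_model: string | null; // hypothetical column name for the user override
  auto_assigned_model: string | null;
}

// Returns the model to use for a tier, or null when nothing is configured.
function chooseModel(assignment: TierAssignment | undefined): string | null {
  if (!assignment) return null;
  return assignment.override_model ?? assignment.auto_assigned_model ?? null;
}
```

A null result corresponds to the fallback case in which the request fails.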

Tier Auto-Assignment

The TierAutoAssignService maintains optimal model selections:
  • Data Source: model_pricing entity with input/output costs
  • Selection Criteria: Balance cost and capability (lowest cost per tier)
  • Update Frequency: Runs on pricing sync
  • Storage: tier_assignments.auto_assigned_model

Proxy Mode

Manifest can act as an OpenAI-compatible proxy for transparent routing:

Endpoint

POST /v1/chat/completions
Authorization: Bearer mnfst_...

Flow

  1. Gateway sends request to Manifest with model: "auto"
  2. Manifest scores messages using the 23-dimension algorithm
  3. Resolves actual model (e.g., claude-3-5-sonnet-20241022)
  4. Forwards to real provider with provider’s API key
  5. Streams response back to gateway
  6. Records momentum for session continuity
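
From the client side, step 1 of this flow is an ordinary OpenAI-style request with the model set to "auto". The sketch below only builds the request. The base URL and key are placeholders, and `buildChatRequest` is a hypothetical helper.

```typescript
// Placeholder values; substitute your own deployment URL and Manifest key.
const MANIFEST_BASE_URL = "http://localhost:3000";
const MANIFEST_API_KEY = "mnfst_example";

// Builds the URL and fetch options for a proxied chat completion.
function buildChatRequest(content: string) {
  return {
    url: `${MANIFEST_BASE_URL}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${MANIFEST_API_KEY}`,
      },
      body: JSON.stringify({
        model: "auto", // lets Manifest pick the tier and model
        messages: [{ role: "user", content }],
      }),
    },
  };
}
```

The result can be passed straight to `fetch(request.url, request.init)`.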

Heartbeat Detection

OpenClaw gateways send periodic heartbeat requests to keep connections alive. These contain the sentinel string "HEARTBEAT_OK" and are automatically routed to Simple tier without scoring (source: proxy.service.ts:91-106).
const isHeartbeat = scoringMessages.some((m) => {
  if (m.role !== 'user') return false;
  if (typeof m.content === 'string') return m.content.includes('HEARTBEAT_OK');
  // Also handles multi-modal content format
});

const resolved = isHeartbeat
  ? await resolveService.resolveForTier(agentId, 'simple')
  : await resolveService.resolve(agentId, messages, ...);
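
The excerpt elides the multi-modal branch. One way it might look, assuming OpenAI-style content parts where text parts have the shape `{ type: "text", text: string }`, is sketched below. `messageText` and `isHeartbeat` are hypothetical helpers, not the code in proxy.service.ts.

```typescript
type ContentPart = { type: string; text?: string };

interface ChatMessage {
  role: string;
  content: string | ContentPart[] | null;
}

const HEARTBEAT_SENTINEL = "HEARTBEAT_OK";

// Flatten string or multi-part content into plain text.
function messageText(content: ChatMessage["content"]): string {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return "";
  return content
    .filter((part) => part.type === "text")
    .map((part) => part.text ?? "")
    .join("\n");
}

// True when any user message carries the heartbeat sentinel.
function isHeartbeat(messages: ChatMessage[]): boolean {
  return messages.some(
    (m) => m.role === "user" && messageText(m.content).includes(HEARTBEAT_SENTINEL),
  );
}
```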

Rate Limiting & Alerts

Notification Rules

Set thresholds for automatic alerts:
  • Metrics: Token count, cost, message count, error rate
  • Periods: Hourly, daily, weekly, monthly
  • Channels: Email (Mailgun, Resend, SMTP)
  • Enforcement: Proxy checks limits before routing (returns 429 if exceeded)

Limit Check

When a request arrives at the proxy:
// Source: proxy.service.ts:58-79
const exceeded = await this.limitCheck.checkLimits(tenantId, agentName);
if (exceeded) {
  throw new HttpException(
    {
      error: {
        message: `Limit exceeded: ${exceeded.metricType} usage exceeds threshold`,
        type: 'rate_limit_exceeded',
        code: 'limit_exceeded',
      },
    },
    429,
  );
}

Performance

  • Keyword Matching: O(n) using trie data structure
  • Scoring Latency: Less than 10ms for typical requests
  • Momentum Lookup: O(1) in-memory map access
  • Database Queries: Cached tier assignments (5-minute TTL)
  • Resolve Timeout: 3 seconds (plugin-side, source: routing.ts:20)

Configuration

The scoring algorithm is configurable via the ScorerConfig interface:
// Default config: packages/backend/src/routing/scorer/config.ts:112-142
export const DEFAULT_CONFIG: ScorerConfig = {
  dimensions: [...], // 23 dimensions with weights
  boundaries: { 
    simpleMax: -0.10, 
    standardMax: 0.08, 
    complexMax: 0.35 
  },
  confidenceK: 8,
  confidenceMidpoint: 0.15,
  confidenceThreshold: 0.45,
};

Adjusting Thresholds

To customize routing behavior, modify:
  • Dimension weights: Increase/decrease influence of specific signals
  • Tier boundaries: Adjust score ranges for tiers
  • Confidence threshold: Change ambiguity handling
Changing scorer config requires backend code changes. User-configurable thresholds are planned for a future release.
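
As an illustration of a boundary adjustment, the sketch below derives a modified config from the defaults. `ScorerConfigSketch` omits the dimensions array, `withBoundaries` is a hypothetical helper, and the ordering check is an addition of this sketch rather than documented behavior.

```typescript
interface Boundaries {
  simpleMax: number;
  standardMax: number;
  complexMax: number;
}

// Simplified stand-in for ScorerConfig (dimensions omitted).
interface ScorerConfigSketch {
  boundaries: Boundaries;
  confidenceK: number;
  confidenceMidpoint: number;
  confidenceThreshold: number;
}

const DEFAULTS: ScorerConfigSketch = {
  boundaries: { simpleMax: -0.10, standardMax: 0.08, complexMax: 0.35 },
  confidenceK: 8,
  confidenceMidpoint: 0.15,
  confidenceThreshold: 0.45,
};

// Returns a copy of the config with some boundaries replaced.
function withBoundaries(
  base: ScorerConfigSketch,
  overrides: Partial<Boundaries>,
): ScorerConfigSketch {
  const boundaries = { ...base.boundaries, ...overrides };
  const ordered =
    boundaries.simpleMax < boundaries.standardMax &&
    boundaries.standardMax < boundaries.complexMax;
  if (!ordered) {
    throw new Error("Tier boundaries must be strictly increasing");
  }
  return { ...base, boundaries };
}

// Example: send more traffic to the Reasoning tier by lowering its threshold.
const moreReasoning = withBoundaries(DEFAULTS, { complexMax: 0.30 });
```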

Next Steps

Configure Routing

Set up tier assignments and provider connections

Monitor Costs

Track routing decisions and cost savings
