AgentOS provides a comprehensive LLM integration layer that gives you unified access to 25 providers and 47 models through a single, consistent API.

Key Features

25 LLM Providers

Connect to major providers like Anthropic, OpenAI, Google, AWS Bedrock, DeepSeek, and 20 more

47 Models

Access frontier, smart, balanced, fast, and local tier models

Intelligent Routing

Automatic model selection based on complexity scoring

Cost Optimization

Built-in usage tracking and cost management

Architecture

The LLM system is implemented across two layers:

TypeScript Router (src/llm-router.ts)

Handles OpenAI-compatible providers through a unified interface:
  • Route selection via llm::route
  • Completion handling via llm::complete
  • Automatic retry with exponential backoff
  • Cost tracking integration

Rust Router (crates/llm-router/src/main.rs)

High-performance routing engine:
  • Complexity-based model selection
  • Provider health monitoring
  • Usage statistics aggregation
  • Multi-driver support (Anthropic, OpenAI, Gemini, Bedrock)

Model Tiers

Models are organized into 5 tiers based on capability and cost:
| Tier | Use Case | Example Models |
|------|----------|----------------|
| Frontier | Most complex reasoning, research | Claude Opus 4.6, GPT o3, Gemini 2.5 Pro |
| Smart | Advanced tasks, coding | Claude Sonnet 4.6, GPT-4o, Grok-2 |
| Balanced | General purpose, cost-effective | DeepSeek Chat, Llama 3.3 70B, Command R |
| Fast | Quick responses, simple tasks | Claude Haiku 4.5, GPT-4o mini, Gemini 2.5 Flash |
| Local | Self-hosted, privacy-first | Ollama, vLLM, LM Studio |
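Tier-based routing can be pictured as a threshold map from a complexity score to a tier. A minimal sketch: the tier names come from the table above, but the score range, thresholds, and the `pickTier` helper are illustrative assumptions, not AgentOS internals.

```typescript
// Hypothetical sketch: map a complexity score in [0, 1] to a model tier.
// The thresholds below are illustrative, not the actual AgentOS scoring.
type Tier = 'frontier' | 'smart' | 'balanced' | 'fast' | 'local';

function pickTier(complexity: number, preferLocal = false): Tier {
  if (preferLocal) return 'local';       // privacy-first: always self-hosted
  if (complexity >= 0.85) return 'frontier';
  if (complexity >= 0.6) return 'smart';
  if (complexity >= 0.35) return 'balanced';
  return 'fast';
}
```

In a real router the score would be derived from signals such as message length and tool count (see `llm::route` below); the sketch only shows the final thresholding step.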

Quick Start

Using Complexity-Based Routing

import { trigger } from 'iii-sdk';

// Automatic model selection based on complexity
const selection = await trigger('llm::route', {
  message: 'Analyze this codebase and suggest improvements',
  toolCount: 15
});
// Returns: { provider: 'anthropic', model: 'claude-opus-4-6', maxTokens: 8192 }

// Execute completion
const result = await trigger('llm::complete', {
  model: selection,
  systemPrompt: 'You are a code review expert',
  messages: [{ role: 'user', content: 'Review this PR' }],
  tools: [{ id: 'file::read', description: 'Read file contents' }]
});

Direct Model Selection

const result = await trigger('llm::complete', {
  model: {
    provider: 'deepseek',
    model: 'deepseek-reasoner',
    maxTokens: 4096
  },
  messages: [{ role: 'user', content: 'Solve this math problem' }]
});

Provider Configuration

Providers are configured via environment variables:
# Required API keys
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENAI_API_KEY="sk-..."
export GOOGLE_API_KEY="AIza..."

# Optional providers
export DEEPSEEK_API_KEY="..."
export GROQ_API_KEY="..."
export XAI_API_KEY="..."

# Local providers (no key needed)
# Ollama: http://localhost:11434
# vLLM: http://localhost:8000
# LM Studio: http://localhost:1234
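A startup check can resolve which providers are usable from the environment. A small sketch, assuming only the variable names and local endpoints listed above; the `resolveProviders` helper and `ProviderConfig` shape are hypothetical, not part of the AgentOS API.

```typescript
// Hypothetical sketch: determine available providers from env vars.
// Env var names match the list above; the helper itself is illustrative.
interface ProviderConfig {
  name: string;
  apiKey?: string;
  baseUrl?: string;
}

function resolveProviders(env: Record<string, string | undefined>): ProviderConfig[] {
  const keyed: Array<[string, string]> = [
    ['anthropic', 'ANTHROPIC_API_KEY'],
    ['openai', 'OPENAI_API_KEY'],
    ['google', 'GOOGLE_API_KEY'],
    ['deepseek', 'DEEPSEEK_API_KEY'],
  ];
  const configs: ProviderConfig[] = [];
  for (const [name, envVar] of keyed) {
    const key = env[envVar];
    if (key) configs.push({ name, apiKey: key }); // only keyed providers with a key set
  }
  // Local providers need no key; default endpoint from the comments above.
  configs.push({ name: 'ollama', baseUrl: 'http://localhost:11434' });
  return configs;
}
```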

CLI Usage

# List all models
agentos models list

# List providers
agentos models providers

# Describe a specific model
agentos models describe claude-sonnet-4-6

# View model aliases
agentos models aliases

Cost Tracking

All completions automatically track usage and costs:
// Cost data is stored in state::costs
const costs = await trigger('state::get', {
  scope: 'costs',
  key: '2026-03-09'
});

// Returns:
// {
//   "claude-sonnet-4-6": { cost: 0.45, calls: 23 },
//   "gpt-4o-mini": { cost: 0.12, calls: 67 },
//   "totalCost": 0.57
// }
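Per-model entries like the ones above can also be totaled client-side. A sketch that mirrors the record shape in the example, where `totalCost` sits alongside per-model entries; the `sumDailyCost` helper is illustrative, not an AgentOS function.

```typescript
// Hypothetical sketch: total a day's per-model cost entries.
// The record shape mirrors the example above.
type CostEntry = { cost: number; calls: number };
type DailyCosts = Record<string, CostEntry | number>;

function sumDailyCost(day: DailyCosts): number {
  let total = 0;
  for (const [key, value] of Object.entries(day)) {
    if (key === 'totalCost') continue; // skip the precomputed total
    if (typeof value === 'object') total += value.cost;
  }
  return Math.round(total * 100) / 100; // round to cents
}
```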

Response Format

All LLM completions return a standardized format:
interface CompletionResponse {
  content: string;              // Text response
  model: string;                // Model used
  toolCalls: Array<{            // Tool invocations (if any)
    callId: string;
    id: string;
    arguments: Record<string, any>;
  }>;
  usage: {
    input: number;              // Input tokens
    output: number;             // Output tokens
    total: number;              // Total tokens
  };
  durationMs: number;           // Request duration
}
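A caller's agent loop typically branches on whether `toolCalls` is empty. A minimal sketch against the interface above; the `nextAction` helper is illustrative, not part of the SDK.

```typescript
// The CompletionResponse shape from the interface above.
interface CompletionResponse {
  content: string;
  model: string;
  toolCalls: Array<{
    callId: string;
    id: string;
    arguments: Record<string, any>;
  }>;
  usage: { input: number; output: number; total: number };
  durationMs: number;
}

// Illustrative helper: decide whether the agent loop should execute
// the requested tools or surface the text answer to the user.
function nextAction(res: CompletionResponse): 'run-tools' | 'reply' {
  return res.toolCalls.length > 0 ? 'run-tools' : 'reply';
}
```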

Fallback & Retry

The router includes automatic retry logic:
  • 429 Rate Limit: Exponential backoff (1s, 2s, 4s)
  • Network Errors: 3 attempts before failing
  • Provider Outages: Manual fallback to alternate provider
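The retry policy above can be sketched as a generic wrapper. The 3-attempt cap and the 1s/2s/4s delay sequence come from the list; the `withRetry` helper itself is a hypothetical sketch, not the router's actual implementation.

```typescript
// Sketch of the retry policy described above: up to 3 attempts with
// exponential backoff starting at 1s (1s, 2s, 4s). Illustrative only.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        const delay = baseDelayMs * 2 ** i; // 1s, 2s, 4s
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

A production version would additionally inspect the error (retrying 429s and network failures, but not 4xx validation errors) before backing off.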

Next Steps

Browse Providers

View all 25 supported providers

Explore Models

See all 47 available models with pricing

Routing Logic

Learn about complexity-based model selection
