OpenRouter provides access to multiple LLM providers through a single API, with automatic fallback, cost tracking, and model routing.

Configuration

openRouterApiKey (string, required)
Your OpenRouter API key from the OpenRouter Dashboard.

openRouterModelId (string, required)
Model identifier in OpenRouter format. Examples:
  • anthropic/claude-4.5-sonnet
  • openai/gpt-4o
  • google/gemini-3.0-flash-thinking
  • deepseek/deepseek-reasoner
  • xai/grok-4

openRouterModelInfo (ModelInfo, optional)
Model metadata. OpenRouter fetches this automatically if not provided.

openRouterProviderSorting (string, optional)
Provider preference order. Examples:
  • "anthropic,openai" - prefer Anthropic, fall back to OpenAI
  • "google" - use only Google

reasoningEffort (string, optional)
For reasoning models. Options: low, medium, high, xhigh.

thinkingBudgetTokens (number, optional)
Token budget for extended thinking (model-dependent).
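
Putting these together, a fully populated options object might look like the sketch below (values are illustrative, not recommendations; OpenRouterHandler is imported as shown under Basic Setup):

const options = {
  openRouterApiKey: process.env.OPENROUTER_API_KEY, // required
  openRouterModelId: "anthropic/claude-4.5-sonnet", // required
  openRouterProviderSorting: "anthropic,openai", // optional provider preference
  reasoningEffort: "high", // optional, reasoning models only
  thinkingBudgetTokens: 10000 // optional, model-dependent
}

const handler = new OpenRouterHandler(options)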

Basic Setup

import { OpenRouterHandler } from "./providers/openrouter"

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-4.5-sonnet"
})
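
A minimal end-to-end call, assuming the createMessage(systemPrompt, messages, tools) signature used throughout this page; the exact message shape the handler expects is an assumption here:

const systemPrompt = "You are a helpful assistant."
const messages = [{ role: "user", content: "Hello!" }] // message shape is an assumption

for await (const chunk of handler.createMessage(systemPrompt, messages, [])) {
  if (chunk.type === "text") {
    process.stdout.write(chunk.text)
  }
}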

Provider Routing

Control which providers handle your requests:
const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-sonnet-4",
  openRouterProviderSorting: "anthropic,openrouter"
  // Try Anthropic first, then fall back to OpenRouter's default routing
})

Reasoning Models

OpenRouter supports various reasoning models:

DeepSeek Reasoner

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "deepseek/deepseek-reasoner"
})

for await (const chunk of handler.createMessage(
  systemPrompt,
  messages,
  tools
)) {
  if (chunk.type === "reasoning") {
    console.log("Reasoning:", chunk.reasoning)
    
    // OpenRouter preserves reasoning_details for the model
    if (chunk.details) {
      console.log("Details:", chunk.details)
    }
  }
}

Gemini with Thinking

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "google/gemini-3.0-flash-thinking",
  reasoningEffort: "high", // or "medium"
  thinkingBudgetTokens: 15000
})

Anthropic Extended Thinking

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-sonnet-4",
  thinkingBudgetTokens: 10000
})

Reasoning Details Preservation

OpenRouter passes through reasoning_details so that reasoning traces can be preserved across requests:
for await (const chunk of handler.createMessage(...)) {
  if (chunk.type === "reasoning" && chunk.details) {
    // These details can be sent back to the API
    // to preserve reasoning context
    console.log("Reasoning details:", chunk.details)
  }
}
From openrouter.ts:133-146:

Reasoning block preservation: OpenRouter returns reasoning_details in delta responses that should be preserved and sent back in subsequent API requests to maintain reasoning traces. See the OpenRouter docs.
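
A hedged sketch of what that round trip could look like in application code; the reasoning_details field on the outgoing assistant message is an assumption, not the handler's confirmed API:

const reasoningDetails: unknown[] = []
let assistantText = ""

for await (const chunk of handler.createMessage(systemPrompt, messages, tools)) {
  if (chunk.type === "reasoning" && chunk.details) {
    reasoningDetails.push(chunk.details) // accumulate traces as they stream
  }
  if (chunk.type === "text") {
    assistantText += chunk.text
  }
}

// Hypothetical: attach the collected details to the next assistant message
// so the provider can reconstruct the reasoning context.
messages.push({
  role: "assistant",
  content: assistantText,
  reasoning_details: reasoningDetails // field name is an assumption
})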

Tool Calling

import type { ChatCompletionTool } from "openai/resources/chat/completions"

const tools: ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current weather",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "City name"
          }
        },
        required: ["location"]
      }
    }
  }
]

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-4.5-sonnet"
})

for await (const chunk of handler.createMessage(
  systemPrompt,
  messages,
  tools
)) {
  if (chunk.type === "tool_calls") {
    console.log("Tool:", chunk.tool_call.function.name)
    console.log("Args:", chunk.tool_call.function.arguments)
  }
}
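
The arguments field arrives as a JSON string, so the usual next step is to parse it and dispatch to your own implementation. A minimal sketch inside the streaming loop, where getWeather is a hypothetical stand-in for your tool logic:

if (chunk.type === "tool_calls") {
  const { name, arguments: rawArgs } = chunk.tool_call.function

  if (name === "get_weather") {
    const args = JSON.parse(rawArgs) as { location: string }
    const result = await getWeather(args.location) // hypothetical helper
    console.log("Weather:", result)
  }
}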

Cost Tracking

OpenRouter provides detailed cost information:
for await (const chunk of handler.createMessage(...)) {
  if (chunk.type === "usage") {
    console.log("Input tokens:", chunk.inputTokens)
    console.log("Output tokens:", chunk.outputTokens)
    console.log("Cached tokens:", chunk.cacheReadTokens)
    console.log("Total cost:", chunk.totalCost) // In USD
  }
}
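
Because the usage chunk arrives at the end of the stream, cost checks are necessarily after the fact. A small sketch with a hypothetical per-request budget:

const BUDGET_USD = 0.5 // hypothetical per-request ceiling

for await (const chunk of handler.createMessage(systemPrompt, messages, tools)) {
  if (chunk.type === "usage" && chunk.totalCost > BUDGET_USD) {
    console.warn(`Cost $${chunk.totalCost.toFixed(4)} exceeded budget of $${BUDGET_USD}`)
  }
}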

Generation Endpoint Fallback

If usage data isn’t in the stream, the handler fetches it from the generation endpoint:
// From openrouter.ts:163-168
if (!didOutputUsage) {
  const apiStreamUsage = await handler.getApiStreamUsage()
  if (apiStreamUsage) {
    console.log("Cost from generation endpoint:", apiStreamUsage.totalCost)
  }
}

Error Handling

OpenRouter returns errors in two ways:

Direct Error Response

// From openrouter.ts:78-85
if ("error" in chunk) {
  const error = chunk.error
  console.error(`OpenRouter Error ${error.code}: ${error.message}`)
  if (error.metadata) {
    console.error("Metadata:", error.metadata)
  }
}

Mid-Stream Error

// From openrouter.ts:91-105
if (choice?.finish_reason === "error") {
  const error = choice.error
  throw new Error(`OpenRouter Mid-Stream Error: ${error}`)
}

Error Types

type OpenRouterErrorResponse = {
  error: {
    message: string
    code: number
    metadata?: {
      provider_name?: string
      raw?: unknown
      reasons?: string[] // For moderation errors
      flagged_input?: string
    }
  }
}
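
A small type guard built on this shape keeps stream-handling code tidy; a sketch, assuming chunks are otherwise untyped:

function isOpenRouterError(chunk: unknown): chunk is OpenRouterErrorResponse {
  return typeof chunk === "object" && chunk !== null && "error" in chunk
}

if (isOpenRouterError(chunk)) {
  console.error(`OpenRouter Error ${chunk.error.code}: ${chunk.error.message}`)
}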

Streaming Response

Text Chunks

{
  type: "text",
  text: string
}

Reasoning Chunks

{
  type: "reasoning",
  reasoning: string,
  details?: any // Reasoning details to preserve
}

Tool Calls

{
  type: "tool_calls",
  tool_call: {
    function: {
      name: string,
      arguments: string
    }
  }
}

Usage Data

{
  type: "usage",
  inputTokens: number,
  outputTokens: number,
  cacheReadTokens: number,
  cacheWriteTokens: number, // always 0 for OpenRouter
  totalCost: number // USD
}
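
All four chunk types can be handled in a single consumer loop. A sketch based on the shapes above:

for await (const chunk of handler.createMessage(systemPrompt, messages, tools)) {
  switch (chunk.type) {
    case "text":
      process.stdout.write(chunk.text)
      break
    case "reasoning":
      console.log("[reasoning]", chunk.reasoning)
      break
    case "tool_calls":
      console.log("[tool]", chunk.tool_call.function.name)
      break
    case "usage":
      console.log(`[usage] $${chunk.totalCost} (${chunk.inputTokens} in, ${chunk.outputTokens} out)`)
      break
  }
}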

Model Information

OpenRouter caches model info:
const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-4.5-sonnet"
})

const { id, info } = handler.getModel()

console.log("Model ID:", id)
console.log("Context window:", info.contextWindow)
console.log("Max tokens:", info.maxTokens)
console.log("Input price:", info.inputPrice)
console.log("Output price:", info.outputPrice)
Model info is cached in StateManager to avoid repeated API calls.

Skipping Reasoning for Specific Models

Some models (like Grok 4) output placeholder reasoning:
// From openrouter.ts:125
if (delta.reasoning && !shouldSkipReasoningForModel(modelId)) {
  yield {
    type: "reasoning",
    reasoning: delta.reasoning
  }
}
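
The predicate itself lives in the source; a hypothetical sketch of what such a check might look like (the model list here is an assumption):

const PLACEHOLDER_REASONING_MODELS = ["xai/grok-4"] // assumption: models known to emit placeholder reasoning

function shouldSkipReasoningForModel(modelId?: string): boolean {
  return modelId !== undefined && PLACEHOLDER_REASONING_MODELS.includes(modelId)
}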

Advanced Usage

Multiple Provider Fallback

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-4.5-sonnet",
  openRouterProviderSorting: "anthropic,openai,google"
  // Try Anthropic -> OpenAI -> Google
})

Custom Model Info

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "custom/model",
  openRouterModelInfo: {
    maxTokens: 16384,
    contextWindow: 200000,
    supportsPromptCache: true,
    inputPrice: 3.0,
    outputPrice: 15.0
  }
})

Retry Configuration

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-4.5-sonnet",
  onRetryAttempt: (error, attempt, maxRetries) => {
    console.log(`Retry ${attempt}/${maxRetries}:`, error.message)
  }
})

Generation Details API

Access detailed generation info:
const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-4.5-sonnet"
})

// Stream a message
for await (const chunk of handler.createMessage(...)) {
  // ...
}

// Get detailed usage after streaming
const usage = await handler.getApiStreamUsage()
if (usage) {
  console.log("Native prompt tokens:", usage.inputTokens)
  console.log("Native completion tokens:", usage.outputTokens)
  console.log("Cached tokens:", usage.cacheReadTokens)
  console.log("Total cost:", usage.totalCost)
}
From openrouter.ts:171-193, the generation endpoint:
  • Returns native token counts
  • Includes cache statistics
  • Provides exact cost calculations
  • Has built-in retry logic (4 attempts)

Implementation Reference

Source: ~/workspace/source/src/core/api/providers/openrouter.ts

Key features:
  • Unified access to 100+ models
  • Automatic provider fallback
  • Cost tracking with generation endpoint
  • Reasoning details preservation
  • Error handling for stream and response
  • Model info caching

Common Patterns

Cost Monitoring

let totalCost = 0

for await (const chunk of handler.createMessage(...)) {
  if (chunk.type === "usage" && chunk.totalCost) {
    totalCost += chunk.totalCost
  }
}

console.log(`Total cost: $${totalCost.toFixed(4)}`)

Provider Preference

// Always use official Anthropic
const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: "anthropic/claude-4.5-sonnet",
  openRouterProviderSorting: "anthropic"
})

Multi-Model Strategy

const models = {
  planning: "openai/o3-mini",
  coding: "anthropic/claude-4.5-sonnet",
  fast: "google/gemini-3.0-flash-thinking"
}

const handler = new OpenRouterHandler({
  openRouterApiKey: process.env.OPENROUTER_API_KEY,
  openRouterModelId: models.coding
})
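
Since each handler is bound to a single model, a task-based strategy usually means constructing one handler per task. A sketch of a small factory (handlerFor is a hypothetical helper):

function handlerFor(task: keyof typeof models): OpenRouterHandler {
  return new OpenRouterHandler({
    openRouterApiKey: process.env.OPENROUTER_API_KEY,
    openRouterModelId: models[task]
  })
}

const planner = handlerFor("planning")
const coder = handlerFor("coding")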

Next Steps

  • Anthropic Provider: direct Anthropic integration
  • OpenAI Provider: direct OpenAI integration
  • Provider Overview: view all providers
  • Custom Provider: build your own
