SimpleClaw supports multiple LLM providers (Anthropic, OpenAI, Google, Bedrock, etc.) with flexible model configuration, automatic fallbacks, and provider-specific compatibility settings.
## Model Providers

Providers are configured in `~/.simpleclaw/config.yaml`:

```yaml
models:
  mode: merge # or "replace"
  providers:
    anthropic:
      baseUrl: https://api.anthropic.com/v1
      apiKey: ${ANTHROPIC_API_KEY}
      models:
        - id: claude-3-7-sonnet-20250219
          name: Claude 3.7 Sonnet
          reasoning: false
          input: [text, image]
          contextWindow: 200000
          maxTokens: 8192
          cost:
            input: 3.00 # per million tokens
            output: 15.00
            cacheRead: 0.30
            cacheWrite: 3.75
    openai:
      baseUrl: https://api.openai.com/v1
      apiKey: ${OPENAI_API_KEY}
      models:
        - id: gpt-4o
          name: GPT-4 Omni
          reasoning: false
          input: [text, image]
          contextWindow: 128000
          maxTokens: 16384
          cost:
            input: 2.50
            output: 10.00
            cacheRead: 1.25
            cacheWrite: 5.00
```
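The `${VAR}` placeholders above suggest environment-variable interpolation in config values. A minimal sketch of how such substitution could work (the `expandEnvVars` helper is illustrative, not SimpleClaw's actual API):

```typescript
// Replace ${VAR} placeholders in a config value with entries from an
// environment map; unknown variables are left untouched.
function expandEnvVars(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(/\$\{([A-Z0-9_]+)\}/g, (match, name) => env[name] ?? match);
}

const env = { ANTHROPIC_API_KEY: "sk-ant-123" };
console.log(expandEnvVars("${ANTHROPIC_API_KEY}", env)); // "sk-ant-123"
console.log(expandEnvVars("${MISSING_KEY}", env));       // "${MISSING_KEY}"
```

Leaving unknown placeholders intact (rather than substituting an empty string) makes misconfigured keys easier to spot in error messages.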
## Model Configuration

### Model Definition

```typescript
// src/config/types.models.ts
export type ModelDefinitionConfig = {
  id: string;                       // Unique model identifier
  name: string;                     // Display name
  api?: ModelApi;                   // API type override
  reasoning: boolean;               // Supports reasoning/thinking
  input: Array<"text" | "image">;   // Input modalities
  cost: {
    input: number;                  // Cost per million input tokens
    output: number;                 // Cost per million output tokens
    cacheRead: number;              // Cache read cost
    cacheWrite: number;             // Cache write cost
  };
  contextWindow: number;            // Maximum context length
  maxTokens: number;                // Maximum completion tokens
  headers?: Record<string, string>; // Custom HTTP headers
  compat?: ModelCompatConfig;       // Compatibility settings
};
```
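As an illustration of the type, here is a definition matching the Claude entry from the config above, plus a hypothetical sanity check (the `validateModelDefinition` helper is a sketch, not part of SimpleClaw):

```typescript
// Simplified copy of the required fields of ModelDefinitionConfig.
type ModelDefinitionConfig = {
  id: string;
  name: string;
  reasoning: boolean;
  input: Array<"text" | "image">;
  cost: { input: number; output: number; cacheRead: number; cacheWrite: number };
  contextWindow: number;
  maxTokens: number;
};

// Hypothetical sanity check: the completion limit must fit inside the
// context window, costs must be non-negative, and at least one input
// modality must be declared.
function validateModelDefinition(m: ModelDefinitionConfig): string[] {
  const errors: string[] = [];
  if (m.maxTokens > m.contextWindow) errors.push("maxTokens exceeds contextWindow");
  if (Object.values(m.cost).some((c) => c < 0)) errors.push("negative cost");
  if (m.input.length === 0) errors.push("at least one input modality required");
  return errors;
}

const sonnet: ModelDefinitionConfig = {
  id: "claude-3-7-sonnet-20250219",
  name: "Claude 3.7 Sonnet",
  reasoning: false,
  input: ["text", "image"],
  cost: { input: 3.0, output: 15.0, cacheRead: 0.3, cacheWrite: 3.75 },
  contextWindow: 200000,
  maxTokens: 8192,
};
console.log(validateModelDefinition(sonnet)); // []
```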
### Supported APIs

```typescript
export type ModelApi =
  | "openai-completions"      // Legacy OpenAI completions
  | "openai-responses"        // OpenAI Responses API
  | "anthropic-messages"      // Anthropic Messages API
  | "google-generative-ai"    // Google Gemini
  | "github-copilot"          // GitHub Copilot
  | "bedrock-converse-stream" // AWS Bedrock
  | "ollama";                 // Ollama local models
```
## Provider Authentication

### API Keys (Default)

Most providers use API key authentication:

```yaml
models:
  providers:
    anthropic:
      apiKey: ${ANTHROPIC_API_KEY}
      auth: api-key # default
```

### AWS SDK (Bedrock)

Bedrock uses AWS credentials:

```yaml
models:
  providers:
    bedrock:
      baseUrl: https://bedrock-runtime.us-east-1.amazonaws.com
      auth: aws-sdk
  bedrockDiscovery:
    enabled: true
    region: us-east-1
    providerFilter:
      - anthropic
      - meta
```

### OAuth (Future)

OAuth-based authentication:

```yaml
models:
  providers:
    custom-provider:
      auth: oauth
      oauthConfig:
        clientId: ${CLIENT_ID}
        clientSecret: ${CLIENT_SECRET}
        tokenUrl: https://auth.example.com/token
```
## Model Compatibility

Different models support different features:

```yaml
models:
  providers:
    anthropic:
      models:
        - id: claude-3-7-sonnet-20250219
          compat:
            supportsStore: true            # Supports prompt caching
            supportsDeveloperRole: true    # Supports "developer" role
            supportsReasoningEffort: false # No reasoning effort control
            supportsUsageInStreaming: true # Returns usage in streams
            supportsStrictMode: false      # No strict schema mode
            maxTokensField: max_tokens     # Token limit field name
            thinkingFormat: openai         # Thinking block format
```
### Compatibility Flags

- `supportsStore`: Model supports prompt caching (Anthropic, OpenAI with caching)
- `supportsDeveloperRole`: Model supports the "developer" role for system instructions
- `supportsReasoningEffort`: Model supports reasoning effort control (OpenAI o1, o3)
- `maxTokensField`: Field name for max tokens: `max_tokens` or `max_completion_tokens`
- `thinkingFormat`: Thinking block format: `openai`, `zai`, or `qwen`
- Tool results must include the tool name (some OpenAI-compatible providers)
- `requiresAssistantAfterToolResult`: Assistant message required after tool results (some providers)
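As a sketch of how such flags might shape outgoing requests, the fragment below picks the token-limit field name and system-role name from compat settings (the `buildRequestFragment` helper is hypothetical, not SimpleClaw's actual request builder):

```typescript
type Compat = {
  maxTokensField: "max_tokens" | "max_completion_tokens";
  supportsDeveloperRole: boolean;
};

// Hypothetical: choose the right token-limit field and system-role
// name based on a model's compat settings.
function buildRequestFragment(
  compat: Compat,
  maxTokens: number,
  system: string,
): Record<string, unknown> {
  const systemRole = compat.supportsDeveloperRole ? "developer" : "system";
  return {
    [compat.maxTokensField]: maxTokens,
    messages: [{ role: systemRole, content: system }],
  };
}

const frag = buildRequestFragment(
  { maxTokensField: "max_completion_tokens", supportsDeveloperRole: true },
  8192,
  "You are a helpful assistant.",
);
// frag["max_completion_tokens"] === 8192; the system message uses role "developer"
```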
## Model Selection

### Agent Model Configuration

Agents can specify a primary model plus fallbacks:

```yaml
agents:
  list:
    - id: main
      model:
        primary: claude-3-7-sonnet-20250219
        fallbacks:
          - gpt-4o
          - claude-3-5-sonnet-20241022
```

```typescript
// src/config/zod-schema.agent-model.ts
export const AgentModelSchema = z.union([
  z.string(), // Simple: just a model ID
  z
    .object({
      primary: z.string().optional(),
      fallbacks: z.array(z.string()).optional(),
    })
    .strict(),
]);
```
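Since the schema accepts either a bare model ID or an object, downstream code presumably normalizes both shapes into one form. A sketch (the `normalizeAgentModel` name is illustrative):

```typescript
type AgentModel = string | { primary?: string; fallbacks?: string[] };

// Normalize both accepted shapes into { primary, fallbacks }.
function normalizeAgentModel(
  model: AgentModel,
): { primary?: string; fallbacks: string[] } {
  if (typeof model === "string") return { primary: model, fallbacks: [] };
  return { primary: model.primary, fallbacks: model.fallbacks ?? [] };
}

console.log(normalizeAgentModel("gpt-4o"));
// primary: "gpt-4o", fallbacks: []
console.log(normalizeAgentModel({ primary: "claude-3-7-sonnet-20250219", fallbacks: ["gpt-4o"] }));
// primary: "claude-3-7-sonnet-20250219", fallbacks: ["gpt-4o"]
```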
### Model Fallback Logic

Fallback happens automatically on:

- **429 Rate Limit**: provider rate limit exceeded
- **401 Unauthorized**: invalid or expired API key
- **503 Service Unavailable**: provider outage
- **Model Not Found**: model no longer available

```typescript
// Fallback resolution (simplified)
const resolvedModel = await tryModels([
  config.model.primary,
  ...config.model.fallbacks,
]);
```
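A runnable sketch of this loop, assuming errors carry an HTTP-style status code (the `ProviderError` class and the `tryModels` signature here are illustrative, not SimpleClaw's actual implementation):

```typescript
// Illustrative error type carrying an HTTP-style status code.
class ProviderError extends Error {
  constructor(public status: number) {
    super(`provider error ${status}`);
  }
}

// Try each model ID in order; fall through to the next one on
// retryable provider errors (auth failure, missing model, rate
// limit, outage); rethrow anything else immediately.
async function tryModels<T>(
  modelIds: string[],
  invoke: (modelId: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown;
  for (const id of modelIds) {
    try {
      return await invoke(id);
    } catch (err) {
      lastError = err;
      const retryable =
        err instanceof ProviderError && [401, 404, 429, 503].includes(err.status);
      if (!retryable) throw err;
    }
  }
  throw lastError;
}

// Example: the primary is rate limited, so the fallback answers.
(async () => {
  const result = await tryModels(["claude-3-7-sonnet-20250219", "gpt-4o"], async (id) => {
    if (id.startsWith("claude")) throw new ProviderError(429);
    return `answered by ${id}`;
  });
  console.log(result); // "answered by gpt-4o"
})();
```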
## Bedrock Discovery

Automatic AWS Bedrock model discovery:

```yaml
models:
  bedrockDiscovery:
    enabled: true
    region: us-east-1
    providerFilter:
      - anthropic # Only Anthropic models
      - meta      # And Meta models
    refreshInterval: 3600 # Refresh every hour (seconds)
    defaultContextWindow: 200000
    defaultMaxTokens: 8192
```

Bedrock discovery automatically:

- Lists available foundation models in the region
- Filters by provider (if specified)
- Creates model definitions with default settings
- Refreshes periodically to detect new models
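The filtering step above can be sketched as a pure function over discovered model summaries (the `FoundationModelSummary` shape here is an assumption for illustration, not the AWS SDK type):

```typescript
type FoundationModelSummary = { modelId: string; providerName: string };

// Keep only models whose provider matches the configured filter
// (case-insensitive); an empty filter keeps everything.
function filterByProvider(
  models: FoundationModelSummary[],
  providerFilter: string[],
): FoundationModelSummary[] {
  if (providerFilter.length === 0) return models;
  const allowed = new Set(providerFilter.map((p) => p.toLowerCase()));
  return models.filter((m) => allowed.has(m.providerName.toLowerCase()));
}

const discovered = [
  { modelId: "anthropic.claude-3-7-sonnet", providerName: "Anthropic" },
  { modelId: "meta.llama3-70b", providerName: "Meta" },
  { modelId: "amazon.titan-text", providerName: "Amazon" },
];
console.log(filterByProvider(discovered, ["anthropic", "meta"]).length); // 2
```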
## Custom Headers

Add custom headers to provider requests:

```yaml
models:
  providers:
    custom:
      baseUrl: https://api.custom.com/v1
      apiKey: ${CUSTOM_API_KEY}
      headers:
        X-Custom-Header: custom-value
        X-Request-ID: ${REQUEST_ID}
      models:
        - id: custom-model
          headers:
            X-Model-Specific: only-for-this-model
```
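Provider-level and model-level headers presumably combine, with model-level entries taking precedence on conflict; that merge order is an assumption, and the `mergeHeaders` helper below is a sketch rather than SimpleClaw's actual code:

```typescript
// Merge provider-wide headers with model-specific ones;
// model-level entries win on key conflicts.
function mergeHeaders(
  providerHeaders: Record<string, string> = {},
  modelHeaders: Record<string, string> = {},
): Record<string, string> {
  return { ...providerHeaders, ...modelHeaders };
}

const headers = mergeHeaders(
  { "X-Custom-Header": "custom-value" },
  { "X-Model-Specific": "only-for-this-model" },
);
// Both X-Custom-Header and X-Model-Specific are present in the result.
```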
## Model Catalog

The gateway builds a unified model catalog from all providers:

```typescript
// src/gateway/server-model-catalog.ts
export async function loadGatewayModelCatalog(
  cfg: SimpleClawConfig,
): Promise<ModelCatalog> {
  const providers = Object.entries(cfg.models?.providers ?? {});
  const catalog: ModelCatalog = {};
  for (const [providerId, config] of providers) {
    for (const model of config.models) {
      catalog[model.id] = {
        ...model,
        providerId,
        baseUrl: config.baseUrl,
        auth: config.auth ?? "api-key",
      };
    }
  }
  return catalog;
}
```
## Cost Tracking

Model costs are tracked for usage analysis:

```typescript
// Calculate request cost
const inputCost = (inputTokens / 1_000_000) * model.cost.input;
const outputCost = (outputTokens / 1_000_000) * model.cost.output;
const cacheReadCost = (cacheReadTokens / 1_000_000) * model.cost.cacheRead;
const totalCost = inputCost + outputCost + cacheReadCost;
```
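The same calculation wrapped into a self-contained function (the `requestCost` name is illustrative; as in the snippet above, cache-write cost is not included in the total):

```typescript
type ModelCost = { input: number; output: number; cacheRead: number; cacheWrite: number };

// Cost of one request, with prices expressed per million tokens.
function requestCost(
  cost: ModelCost,
  inputTokens: number,
  outputTokens: number,
  cacheReadTokens: number,
): number {
  const inputCost = (inputTokens / 1_000_000) * cost.input;
  const outputCost = (outputTokens / 1_000_000) * cost.output;
  const cacheReadCost = (cacheReadTokens / 1_000_000) * cost.cacheRead;
  return inputCost + outputCost + cacheReadCost;
}

// 10k input + 2k output tokens at Claude 3.7 Sonnet's listed prices:
const cost = requestCost(
  { input: 3.0, output: 15.0, cacheRead: 0.3, cacheWrite: 3.75 },
  10_000,
  2_000,
  0,
);
console.log(cost); // ≈ 0.06 (0.03 input + 0.03 output)
```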
View usage:

```bash
# Show cost breakdown
simpleclaw usage --breakdown

# Filter by agent
simpleclaw usage --agent main

# Date range
simpleclaw usage --from 2026-03-01 --to 2026-03-31
```
## Configuration Modes

### Merge Mode (Default)

Merge custom providers with the built-in defaults:

```yaml
models:
  mode: merge # Combine with built-in providers
  providers:
    openai:
      models:
        - id: gpt-4o # Overrides built-in definition
```

### Replace Mode

Replace all built-in providers:

```yaml
models:
  mode: replace # Only use custom providers
  providers:
    custom:
      baseUrl: https://api.custom.com/v1
      # ... custom provider config only
```
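A sketch of how the two modes could combine built-in and user-supplied providers (simplified to whole-provider replacement; whether SimpleClaw deep-merges individual model lists is not shown here):

```typescript
type ProviderMap = Record<string, { baseUrl?: string }>;

// "merge": user providers override built-ins key by key;
// "replace": only user providers are used.
function resolveProviders(
  mode: "merge" | "replace",
  builtIn: ProviderMap,
  user: ProviderMap,
): ProviderMap {
  return mode === "replace" ? { ...user } : { ...builtIn, ...user };
}

const builtIn: ProviderMap = {
  anthropic: { baseUrl: "https://api.anthropic.com/v1" },
  openai: { baseUrl: "https://api.openai.com/v1" },
};
const user: ProviderMap = {
  openai: { baseUrl: "https://proxy.example.com/v1" },
};

console.log(Object.keys(resolveProviders("merge", builtIn, user)));   // anthropic and openai
console.log(Object.keys(resolveProviders("replace", builtIn, user))); // only openai
```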
## Local Models (Ollama)

Run local models via Ollama:

```yaml
models:
  providers:
    ollama:
      baseUrl: http://localhost:11434
      api: ollama
      models:
        - id: llama3:70b
          name: Llama 3 70B
          reasoning: false
          input: [text]
          contextWindow: 8192
          maxTokens: 4096
          cost:
            input: 0.0
            output: 0.0
            cacheRead: 0.0
            cacheWrite: 0.0
```

Local models have zero cost but require running Ollama locally, which makes them a good fit for development and privacy-sensitive use cases.
## Model Override Hierarchy

Model selection follows this priority order:

1. **Session Override**: per-session model setting
2. **Agent Model**: the agent's configured model
3. **Channel Default**: channel-specific model preference
4. **Global Default**: fallback to the first available model

```yaml
# Channel-specific models
channels:
  modelByChannel:
    discord:
      default: claude-3-7-sonnet-20250219
    telegram:
      default: gpt-4o-mini
```
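The priority order can be sketched as a first-defined-wins chain (function and argument names here are illustrative, not SimpleClaw's actual API):

```typescript
// First defined value wins: session > agent > channel > global.
function resolveModel(opts: {
  sessionOverride?: string;
  agentModel?: string;
  channelDefault?: string;
  globalDefault: string;
}): string {
  return (
    opts.sessionOverride ??
    opts.agentModel ??
    opts.channelDefault ??
    opts.globalDefault
  );
}

console.log(
  resolveModel({
    agentModel: "claude-3-7-sonnet-20250219",
    channelDefault: "gpt-4o-mini",
    globalDefault: "gpt-4o",
  }),
); // "claude-3-7-sonnet-20250219"
```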
## Best Practices

- **Use fallbacks**: always configure fallback models to handle provider outages.
- **Cost awareness**: use expensive models (GPT-4, Claude Opus) only when necessary.
- **Environment variables**: store API keys in environment variables; never hardcode them.
- **Model-specific compat**: set compatibility flags correctly to avoid runtime errors.

Changing the model configuration requires a gateway reload: `simpleclaw gateway reload`

## See Also

- **Agents**: agent model configuration
- **Sessions**: session-level model overrides
- **Gateway**: model catalog loading