
Overview

Providers are the backbone of switchAILocal, enabling access to various AI services through a unified interface. Each provider type has unique characteristics and authentication requirements.

Provider Types

switchAILocal supports three distinct categories of providers:

CLI Providers

Local command-line tools running on your machine

Cloud Providers

Remote API services accessed via HTTP

OAuth Providers

Services requiring OAuth2 authentication flows

CLI Providers

CLI providers execute locally-installed AI tools and expose them through the proxy.

Ollama

Local model server for running open-source LLMs.
ollama:
  enabled: true
  base-url: "http://localhost:11434"
  auto-discover: true  # Automatically fetch available models
The OllamaExecutor (internal/runtime/executor/ollama_executor.go) translates OpenAI format to Ollama’s native API:
type OllamaExecutor struct {
    cfg     *config.Config
    baseURL string
    client  *http.Client
}
Key features:
  • Auto-discovery: Queries /api/tags for available models
  • Vision support: Handles base64-encoded images
  • Streaming: Real-time response chunks
  • No authentication: Direct HTTP access
Example models:
  • ollama:llama3.2
  • ollama:mistral
  • ollama:qwen3-vl:235b-instruct-cloud
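The translation step can be sketched as follows. The struct shapes are simplified for illustration (text-only, non-streaming), and the `toOllama` helper is hypothetical; conveniently, Ollama's /api/chat endpoint accepts the same role/content message objects as OpenAI:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Message is the role/content shape shared by both APIs.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// OpenAIRequest is a minimal subset of an OpenAI chat completion request.
type OpenAIRequest struct {
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`
	Stream   bool      `json:"stream"`
}

// OllamaChatRequest is a minimal subset of Ollama's /api/chat payload.
type OllamaChatRequest struct {
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`
	Stream   bool      `json:"stream"`
}

// toOllama strips the "ollama:" routing prefix and copies the fields over.
func toOllama(req OpenAIRequest) OllamaChatRequest {
	return OllamaChatRequest{
		Model:    strings.TrimPrefix(req.Model, "ollama:"),
		Messages: req.Messages,
		Stream:   req.Stream,
	}
}

func main() {
	in := OpenAIRequest{Model: "ollama:llama3.2", Messages: []Message{{Role: "user", Content: "Hello"}}}
	b, _ := json.Marshal(toOllama(in))
	fmt.Println(string(b))
}
```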

OpenCode

Local AI coding agent for development tasks.
opencode:
  enabled: true
  base-url: "http://localhost:4096"
  default-agent: "build"
OpenCode agents are specialized for different tasks: build, debug, test, refactor.

LM Studio

GUI application for running local models with OpenAI-compatible API.
lmstudio:
  enabled: true
  base-url: "http://localhost:1234/v1"
  auto-discover: true

Gemini CLI

Access Google’s Gemini via OAuth credentials without API keys.
# Use OAuth login instead of API keys
# Run: switchAILocal -login
The GeminiCLIExecutor uses OAuth2 with device flow:
type GeminiCLIExecutor struct {
    cfg *config.Config
}

func prepareGeminiCLITokenSource(ctx context.Context, cfg *config.Config, auth *Auth) (TokenSource, error) {
    // Exchange the stored refresh token for an access token
    // Endpoint: https://cloudcode-pa.googleapis.com
}
Scopes required:
  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/userinfo.email

Cloud Providers

Cloud providers access remote APIs using API keys or OAuth tokens.

Gemini API

Google’s Gemini models via REST API.
gemini-api-key:
  - api-key: "AIzaSy..."
    prefix: "google"
    base-url: "https://generativelanguage.googleapis.com"
Supported endpoints:
  • /v1beta/models/{model}:generateContent
  • /v1beta/models/{model}:streamGenerateContent
  • /v1beta/models/{model}:countTokens
Use the prefix to disambiguate when multiple Gemini providers are configured; with the prefix above, request models as google/gemini-2.0-flash.
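As a sketch, the request URL for these endpoints is built by joining the configured base-url with the model path; the `geminiEndpoint` helper below is illustrative, not part of the codebase:

```go
package main

import "fmt"

// geminiEndpoint joins the configured base URL with a model action path,
// e.g. "generateContent", "streamGenerateContent", or "countTokens".
func geminiEndpoint(baseURL, model, action string) string {
	return fmt.Sprintf("%s/v1beta/models/%s:%s", baseURL, model, action)
}

func main() {
	fmt.Println(geminiEndpoint("https://generativelanguage.googleapis.com", "gemini-2.0-flash", "generateContent"))
}
```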

Claude (Anthropic)

Anthropic’s Claude models.
claude-api-key:
  - api-key: "sk-ant-..."
    models:
      - name: "claude-3-5-sonnet-20241022"
        alias: "sonnet"
      - name: "claude-3-opus-20240229"
        alias: "opus"
The ClaudeExecutor translates between OpenAI and Claude formats:
// OpenAI format
{
  "model": "gpt-4",
  "messages": [{"role": "user", "content": "Hello"}]
}

// Claude format (translated)
{
  "model": "claude-3-5-sonnet-20241022",
  "messages": [{"role": "user", "content": "Hello"}],
  "max_tokens": 4096
}
Note: Claude requires explicit max_tokens parameter.
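A minimal sketch of that defaulting step, assuming the translator falls back to the 4096 shown above when the client omits max_tokens (the `withMaxTokens` helper is illustrative):

```go
package main

import "fmt"

// defaultMaxTokens mirrors the 4096 fallback shown in the translated example.
const defaultMaxTokens = 4096

// ClaudeRequest is a minimal subset of the Claude Messages API payload.
type ClaudeRequest struct {
	Model     string `json:"model"`
	MaxTokens int    `json:"max_tokens"`
}

// withMaxTokens fills in max_tokens when the OpenAI-style request omits it,
// since the Claude API rejects requests without an explicit value.
func withMaxTokens(req ClaudeRequest, requested int) ClaudeRequest {
	if requested > 0 {
		req.MaxTokens = requested
	} else {
		req.MaxTokens = defaultMaxTokens
	}
	return req
}

func main() {
	r := withMaxTokens(ClaudeRequest{Model: "claude-3-5-sonnet-20241022"}, 0)
	fmt.Println(r.MaxTokens)
}
```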

OpenAI / Codex

OpenAI’s GPT models.
codex-api-key:
  - api-key: "sk-..."
    base-url: "https://api.openai.com/v1"
Available models:
  • GPT-4 series: gpt-4, gpt-4-turbo, gpt-4o
  • GPT-3.5 series: gpt-3.5-turbo
  • O-series: o1, o1-mini, o1-preview

SwitchAI Cloud

Unified access to 100+ cloud models through a single API key.
switchai-api-key:
  - api-key: "sk-lf-..."
    base-url: "https://switchai.traylinx.com/v1"
    models:
      - name: "openai/gpt-oss-120b"
        alias: "switchai-fast"
      - name: "deepseek-reasoner"
        alias: "switchai-reasoner"
SwitchAI provides access to models from OpenAI, Anthropic, Google, DeepSeek, and more through a single endpoint.

OpenAI-Compatible Providers

Many providers offer OpenAI-compatible endpoints that can be configured using openai-compatibility.
openai-compatibility:
  - name: "groq"
    prefix: "groq"
    base-url: "https://api.groq.com/openai/v1"
    api-key-entries:
      - api-key: "gsk_..."
  
  - name: "openrouter"
    prefix: "or"
    base-url: "https://openrouter.ai/api/v1"
    api-key-entries:
      - api-key: "sk-or-v1-..."
  
  - name: "together"
    prefix: "together"
    base-url: "https://api.together.xyz/v1"
    api-key-entries:
      - api-key: "..."
The OpenAICompatExecutor is a generic executor for OpenAI-compatible APIs:
type OpenAICompatExecutor struct {
    provider string  // e.g., "groq", "openrouter"
    cfg      *config.Config
}

func (e *OpenAICompatExecutor) Identifier() string {
    return e.provider
}
Supports:
  • Chat completions: /chat/completions
  • Image generation: /images/generations
  • Audio transcription: /audio/transcriptions
  • Audio speech: /audio/speech
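A sketch of how a generic executor might form the request URL for any of these endpoints, tolerating trailing slashes in the configured base-url (the `joinEndpoint` helper is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// joinEndpoint concatenates a provider base URL and an OpenAI-style path,
// normalizing slashes so "base/" + "/path" does not double up.
func joinEndpoint(baseURL, path string) string {
	return strings.TrimRight(baseURL, "/") + "/" + strings.TrimLeft(path, "/")
}

func main() {
	fmt.Println(joinEndpoint("https://api.groq.com/openai/v1/", "/chat/completions"))
}
```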

Supported Services

Groq

Ultra-fast inference with LPU technology.
Models: Llama, Mixtral, Gemma

OpenRouter

Access to 100+ models with unified billing.
Models: GPT-4, Claude, Gemini, and more

Together AI

Open-source models with fast inference.
Models: Llama, Mistral, Qwen

Anyscale

Serverless endpoints for OSS models.
Models: Llama, Mixtral, CodeLlama

Provider Lifecycle

Registration

Providers are registered during service initialization:
// Register executors in service builder
func (b *ServiceBuilder) Build() (*Service, error) {
    // Register Gemini CLI
    geminiCLIExec := executor.NewGeminiCLIExecutor(b.cfg)
    b.coreManager.RegisterExecutor(geminiCLIExec)
    
    // Register Ollama
    ollamaExec := executor.NewOllamaExecutor(b.cfg)
    b.coreManager.RegisterExecutor(ollamaExec)
    
    // Register OpenAI compat providers
    for _, compatCfg := range b.cfg.OpenAICompatibility {
        exec := executor.NewOpenAICompatExecutor(compatCfg.Name, b.cfg)
        b.coreManager.RegisterExecutor(exec)
    }
}

Discovery

Providers with auto-discover: true fetch available models:
// Ollama auto-discovery
func (e *OllamaExecutor) DiscoverModels() ([]string, error) {
    resp, err := e.client.Get(e.baseURL + "/api/tags")
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    // Decode the JSON tag list and return the model names
}

Health Checking

Providers can implement health checks:
type ProviderHealthChecker interface {
    CheckHealth(ctx context.Context) error
}

Model Naming Conventions

Providers use prefixes to avoid naming conflicts:
# Without prefix (uses first matching provider)
model: gpt-4o

# With provider prefix
model: openrouter/gpt-4o
model: groq/llama-3.2-90b
model: ollama:mistral

# With custom alias
model: sonnet  # Maps to claude-3-5-sonnet-20241022
Use force-model-prefix: true in config to require explicit prefixes for all requests.
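The prefix parsing can be sketched as follows; the `splitModelRef` helper is illustrative and tries the "/" separator before the ":" form used by Ollama references:

```go
package main

import (
	"fmt"
	"strings"
)

// splitModelRef separates an optional provider prefix from a model name,
// trying the "/" form (openrouter/gpt-4o) before the ":" form (ollama:mistral).
// A bare name such as an alias returns an empty provider.
func splitModelRef(ref string) (provider, model string) {
	if i := strings.IndexByte(ref, '/'); i >= 0 {
		return ref[:i], ref[i+1:]
	}
	if i := strings.IndexByte(ref, ':'); i >= 0 {
		return ref[:i], ref[i+1:]
	}
	return "", ref
}

func main() {
	p, m := splitModelRef("openrouter/gpt-4o")
	fmt.Println(p, m)
}
```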

Provider Metadata

Each provider auth includes metadata:
type Auth struct {
    Provider   string
    Metadata   map[string]any
    Attributes map[string]string
}

// Example metadata
metadata: {
    "source": "config_yaml",
    "base_url": "https://api.groq.com/openai/v1",
    "prefix": "groq",
    "auto_discover": true
}

Proxy Configuration

Providers can use HTTP proxies:
# Global proxy for all providers
proxy-url: "socks5://user:password@proxy-host:1080/"

# Per-provider proxy
gemini-api-key:
  - api-key: "AIzaSy..."
    proxy-url: "http://proxy.example.com:8080"
Supported proxy protocols: http://, https://, socks5://
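A minimal sketch of building a proxied client with the standard library, which accepts all three schemes via http.Transport's Proxy hook (the `newProxiedClient` helper is illustrative):

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newProxiedClient builds an *http.Client that routes traffic through the
// given proxy URL; http.Transport accepts http, https, and socks5 schemes.
func newProxiedClient(proxyURL string) (*http.Client, error) {
	u, err := url.Parse(proxyURL)
	if err != nil {
		return nil, err
	}
	return &http.Client{Transport: &http.Transport{Proxy: http.ProxyURL(u)}}, nil
}

func main() {
	c, err := newProxiedClient("socks5://user:pass@127.0.0.1:1080")
	fmt.Println(c != nil, err)
}
```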

Error Handling

Providers implement standardized error responses:
type Error struct {
    Code       string  // "auth_failed", "quota_exceeded", etc.
    Message    string
    HTTPStatus int     // 401, 429, 500, etc.
    Retryable  bool
}
Common errors by HTTP status:
  • 401: Invalid API key or expired token
  • 402/403: Payment required or forbidden
  • 404: Model not found
  • 429: Rate limit exceeded (triggers cooldown)
  • 500/502/503/504: Transient server errors
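A sketch of mapping HTTP statuses onto the Error struct above; the exact code strings and retry decisions are illustrative, not the proxy's actual mapping:

```go
package main

import "fmt"

// Error mirrors the standardized provider error shown above.
type Error struct {
	Code       string
	Message    string
	HTTPStatus int
	Retryable  bool
}

// classifyStatus maps an HTTP status to a provider Error. Only 429 and
// 5xx responses are treated as retryable.
func classifyStatus(status int) Error {
	switch {
	case status == 401:
		return Error{Code: "auth_failed", HTTPStatus: status, Retryable: false}
	case status == 402 || status == 403:
		return Error{Code: "forbidden", HTTPStatus: status, Retryable: false}
	case status == 404:
		return Error{Code: "model_not_found", HTTPStatus: status, Retryable: false}
	case status == 429:
		return Error{Code: "quota_exceeded", HTTPStatus: status, Retryable: true}
	case status >= 500:
		return Error{Code: "server_error", HTTPStatus: status, Retryable: true}
	default:
		return Error{Code: "unknown", HTTPStatus: status, Retryable: false}
	}
}

func main() {
	fmt.Println(classifyStatus(429).Code, classifyStatus(429).Retryable)
}
```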

Provider Quotas

Quota management prevents excessive retry storms:
type QuotaState struct {
    Exceeded      bool
    Reason        string
    NextRecoverAt time.Time
    BackoffLevel  int  // Exponential backoff level
}
Backoff schedule:
  • Level 0: 1 second
  • Level 1: 2 seconds
  • Level 2: 4 seconds
  • Max: 30 minutes
When a provider returns 429 Too Many Requests:
  1. Mark the credential as quota-exceeded
  2. Set NextRecoverAt based on Retry-After header or exponential backoff
  3. Skip this credential during selection until recovery time
  4. Reset quota state on successful request
case 429:
    var next time.Time
    if result.RetryAfter != nil {
        next = now.Add(*result.RetryAfter)
    } else {
        cooldown, nextLevel := nextQuotaCooldown(backoffLevel)
        next = now.Add(cooldown)
        backoffLevel = nextLevel
    }
    state.Quota = QuotaState{
        Exceeded:      true,
        NextRecoverAt: next,
        BackoffLevel:  backoffLevel,
    }

Next Steps

Add Provider

Configure a new provider in your setup

Routing Strategies

Learn how requests are routed to providers

Authentication

Understand authentication flows

Model Discovery

Automatic model discovery and registration
