What is CLI Proxy API?
CLI Proxy API is a proxy server that provides OpenAI/Gemini/Claude-compatible API interfaces for AI coding assistants. It lets you use local or multi-account CLI access to OpenAI, Gemini, Claude, and other AI coding tools through a unified API interface.

Why CLI Proxy API?
- Unified Interface: Single OpenAI-compatible API for multiple AI providers
- Multi-Account Management: Use multiple accounts with automatic load balancing
- OAuth Support: Authenticate with your existing subscriptions (no API keys needed)
- Smart Routing: Automatic failover, quota management, and model aliasing
- Hot Reload: Update configuration and credentials without restarting
Architecture
CLI Proxy API follows a layered architecture that translates requests between different AI provider formats.

Request Flow
When a client makes a request, it flows through these components:

1. API Server
The API server (internal/api/server.go) handles incoming HTTP requests and exposes endpoints in multiple API formats.
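For illustration, the sketch below builds the kind of OpenAI-style chat-completions payload a client would POST to the proxy. The struct names, the BuildChatRequest helper, and the model name are hypothetical and only mirror the general OpenAI request shape, not the project's own code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message and ChatRequest mirror the OpenAI chat-completions shape
// accepted by an OpenAI-compatible endpoint.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type ChatRequest struct {
	Model    string    `json:"model"`
	Messages []Message `json:"messages"`
	Stream   bool      `json:"stream"`
}

// BuildChatRequest marshals an OpenAI-style payload; a client would POST
// the result to the proxy's OpenAI-compatible chat-completions route.
func BuildChatRequest(model, prompt string) ([]byte, error) {
	return json.Marshal(ChatRequest{
		Model:    model,
		Messages: []Message{{Role: "user", Content: prompt}},
	})
}

func main() {
	body, _ := BuildChatRequest("gemini-2.5-pro", "Explain SSE in one sentence.")
	fmt.Println(string(body))
}
```

The same payload works unchanged regardless of which provider ultimately serves the model, which is the point of the unified interface.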
2. Authentication & Selection
The auth.Manager (sdk/cliproxy/auth/conductor.go) orchestrates credential selection:
- Loads credentials from the auth directory (~/.cli-proxy-api by default)
- Selects the best credential using the configured routing strategy
- Handles cooldowns for quota-exceeded accounts
- Auto-refreshes tokens in the background
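The selection logic above can be sketched as a round-robin pick that skips cooled-down accounts. This is a simplified stand-in, assuming a hypothetical Credential type and NextCredential helper, not the SDK's actual types:

```go
package main

import "fmt"

// Credential is a simplified stand-in for a managed auth entry.
type Credential struct {
	ID      string
	Cooling bool // true while the account is in quota cooldown
}

// NextCredential picks the first ready credential at or after the
// round-robin cursor, skipping any that are cooling down. It returns
// the chosen ID and the advanced cursor, or "" if none are ready.
func NextCredential(creds []Credential, cursor int) (string, int) {
	n := len(creds)
	for i := 0; i < n; i++ {
		c := creds[(cursor+i)%n]
		if !c.Cooling {
			return c.ID, (cursor + i + 1) % n
		}
	}
	return "", cursor
}

func main() {
	creds := []Credential{{ID: "acct-1", Cooling: true}, {ID: "acct-2"}, {ID: "acct-3"}}
	id, cursor := NextCredential(creds, 0)
	fmt.Println(id) // acct-2 (acct-1 is cooling down)
	id, _ = NextCredential(creds, cursor)
	fmt.Println(id) // acct-3
}
```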
3. Request Translation
Translators (internal/translator/) convert between API formats:
- translator/gemini/ - Gemini API format
- translator/claude/ - Claude API format
- translator/codex/ - OpenAI Codex format
- translator/antigravity/ - Antigravity format
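As a flavor of what such a translation involves, the sketch below maps OpenAI-style chat messages onto the Gemini API's "contents" shape (Gemini uses the role "model" where OpenAI uses "assistant"). The function and struct names are illustrative, not the project's translator code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Simplified OpenAI-style input message.
type OAIMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// Gemini-style output: a "contents" array of role + parts.
type GeminiPart struct {
	Text string `json:"text"`
}

type GeminiContent struct {
	Role  string       `json:"role"`
	Parts []GeminiPart `json:"parts"`
}

// ToGeminiContents maps OpenAI chat messages onto Gemini "contents".
// System prompts are handled separately in a real translator (Gemini
// carries them in systemInstruction), so they are skipped here.
func ToGeminiContents(msgs []OAIMessage) []GeminiContent {
	out := make([]GeminiContent, 0, len(msgs))
	for _, m := range msgs {
		role := m.Role
		if role == "assistant" {
			role = "model" // Gemini's name for the assistant role
		}
		if role == "system" {
			continue
		}
		out = append(out, GeminiContent{Role: role, Parts: []GeminiPart{{Text: m.Content}}})
	}
	return out
}

func main() {
	contents := ToGeminiContents([]OAIMessage{
		{Role: "user", Content: "hi"},
		{Role: "assistant", Content: "hello"},
	})
	b, _ := json.Marshal(contents)
	fmt.Println(string(b))
}
```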
4. Provider Execution
Executors (sdk/cliproxy/executor/types.go) handle the actual API calls:
- Inject credentials into the provider request
- Execute HTTP calls to the provider API
- Handle streaming with SSE or other formats
- Report errors for quota management
5. Response Translation
The response flows back through translators that convert provider responses to OpenAI format.

Core Concepts
Providers
A provider represents an AI service backend. Supported providers:
- Gemini CLI - Google’s Gemini via OAuth
- AI Studio - Google AI Studio API keys
- Vertex AI - Google Cloud Vertex AI
- Claude Code - Anthropic Claude via OAuth
- Codex - OpenAI GPT models via OAuth
- Qwen Code - Alibaba Qwen models
- iFlow - Z.ai’s GLM models
- Antigravity - Google’s code assistance
- Kimi - Moonshot AI models
- OpenAI Compatibility - Any OpenAI-compatible endpoint
Each provider defines its own:
- Authentication method (OAuth, API key, service account)
- Request/response format
- Model catalog
- Quota limits
Executors
Executors implement the ProviderExecutor interface (see sdk/cliproxy/auth/conductor.go).
Model Registry
The ModelRegistry (internal/registry/model_registry.go) manages available models:
- Dynamic model list - Models appear/disappear based on available credentials
- Quota tracking - Hides models when all accounts hit quota
- Multi-provider - Same model can be served by multiple providers
- Reference counting - Tracks how many accounts can serve each model
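The reference-counting idea can be sketched as a map from model name to the number of accounts that can serve it; a model stays listed only while its count is positive. The type and method names below are illustrative, not the registry's actual API:

```go
package main

import (
	"fmt"
	"sort"
)

// ModelRegistry counts how many credentials can currently serve each model.
type ModelRegistry struct {
	refs map[string]int
}

func NewModelRegistry() *ModelRegistry { return &ModelRegistry{refs: map[string]int{}} }

// Register is called when a credential that serves the model becomes ready.
func (r *ModelRegistry) Register(model string) { r.refs[model]++ }

// Unregister is called when a credential is removed or enters cooldown;
// the model disappears once no account can serve it.
func (r *ModelRegistry) Unregister(model string) {
	if r.refs[model] > 0 {
		r.refs[model]--
	}
	if r.refs[model] == 0 {
		delete(r.refs, model)
	}
}

// Models returns the currently visible model names, sorted.
func (r *ModelRegistry) Models() []string {
	out := make([]string, 0, len(r.refs))
	for m := range r.refs {
		out = append(out, m)
	}
	sort.Strings(out)
	return out
}

func main() {
	reg := NewModelRegistry()
	reg.Register("model-a") // account 1
	reg.Register("model-a") // account 2 serves the same model
	reg.Register("model-b")
	reg.Unregister("model-b") // last account for model-b hits quota
	fmt.Println(reg.Models()) // [model-a]
}
```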
Auth State Management
The scheduler (sdk/cliproxy/auth/scheduler.go) maintains credential state. When an account exceeds its quota:
- Credential enters cooldown state
- Next request tries a different credential
- After cooldown expires, credential returns to ready state
Configuration Architecture
Configuration is loaded from config.yaml.
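A rough sketch of what such a file might look like follows. The key names and values here are illustrative placeholders only, not the project's actual schema; consult the config.yaml shipped with the release for the real options:

```yaml
# Illustrative only: key names are placeholders, not the real schema.
port: 8317                 # address the proxy listens on
auth-dir: ~/.cli-proxy-api # where credential files are stored
api-keys:                  # keys clients must present to this proxy
  - sk-local-example
```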
Hot Reload
CLI Proxy API watches for changes in:
- Config file - Automatically reloads on changes
- Auth directory - Picks up new credentials immediately
- Token refresh - Refreshes expired OAuth tokens in background
This makes the following possible without a restart:
- Adding new accounts
- Updating model mappings
- Changing routing strategy
- Modifying API keys
Streaming Support
All providers support streaming responses via Server-Sent Events (SSE):
- Keep-alive: Periodic blank lines prevent timeouts
- Bootstrap retries: Retry before first byte is sent
- Graceful errors: Errors sent as SSE events
Next Steps
Authentication
Learn about OAuth flows and multi-account management
Providers
Explore supported providers and their features
Routing
Understand request routing and load balancing
Configuration
Configure your proxy server