Cline Integration - CLI Proxy API

Overview

Cline (formerly Claude Dev) is a VS Code extension that provides autonomous AI coding assistance. Cline supports OpenAI-compatible API endpoints, allowing you to use CLI Proxy API with your Google/ChatGPT/Claude OAuth subscriptions.

Configuration

Install Cline

Install the Cline extension from the VS Code marketplace:

Open VS Code
Go to Extensions (Cmd+Shift+X on macOS, Ctrl+Shift+X on Windows/Linux)
Search for “Cline”
Click Install

Start CLI Proxy API

Ensure CLI Proxy API is running:

./cliproxyapi

The server will listen on http://localhost:8317 by default.

Configure Cline API Settings

Open Cline settings in VS Code:

Open Command Palette (Cmd+Shift+P on macOS, Ctrl+Shift+P on Windows/Linux)
Type “Cline: Open Settings”
Select OpenAI Compatible as the API provider

Set Endpoint and API Key

Configure the connection:

API Provider: OpenAI Compatible
Base URL: http://localhost:8317/v1
API Key: Use any key from your api-keys list in config.yaml
Model: Select from available models (e.g., gemini-2.5-pro, claude-sonnet-4)

Configuration Examples

Using Gemini OAuth

If you have Gemini CLI OAuth configured:

settings.json

{
  "cline.apiProvider": "openai-compatible",
  "cline.baseUrl": "http://localhost:8317/v1",
  "cline.apiKey": "your-api-key-1",
  "cline.model": "gemini-2.5-pro"
}

Using Claude OAuth

If you have Claude Code OAuth configured:

settings.json

{
  "cline.apiProvider": "openai-compatible",
  "cline.baseUrl": "http://localhost:8317/v1",
  "cline.apiKey": "your-api-key-1",
  "cline.model": "claude-sonnet-4"
}

Using OpenAI Codex

If you have OpenAI Codex OAuth configured:

settings.json

{
  "cline.apiProvider": "openai-compatible",
  "cline.baseUrl": "http://localhost:8317/v1",
  "cline.apiKey": "your-api-key-1",
  "cline.model": "gpt-5"
}

Advanced Configuration

Multiple API Keys

If you want to use different API keys for different projects:

config.yaml

api-keys:
  - "project-a-key"
  - "project-b-key"
  - "personal-key"

Then configure Cline per-project using workspace settings (.vscode/settings.json):

.vscode/settings.json

{
  "cline.apiKey": "project-a-key",
  "cline.model": "gemini-2.5-pro"
}

Model Prefixes

If you have multiple credentials with prefixes:

config.yaml

gemini-api-key:
  - api-key: "AIzaSy...01"
    prefix: "work"
  - api-key: "AIzaSy...02"
    prefix: "personal"

Use the prefix in your model selection:

settings.json

{
  "cline.model": "work/gemini-2.5-pro"
}

Custom Model Aliases

Create model aliases for easier switching:

config.yaml

oauth-model-alias:
  gemini-cli:
    - name: "gemini-2.5-pro"
      alias: "g2.5p"
  claude:
    - name: "claude-sonnet-4-5-20250929"
      alias: "cs4.5"

Then in Cline:

settings.json

{
  "cline.model": "g2.5p"  // Maps to gemini-2.5-pro
}

Features

Autonomous Coding

Cline can autonomously:

Read and analyze your codebase
Create, edit, and delete files
Run terminal commands
Search for information
Debug issues

Streaming Responses

Cline supports streaming responses through CLI Proxy API:

Real-time code generation
Progressive task execution
Instant feedback on actions

Function Calling

For models that support function calling (Gemini, OpenAI, Claude):

File system operations
Code analysis tools
Search and navigation
Terminal integration

Multimodal Support

For models that support images (Gemini, Claude):

Analyze screenshots
Debug UI issues
Design reviews

Workflow Examples

Example 1: Full-Stack Development

Configure Cline

settings.json

{
  "cline.baseUrl": "http://localhost:8317/v1",
  "cline.apiKey": "your-api-key-1",
  "cline.model": "gemini-2.5-pro"
}

Give Cline a Task

Open Cline panel and type:

Create a REST API with Express.js that has CRUD endpoints for a task management system

Review and Approve

Cline will:

Create necessary files
Write the code
Ask for approval before executing commands

Test and Iterate

Continue the conversation:

Add input validation and error handling

Example 2: Bug Fixing

Select Claude Model

settings.json

{
  "cline.model": "claude-sonnet-4"
}

Describe the Bug

The user authentication is failing with a 401 error. 
Check the auth middleware and fix the issue.

Let Cline Investigate

Cline will:

Read relevant files
Analyze the code
Propose fixes
Execute tests

Troubleshooting

Connection Errors

If Cline shows connection errors:

Verify CLI Proxy API is running:
```
curl http://localhost:8317/v1/models
```
Check the Base URL in Cline settings matches your config
Ensure no firewall is blocking localhost connections

Authentication Failures

If you see “Invalid API key” or authentication errors:

Verify the API key in Cline matches one in your config.yaml:
config.yaml
```
api-keys:
  - "your-api-key-1"
```
Check for whitespace or special characters
Restart CLI Proxy API after config changes

Model Not Available

If Cline can’t use a specific model:

Authenticate with the provider first:

./cliproxyapi gemini login
./cliproxyapi claude login

Verify the model is listed:

curl -H "Authorization: Bearer your-api-key-1" \
  http://localhost:8317/v1/models | jq '.data[].id'

Check provider configuration in config.yaml

Slow Responses

If responses are slower than expected:

Check your network connection to OAuth providers
Enable debug logging to see timing info:
config.yaml
```
debug: true
```
Consider using faster models (e.g., gemini-2.5-flash instead of gemini-2.5-pro)
Configure multiple accounts for load balancing

Best Practices

1. Choose the Right Model

Quick tasks: gemini-2.5-flash, claude-haiku-4
Complex reasoning: gemini-2.5-pro, claude-sonnet-4
Maximum capability: claude-opus-4, gpt-5

2. Use Workspace Settings

Create project-specific settings in .vscode/settings.json:

.vscode/settings.json

{
  "cline.apiKey": "project-specific-key",
  "cline.model": "gemini-2.5-pro"
}

3. Configure Multiple Providers

Authenticate with multiple providers for redundancy:

./cliproxyapi gemini login
./cliproxyapi claude login
./cliproxyapi codex login

4. Enable Load Balancing

config.yaml

routing:
  strategy: "round-robin"
  
max-retry-credentials: 3

5. Monitor Usage

config.yaml

usage-statistics-enabled: true

Then check usage via Management API or logs.

Performance Optimization

Reduce Latency

config.yaml

# Enable streaming for faster first-token response
streaming:
  keepalive-seconds: 15
  bootstrap-retries: 1

Handle Rate Limits

config.yaml

quota-exceeded:
  switch-project: true
  switch-preview-model: true
  
max-retry-credentials: 3
max-retry-interval: 30

Optimize for Concurrent Requests

config.yaml

# Enable commercial mode for high-concurrency environments
commercial-mode: true

# Configure multiple accounts
gemini-api-key:
  - api-key: "key1"
  - api-key: "key2"
  - api-key: "key3"

Get Started

Core Concepts

Configuration

OAuth Authentication

Integrations

Deployment

​Overview

​Configuration

​Configuration Examples

​Using Gemini OAuth

​Using Claude OAuth

​Using OpenAI Codex

​Advanced Configuration

​Multiple API Keys

​Model Prefixes

​Custom Model Aliases

​Features

​Autonomous Coding

​Streaming Responses

​Function Calling

​Multimodal Support

​Workflow Examples

​Example 1: Full-Stack Development

​Example 2: Bug Fixing

​Troubleshooting

​Connection Errors

​Authentication Failures

​Model Not Available

​Slow Responses

​Best Practices

​1. Choose the Right Model

​2. Use Workspace Settings

​3. Configure Multiple Providers

​4. Enable Load Balancing

​5. Monitor Usage

​Performance Optimization

​Reduce Latency

​Handle Rate Limits

​Optimize for Concurrent Requests

​See Also

Build docs developers (and LLMs) love

Overview

Configuration

Configuration Examples

Using Gemini OAuth

Using Claude OAuth

Using OpenAI Codex

Advanced Configuration

Multiple API Keys

Model Prefixes

Custom Model Aliases

Features

Autonomous Coding

Streaming Responses

Function Calling

Multimodal Support

Workflow Examples

Example 1: Full-Stack Development

Example 2: Bug Fixing

Troubleshooting

Connection Errors

Authentication Failures

Model Not Available

Slow Responses

Best Practices

1. Choose the Right Model

2. Use Workspace Settings

3. Configure Multiple Providers

4. Enable Load Balancing

5. Monitor Usage

Performance Optimization

Reduce Latency

Handle Rate Limits

Optimize for Concurrent Requests

See Also