Overview
PicoClaw supports any OpenAI-compatible API endpoint, enabling you to use:
- Custom API proxies and gateways
- Self-hosted models (vLLM, Ollama)
- LiteLLM proxy for unified access
- Local inference servers
- Enterprise deployments
Custom API Endpoints
Basic Configuration
Any OpenAI-compatible endpoint can be configured:
{
"model_list": [
{
"model_name": "my-custom-model",
"model": "openai/custom-model",
"api_base": "https://my-api.example.com/v1",
"api_key": "your-api-key",
"request_timeout": 300
}
],
"agents": {
"defaults": {
"model_name": "my-custom-model"
}
}
}
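Before launching, it can help to sanity-check that each model_list entry carries the required fields. The validate_entry helper below is a hypothetical sketch for illustration, not part of PicoClaw:

```python
REQUIRED = ("model_name", "model", "api_base")

def validate_entry(entry):
    """Return the required model_list keys missing from a config entry."""
    return [key for key in REQUIRED if key not in entry]

entry = {
    "model_name": "my-custom-model",
    "model": "openai/custom-model",
    "api_base": "https://my-api.example.com/v1",
}
print(validate_entry(entry))                 # []
print(validate_entry({"model_name": "x"}))   # ['model', 'api_base']
```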
Configuration Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model_name | string | Yes | - | Alias for this model configuration |
| model | string | Yes | - | Model identifier (any prefix) |
| api_base | string | Yes | - | Your custom API endpoint URL |
| api_key | string | No | - | API key (if required by the endpoint) |
| request_timeout | integer | No | 120 | Request timeout in seconds |
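In practice, "OpenAI-compatible" means the endpoint accepts the standard chat-completions request shape built from these parameters. A minimal stdlib sketch; build_chat_request is an illustrative helper, not a PicoClaw API:

```python
import json

def build_chat_request(api_base, api_key, model, prompt):
    """Assemble a standard OpenAI-style chat-completions request."""
    url = api_base.rstrip("/") + "/chat/completions"
    headers = {"Content-Type": "application/json"}
    if api_key:  # api_key is optional, e.g. for local servers
        headers["Authorization"] = "Bearer " + api_key
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_chat_request(
    "https://my-api.example.com/v1", "your-api-key", "custom-model", "Hello")
print(url)  # https://my-api.example.com/v1/chat/completions
```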
LiteLLM Proxy
What is LiteLLM?
LiteLLM is a unified proxy that translates requests across 100+ LLM providers. It provides:
- Single API for multiple providers
- Load balancing and fallbacks
- Cost tracking and budgets
- Rate limiting
- Caching
Setup LiteLLM
1. Install LiteLLM
pip install litellm[proxy]
2. Create Configuration
Create litellm_config.yaml:
model_list:
- model_name: gpt-4
litellm_params:
model: openai/gpt-4
api_key: sk-...
- model_name: claude
litellm_params:
model: anthropic/claude-sonnet-4.6
api_key: sk-ant-...
- model_name: llama
litellm_params:
model: ollama/llama3
api_base: http://localhost:11434
general_settings:
master_key: sk-1234 # Your LiteLLM proxy key
3. Start LiteLLM Proxy
litellm --config litellm_config.yaml --port 4000
4. Configure PicoClaw
Edit ~/.picoclaw/config.json:
{
"model_list": [
{
"model_name": "gpt4",
"model": "litellm/gpt-4",
"api_base": "http://localhost:4000/v1",
"api_key": "sk-1234"
},
{
"model_name": "claude",
"model": "litellm/claude",
"api_base": "http://localhost:4000/v1",
"api_key": "sk-1234"
}
]
}
PicoClaw strips the litellm/ prefix, so litellm/gpt-4 sends gpt-4 to the proxy.
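The stripping behavior can be illustrated with a one-liner; this is a sketch of the described behavior, not PicoClaw's actual source:

```python
def strip_provider_prefix(model):
    """Drop the leading 'provider/' segment, keeping the remainder intact."""
    return model.split("/", 1)[1] if "/" in model else model

print(strip_provider_prefix("litellm/gpt-4"))             # gpt-4
print(strip_provider_prefix("vllm/Llama-3-8B-Instruct"))  # Llama-3-8B-Instruct
```

Only the first path segment is removed, so model names that themselves contain slashes pass through unchanged after the prefix.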
5. Test Connection
picoclaw agent -m "Test LiteLLM proxy"
Advanced LiteLLM Features
Load Balancing
LiteLLM config with two deployments sharing the same model_name; LiteLLM load-balances requests across them:
model_list:
- model_name: gpt-4
litellm_params:
model: openai/gpt-4
api_key: sk-key1
api_base: https://api1.example.com/v1
- model_name: gpt-4
litellm_params:
model: openai/gpt-4
api_key: sk-key2
api_base: https://api2.example.com/v1
PicoClaw config:
{
"model_list": [
{
"model_name": "gpt4",
"model": "litellm/gpt-4",
"api_base": "http://localhost:4000/v1",
"api_key": "sk-1234"
}
]
}
vLLM (Self-Hosted)
What is vLLM?
vLLM is a high-performance inference server for running LLMs locally or in the cloud.
Setup vLLM
1. Install vLLM
pip install vllm
2. Start vLLM Server
vllm serve meta-llama/Llama-3-8B-Instruct \
--host 0.0.0.0 \
--port 8000 \
--api-key your-api-key
3. Configure PicoClaw
Edit ~/.picoclaw/config.json:
{
"model_list": [
{
"model_name": "llama3",
"model": "vllm/Llama-3-8B-Instruct",
"api_base": "http://localhost:8000/v1",
"api_key": "your-api-key",
"request_timeout": 600
}
],
"agents": {
"defaults": {
"model_name": "llama3"
}
}
}
4. Test Connection
picoclaw agent -m "Test vLLM server"
vLLM with Multiple GPUs
vllm serve meta-llama/Llama-3-70B-Instruct \
--tensor-parallel-size 4 \
--host 0.0.0.0 \
--port 8000
vLLM Best Practices
- GPU memory: Ensure sufficient VRAM for your model
- Batch size: Tune for throughput vs. latency
- Context length: Set --max-model-len appropriately
- Timeouts: Increase request_timeout for large contexts
Ollama (Local Models)
What is Ollama?
Ollama makes it easy to run open-source LLMs locally on your machine.
Setup Ollama
1. Install Ollama
# macOS / Linux
curl -fsSL https://ollama.ai/install.sh | sh
# Or download from https://ollama.ai
2. Pull a Model
ollama pull llama3
Browse https://ollama.ai for other available models.
3. Start Ollama Server
ollama serve
Default endpoint: http://localhost:11434
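To confirm the server is reachable, you can list models through its OpenAI-compatible /v1/models route. A minimal stdlib sketch, with the parsing factored out so it can be exercised against a sample payload (the sample follows the standard OpenAI models-list shape):

```python
import json

def model_ids(raw):
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in json.loads(raw)["data"]]

# Against a live server:
#   from urllib.request import urlopen
#   raw = urlopen("http://localhost:11434/v1/models").read().decode()

sample = '{"object": "list", "data": [{"id": "llama3", "object": "model"}]}'
print(model_ids(sample))  # ['llama3']
```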
4. Configure PicoClaw
Edit ~/.picoclaw/config.json:
{
"model_list": [
{
"model_name": "llama3",
"model": "ollama/llama3",
"api_base": "http://localhost:11434/v1"
}
],
"agents": {
"defaults": {
"model_name": "llama3"
}
}
}
No API key needed for Ollama - it’s completely local!
5. Test Connection
picoclaw agent -m "Test Ollama"
Ollama Best Practices
- Model selection: Choose models that fit your hardware
- Context window: Larger models support longer contexts
- Performance: Use GPU for better performance
- Updates: Keep Ollama updated for latest features
Custom Proxy Configuration
HTTP Proxy
Route requests through an HTTP proxy:
{
"model_list": [
{
"model_name": "proxied-model",
"model": "openai/gpt-4",
"api_base": "https://api.openai.com/v1",
"api_key": "sk-..."
}
],
"providers": {
"openai": {
"proxy": "http://proxy.example.com:8080"
}
}
}
Reverse Proxy
Run your own reverse proxy:
# nginx.conf
server {
listen 443 ssl;
server_name my-llm-proxy.com;
location /v1/ {
proxy_pass https://api.openai.com/v1/;
proxy_set_header Authorization "Bearer sk-...";
proxy_set_header Content-Type "application/json";
}
}
PicoClaw config:
{
"model_list": [
{
"model_name": "gpt4",
"model": "openai/gpt-4",
"api_base": "https://my-llm-proxy.com/v1"
}
]
}
Enterprise Deployments
Azure OpenAI
{
"model_list": [
{
"model_name": "azure-gpt4",
"model": "openai/gpt-4",
"api_base": "https://your-resource.openai.azure.com/openai/deployments/gpt-4",
"api_key": "your-azure-key"
}
]
}
AWS Bedrock
Use through LiteLLM proxy:
model_list:
- model_name: claude-bedrock
litellm_params:
model: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
aws_access_key_id: xxx
aws_secret_access_key: xxx
aws_region_name: us-east-1
GCP Vertex AI
Use through LiteLLM proxy:
model_list:
- model_name: gemini-vertex
litellm_params:
model: vertex_ai/gemini-pro
vertex_project: your-project
vertex_location: us-central1
Troubleshooting
Connection Refused
Ensure your custom endpoint is running:
curl http://localhost:8000/v1/models
Timeout Errors
Increase timeout for slow endpoints:
{
"model_name": "slow-model",
"model": "custom/model",
"api_base": "http://localhost:8000/v1",
"request_timeout": 600
}
API Key Issues
Some endpoints don’t require keys:
{
"model_name": "local-model",
"model": "ollama/llama3",
"api_base": "http://localhost:11434/v1"
}
Omit api_key for local servers.
Protocol Mismatches
Ensure your endpoint is OpenAI-compatible:
- Endpoint: /v1/chat/completions
- Request format: OpenAI JSON schema
- Response format: OpenAI JSON schema
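These checks can be automated with a small shape test. looks_openai_compatible below is an illustrative helper that inspects only the minimal response fields:

```python
def looks_openai_compatible(resp):
    """Check the minimal chat-completions response shape."""
    try:
        choice = resp["choices"][0]
        return isinstance(choice["message"]["content"], str)
    except (KeyError, IndexError, TypeError):
        return False

ok = {"choices": [{"message": {"role": "assistant", "content": "hi"}}]}
print(looks_openai_compatible(ok))              # True
print(looks_openai_compatible({"text": "hi"}))  # False
```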
Best Practices
- Multi-provider setups: Use LiteLLM for complex routing
- Local development: Use Ollama for privacy
- Production: Use vLLM for performance
- Monitoring: Add health checks and logging
- Security: Use HTTPS and authentication
- Timeouts: Set appropriate timeouts for your use case
- Fallbacks: Configure backup providers
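The fallback idea can be sketched as a simple loop over model aliases. Both complete_with_fallback and the fake_send stub are hypothetical placeholders for illustration, not PicoClaw features:

```python
def complete_with_fallback(aliases, send):
    """Try each model alias in order, returning the first success."""
    errors = {}
    for name in aliases:
        try:
            return send(name)
        except Exception as exc:  # in practice, catch specific errors
            errors[name] = exc
    raise RuntimeError("all providers failed: %r" % errors)

def fake_send(name):
    """Stub transport: the primary provider is down."""
    if name == "primary":
        raise ConnectionError("primary down")
    return "response from " + name

print(complete_with_fallback(["primary", "fallback"], fake_send))
# response from fallback
```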
Example Configurations
Multi-Provider Setup
{
"model_list": [
{
"model_name": "primary",
"model": "openai/gpt-5.2",
"api_key": "sk-..."
},
{
"model_name": "fallback",
"model": "anthropic/claude-sonnet-4.6",
"api_key": "sk-ant-..."
},
{
"model_name": "local",
"model": "ollama/llama3",
"api_base": "http://localhost:11434/v1"
},
{
"model_name": "proxy",
"model": "litellm/gpt-4",
"api_base": "http://localhost:4000/v1",
"api_key": "sk-1234"
}
]
}
Development Setup
{
"model_list": [
{
"model_name": "dev",
"model": "ollama/llama3",
"api_base": "http://localhost:11434/v1"
}
],
"agents": {
"defaults": {
"model_name": "dev",
"max_tokens": 2048
}
}
}