
Base URL

All API requests should be made to:
http://localhost:11434
The port can be configured using the PORT environment variable (defaults to 11434).
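For example, exporting PORT before starting the proxy moves it off the default port; clients must then target the new base URL:

```shell
# Choose a non-default port; the proxy reads PORT at startup.
export PORT=8080
# All requests then go to this base URL instead of http://localhost:11434.
BASE_URL="http://localhost:${PORT}"
echo "$BASE_URL"
```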

Authentication

The Ollama API Proxy does not require authentication. However, you must configure provider API keys (OpenAI, Google Gemini, or OpenRouter) via environment variables for the proxy to function.
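A typical setup exports at least one provider key before launching the proxy. The variable names below are assumptions for illustration — check the proxy's configuration reference for the exact names it reads:

```shell
# Hypothetical variable names — verify against the proxy's own docs.
# At least one provider key must be configured.
export OPENAI_API_KEY="sk-..."      # OpenAI
export GEMINI_API_KEY="..."         # Google Gemini
export OPENROUTER_API_KEY="..."     # OpenRouter
```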

Available Endpoints

The Ollama API Proxy provides the following endpoints:

Health Check

GET /
A simple health check endpoint that returns a plain-text message indicating the proxy is running.
curl http://localhost:11434/
Response:
Ollama is running in proxy mode.
This endpoint is useful for:
  • Verifying the proxy server is running
  • Basic connectivity tests
  • Health checks in orchestration systems

Chat Generation

POST /api/chat
Generate chat completions with conversation history. Supports both streaming and non-streaming responses.
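For example, a non-streaming chat request looks like this. The model name is illustrative — use one listed by /api/tags for your configured provider:

```shell
# Requires the proxy to be running on the default port.
curl -s http://localhost:11434/api/chat -d '{
  "model": "gpt-4o-mini",
  "messages": [
    {"role": "user", "content": "Why is the sky blue?"}
  ],
  "stream": false
}'
```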

Text Generation

POST /api/generate
Generate text completions from a single prompt. Supports both streaming and non-streaming responses.
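A minimal non-streaming example, with an illustrative model name:

```shell
# Requires the proxy to be running on the default port.
curl -s http://localhost:11434/api/generate -d '{
  "model": "gpt-4o-mini",
  "prompt": "Write a haiku about the sea.",
  "stream": false
}'
```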

List Models

GET /api/tags
Retrieve the list of available models based on the configured providers.
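For example, to print just the model names — assuming the response follows Ollama's {"models": [{"name": ...}]} shape:

```shell
# Requires the proxy to be running; the JSON shape is an assumption
# based on Ollama's own /api/tags response.
curl -s http://localhost:11434/api/tags |
  python3 -c 'import json,sys; print("\n".join(m["name"] for m in json.load(sys.stdin)["models"]))'
```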

Version Info

GET /api/version
Get the current version of the Ollama API Proxy.

Response Format

All endpoints return JSON responses with appropriate HTTP status codes; the one exception is the root health check, which returns plain text:
  • 200: Success
  • 404: Endpoint not found
  • 500: Internal server error

CORS Support

All endpoints include CORS headers to allow cross-origin requests:
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Allow-Headers: Content-Type

Streaming

Streaming responses use the application/x-ndjson content type and return newline-delimited JSON objects. Each chunk contains a partial response, with the final chunk marked by done: true.
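A sketch of consuming such a stream in the shell. The sample chunks are inlined for illustration and their field names assume Ollama's chat stream format; in practice you would pipe a curl -sN request with "stream": true into the loop:

```shell
# Replace the printf with a real streaming request, e.g.:
#   curl -sN http://localhost:11434/api/chat -d '{"model": "...", "stream": true, ...}'
printf '%s\n' \
  '{"message":{"content":"Hel"},"done":false}' \
  '{"message":{"content":"lo"},"done":true}' |
while IFS= read -r chunk; do
  # Each line is one JSON object; print its partial content without a newline.
  printf '%s' "$chunk" | python3 -c 'import json,sys; print(json.load(sys.stdin)["message"]["content"], end="")'
done
echo
```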
