This guide will help you set up the Ollama API Proxy and test it with a real request. You’ll be able to use commercial LLMs like OpenAI, Google Gemini, and OpenRouter with tools that support the Ollama API format.

Prerequisites

Before you begin, ensure you have:
  • Node.js (for npx) or Bun (for bunx) installed
  • An API key for at least one supported provider: OpenAI, Google Gemini, or OpenRouter

Quick Start with npx/bunx

The fastest way to get started is using npx (Node.js) or bunx (Bun) - no installation required!
Step 1: Create a configuration directory

Create a dedicated directory for your proxy configuration:
mkdir ollama-proxy-config
cd ollama-proxy-config
Step 2: Set up environment variables

Create a .env file with your API keys. You only need to include the providers you want to use:
OPENAI_API_KEY=sk-proj-...
GEMINI_API_KEY=AIzaSy...
OPENROUTER_API_KEY=sk-or-v1-...
At least one API key is required. The proxy will exit with an error if no API keys are found.
Optional configuration:
PORT=11434  # Default port (same as Ollama)
OPENROUTER_API_URL=https://openrouter.ai/api/v1  # Custom OpenRouter URL
NODE_ENV=production  # Set to 'production' to disable timestamps in logs
Step 3: Start the proxy server

Run the proxy using npx or bunx:
npx ollama-api-proxy
You should see output similar to:
✅ Loaded models from models.json
🚀 Ollama Proxy with Streaming running on http://localhost:11434
🔑 Providers: openai, google, openrouter
📋 Available models: gpt-4o-mini, gpt-4.1-mini, gpt-4.1-nano, gpt-4o, gemini-2.5-flash, gemini-2.5-flash-lite, deepseek-r1
Step 4: Verify the server is running

Open a new terminal and check the server status:
curl http://localhost:11434/api/version
Expected response:
{"version":"1.0.1e"}
List available models:
curl http://localhost:11434/api/tags
This will return all available models based on your configured API keys.
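If you prefer to inspect the model list programmatically, here is a minimal Python sketch. The response shape (a top-level models array whose entries carry a name field) follows the standard Ollama /api/tags format; verify it against your proxy's actual output.

```python
import json

def model_names(tags_json: str) -> list[str]:
    """Extract model names from an Ollama-style /api/tags response."""
    return [m["name"] for m in json.loads(tags_json)["models"]]

# Against a running proxy you would fetch the JSON first, e.g. with
# urllib.request.urlopen("http://localhost:11434/api/tags").
# Offline demo with a sample payload in the assumed shape:
sample = '{"models": [{"name": "gpt-4o-mini"}, {"name": "gemini-2.5-flash"}]}'
print(model_names(sample))  # ['gpt-4o-mini', 'gemini-2.5-flash']
```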
Step 5: Test with a sample request

Send a test chat completion request:
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Say hello!"
      }
    ],
    "stream": false
  }'
Try different models like gemini-2.5-flash or deepseek-r1 if you have the corresponding API keys configured!
For streaming responses, set "stream": true:
curl -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Write a haiku about coding"
      }
    ],
    "stream": true
  }'
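With "stream": true, the Ollama chat API returns newline-delimited JSON: one chunk per line, the text under message.content, and a final chunk whose done field is true. A hedged Python sketch for reassembling the full reply (chunk shape assumed from the standard Ollama streaming format, not verified against this proxy):

```python
import json

def assemble_stream(ndjson: str) -> str:
    """Join the content of each streamed chat chunk into one string."""
    parts = []
    for line in ndjson.splitlines():
        if not line.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # final chunk reached
    return "".join(parts)

# Offline demo with two chunks in the assumed shape:
sample = (
    '{"message": {"role": "assistant", "content": "Hello"}, "done": false}\n'
    '{"message": {"role": "assistant", "content": " world"}, "done": true}\n'
)
print(assemble_stream(sample))  # Hello world
```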

Configure JetBrains AI Assistant

Now that your proxy is running, you can connect JetBrains AI Assistant to use commercial LLMs:
Step 1: Open JetBrains AI Assistant settings

In your JetBrains IDE:
  1. Go to Settings/Preferences → Tools → AI Assistant
  2. Select Ollama as your provider
Step 2: Configure the Ollama URL

Set the Ollama server URL to:
http://localhost:11434
Step 3: Select your model

Choose one of the available models from the dropdown:
  • gpt-4o-mini - Fast and efficient OpenAI model (recommended for most tasks)
  • gpt-4o - More capable OpenAI model
  • gemini-2.5-flash - Google’s fast model with large context
  • deepseek-r1 - Free reasoning model via OpenRouter
The gpt-4o-mini model offers the best balance of speed and quality for most coding tasks.
Step 4: Start using AI Assistant

You can now use JetBrains AI Assistant with your chosen commercial LLM! Try:
  • Code completion and suggestions
  • Explaining code
  • Generating tests
  • Refactoring assistance

Supported API Endpoints

The proxy implements the following Ollama-compatible endpoints:

  • /api/chat - Chat completion with message history support
  • /api/generate - Single-turn text generation from a prompt
  • /api/tags - List all available models
  • /api/version - Get proxy version information
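As a quick illustration of the two generation endpoints, here is a Python sketch that builds the request URL and JSON body for each. The /api/chat body mirrors the curl examples above; the single prompt field for /api/generate is assumed from the standard Ollama API and should be checked against your proxy.

```python
import json

BASE = "http://localhost:11434"  # default proxy port

def chat_request(model: str, messages: list, stream: bool = False):
    """URL and JSON body for /api/chat (multi-turn, message history)."""
    body = {"model": model, "messages": messages, "stream": stream}
    return f"{BASE}/api/chat", json.dumps(body)

def generate_request(model: str, prompt: str, stream: bool = False):
    """URL and JSON body for /api/generate (single-turn, plain prompt)."""
    body = {"model": model, "prompt": prompt, "stream": stream}
    return f"{BASE}/api/generate", json.dumps(body)

url, body = chat_request("gpt-4o-mini", [{"role": "user", "content": "Say hello!"}])
print(url)  # http://localhost:11434/api/chat
```

Send the body with any HTTP client (curl, urllib, fetch); the proxy accepts the same Content-Type: application/json header shown in the examples above.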

Next Steps

  • API Reference - Explore all available API endpoints and parameters
  • Model Configuration - Customize available models with models.json
  • Environment Variables - Learn about all configuration options
  • Installation Methods - Explore alternative installation methods (npm, Docker, etc.)

Troubleshooting

No API keys configured

Make sure you have at least one API key set in your .env file:
OPENAI_API_KEY=your_key_here
# OR
GEMINI_API_KEY=your_key_here
# OR
OPENROUTER_API_KEY=your_key_here
The proxy requires at least one valid API key to start.
Model not found

The requested model doesn’t exist in your configuration. Check the available models:
curl http://localhost:11434/api/tags
Make sure you have the required API key for the model’s provider.
Missing provider API key

The model’s provider doesn’t have a valid API key configured. For example:
  • gpt-4o-mini requires OPENAI_API_KEY
  • gemini-2.5-flash requires GEMINI_API_KEY
  • deepseek-r1 requires OPENROUTER_API_KEY
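The mapping above can double as a quick preflight check before starting the proxy. A hedged sketch: the model-to-variable table comes from this guide, while missing_keys is an illustrative helper, not part of the proxy itself.

```python
import os

# Which environment variable each model's provider requires
# (taken from the list in this guide).
REQUIRED_KEY = {
    "gpt-4o-mini": "OPENAI_API_KEY",
    "gemini-2.5-flash": "GEMINI_API_KEY",
    "deepseek-r1": "OPENROUTER_API_KEY",
}

def missing_keys(models, env=os.environ):
    """Return the env vars that are required but unset for the given models."""
    needed = {REQUIRED_KEY[m] for m in models if m in REQUIRED_KEY}
    return sorted(k for k in needed if not env.get(k))

print(missing_keys(["gpt-4o-mini", "deepseek-r1"], env={}))
# ['OPENAI_API_KEY', 'OPENROUTER_API_KEY']
```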
Port already in use

If port 11434 is already in use (e.g., by Ollama itself), change the port:
.env
PORT=11435
Then update your client configuration to use http://localhost:11435.

Connection issues
  1. Verify the proxy is running: curl http://localhost:11434/api/version
  2. Check that no firewall is blocking the connection
  3. Ensure you’re using the correct URL format: http://localhost:11434 (not https)
