Ollama API Proxy

Use JetBrains AI Assistant with OpenAI, Google Gemini, Deepseek R1, and Kimi K2 — taking advantage of their free tier APIs.

What is Ollama API Proxy?

Ollama API Proxy is a translation layer that allows you to use commercial LLM providers with tools that support the Ollama API format. It runs on port 11434 by default and seamlessly converts requests from Ollama format to OpenAI, Google Gemini, or OpenRouter API formats. This enables you to leverage powerful commercial models — including free-tier offerings like Deepseek R1 and Kimi K2 — directly within JetBrains AI Assistant and other Ollama-compatible tools.

Key Features

Multi-Provider Support

Connect to OpenAI, Google Gemini, and OpenRouter — all through a single proxy server

Ollama Compatibility

Drop-in replacement for Ollama server with full API compatibility

Free Tier Access

Take advantage of free-tier APIs for models such as Deepseek R1 and Kimi K2

Streaming Support

Full support for streaming and non-streaming responses

Vision Support

Process images with vision-enabled models

Custom Models

Configure custom model mappings via JSON

How It Works

The proxy server translates between two API formats:
  1. Ollama API — Used by JetBrains AI Assistant and other tools
  2. Provider APIs — OpenAI, Google Gemini, OpenRouter
When JetBrains AI Assistant sends a request to http://localhost:11434/api/chat, the proxy:
  • Receives the request in Ollama format
  • Identifies the target provider based on the model name
  • Translates the request to the provider’s API format
  • Forwards the request to OpenAI, Gemini, or OpenRouter
  • Translates the response back to Ollama format
  • Returns it to JetBrains AI Assistant
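The round trip above can be exercised directly with curl. This is a minimal sketch: the request body is a plain Ollama-format chat call, the model name gpt-4o-mini is just one example mapping, and the proxy is assumed to be running locally on its default port.

```shell
# An Ollama-format chat request, as JetBrains AI Assistant would send it.
OLLAMA_REQ='{
  "model": "gpt-4o-mini",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": false
}'

# POST it to the proxy exactly as you would to a local Ollama server;
# the proxy forwards it to the matching provider and translates the reply.
curl -s http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d "$OLLAMA_REQ"
```

The response comes back in Ollama's chat format, so the caller never sees the provider-specific API underneath.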

Use Cases

Use commercial LLMs like GPT-4o, Gemini 2.5 Flash, or Deepseek R1 directly within your IDE for code completion, chat, and refactoring.
Access powerful models like Deepseek R1 and Kimi K2 through their free-tier APIs without paying for commercial subscriptions.
Test your application against multiple LLM providers without changing your integration code.
Send images to vision-capable models for analysis, OCR, and visual understanding tasks.
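The multi-provider testing use case reduces to changing only the model name; the endpoint and payload shape stay identical. A small sketch, assuming the three model names below are available through your configured mappings:

```shell
# Build one request body per provider-backed model; only "model" changes.
BODIES=""
for MODEL in gpt-4o-mini gemini-2.5-flash deepseek-r1; do
  BODIES="$BODIES{\"model\": \"$MODEL\", \"messages\": [{\"role\": \"user\", \"content\": \"ping\"}], \"stream\": false}
"
done

# Each body would be POSTed to http://localhost:11434/api/chat unchanged.
printf '%s' "$BODIES"
```

Because the integration code only ever talks Ollama format, switching providers is a one-word change per request.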

Quick Example

Start the proxy with your provider API keys, then point JetBrains AI Assistant at http://localhost:11434 and select a model:
# Start the proxy with environment variables
export OPENAI_API_KEY="sk-..."
export GEMINI_API_KEY="..."
export OPENROUTER_API_KEY="sk-or-..."

npx ollama-api-proxy
Then in JetBrains AI Assistant:
  1. Open Settings → Tools → AI Assistant
  2. Select Ollama as the provider
  3. Set server URL to http://localhost:11434
  4. Choose a model like gpt-4o-mini, gemini-2.5-flash, or deepseek-r1
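Before configuring the IDE, you can confirm the proxy is reachable. Since it is a drop-in Ollama replacement, it should answer Ollama's standard /api/tags model-listing endpoint; this sketch assumes the proxy is running on its default port:

```shell
# The server URL you will enter in JetBrains AI Assistant
BASE_URL="http://localhost:11434"

# List the models the proxy exposes (Ollama's standard endpoint).
# An empty or error response means the proxy is not running yet.
curl -s "$BASE_URL/api/tags"
```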

Next Steps

Quickstart

Get up and running in 5 minutes

Installation

Choose your installation method

Configuration

Configure API keys and custom models

API Reference

Explore available endpoints
