OpenAI API Proxy

This is only helpful for self-hosted users. If you’re using Khoj Cloud, you’re limited to our first-party models.

Khoj natively supports local LLMs available on HuggingFace in GGUF format. Using an OpenAI API proxy with Khoj may be useful for ease of setup, trying new models or using commercial LLMs via API.

Overview

Khoj can use any OpenAI API compatible server including:

Local Providers

Ollama
LMStudio (currently unsupported)
LiteLLM

Commercial Providers

HuggingFace
OpenRouter
And many more OpenAI-compatible services

Configuring this allows you to use non-standard, open or commercial, local or hosted LLM models for Khoj.

Combine them with Khoj to turn your favorite LLM into an AI agent. Allowing you to chat with your docs, find answers from the internet, build custom agents and run automations.

Specific Integrations

For specific integrations, see our dedicated setup guides:

Ollama

Run local open-source LLMs

LiteLLM

Unified proxy for multiple LLM providers

LMStudio

GUI for local LLMs (unsupported)

General Setup

For general instructions to setup Khoj with any OpenAI API proxy:

Start API Server

Start your preferred OpenAI API compatible app locally or get API keys from commercial AI model providers.

Create AI Model API

Create a new AI Model API on your Khoj admin panel:

Name: any name
Api Key: any string (or your actual API key for commercial providers)
Api Base Url: The URL of your OpenAI Compatible API

Create Chat Model

Create a new Chat Model on your Khoj admin panel:

Name: llama3 (replace with the name of your model)
Model Type: Openai
Ai Model Api: The AI Model API you created in step 2
Max prompt size: 2000 (replace with the max prompt size of your model)
Tokenizer: Do not set for OpenAI, Mistral, Llama3 based models

Select Model

Go to your config and select the model you just created in the chat model dropdown.

Configuration Examples

# Example: Using a local OpenAI-compatible server
API_BASE_URL=http://localhost:8000/v1
API_KEY=dummy-key

Ensure your API proxy supports the OpenAI API format, including the /v1/chat/completions endpoint. Not all proxies support all OpenAI features like JSON mode or function calling.

Google Vertex AI

Tailscale

⌘I

Get Started

Features

Clients

Data Sources

Advanced

OpenAI API Proxy

Overview

Local Providers

Commercial Providers

Specific Integrations

Ollama

LiteLLM

LMStudio

General Setup

Configuration Examples

Build docs developers (and LLMs) love

Get Started

Features

Clients

Data Sources

Advanced

​Overview

​Local Providers

​Commercial Providers

​Specific Integrations

Ollama

LiteLLM

LMStudio

​General Setup

​Configuration Examples

Build docs developers (and LLMs) love

Overview

Local Providers

Commercial Providers

Specific Integrations

General Setup

Configuration Examples