Ollama is a free, open-source tool that lets you run large language models locally on your computer. It’s perfect for users who want maximum privacy and offline access to AI capabilities.

Overview

  • Type: Local provider
  • Cost: Free
  • API Key Required: No
  • Installation Required: Yes
  • Official Website: https://ollama.com/

Prerequisites

1. Install Ollama

Download and install Ollama from ollama.com.

2. Pull a model

Open your terminal and pull a model from the Ollama library. For example:
ollama pull gemma2
Other popular models:
  • llama3.2 - Fast and capable
  • phi3 - Smaller, efficient model
  • mistral - High-quality general purpose
  • qwen2.5 - Excellent multilingual support

Setup in AI Providers

1. Select Ollama provider

In the AI Providers settings, click Create AI provider and select Ollama as the provider type.

2. Configure URL (optional)

The default URL is http://localhost:11434. Only change this if you’re running Ollama on a different host or port.
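Whatever base URL you enter, the model list is fetched from its /api/tags path (Ollama’s endpoint for enumerating locally pulled models). A minimal sketch, using only the Python standard library, of how a client might derive that endpoint from a custom base URL:

```python
from urllib.parse import urljoin

def models_endpoint(base_url: str) -> str:
    """Build the model-list URL from an Ollama base URL.

    /api/tags is Ollama's endpoint for listing locally pulled models;
    base_url is whatever you entered in the provider settings.
    """
    # urljoin drops the last path segment unless the base ends with "/"
    if not base_url.endswith("/"):
        base_url += "/"
    return urljoin(base_url, "api/tags")

print(models_endpoint("http://localhost:11434"))
# http://localhost:11434/api/tags
print(models_endpoint("http://192.168.1.50:11434"))
# http://192.168.1.50:11434/api/tags
```

The same joining applies to a remote host: point the provider at http://&lt;host&gt;:11434 and the rest of the path is unchanged.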
3. Select model

Click the refresh button next to the model field to fetch available models, then select the model you pulled (e.g., gemma2).

4. Test the provider

Click Test to verify the connection is working properly.
Recommended Models

| Model | Size | Best For |
| --- | --- | --- |
| gemma2:2b | ~1.6GB | Fast responses, lower RAM usage |
| llama3.2 | ~2GB | Balanced performance |
| qwen2.5:7b | ~4.7GB | High quality, multilingual |
| mistral | ~4.1GB | General purpose tasks |
| llama3.1:70b | ~40GB | Best quality (requires powerful hardware) |

Troubleshooting

Streaming Issues

If you experience problems with streaming completions, set the OLLAMA_ORIGINS environment variable to allow cross-origin requests:

macOS:
launchctl setenv OLLAMA_ORIGINS "*"

Linux:
export OLLAMA_ORIGINS="*"
Add this to your ~/.bashrc or ~/.zshrc to make it permanent.

Windows:
setx OLLAMA_ORIGINS "*"
For more details, see the Ollama FAQ.

Connection Refused

If you see “connection refused” errors, make sure Ollama is running. You can start it by running ollama serve in your terminal or by opening the Ollama application.

Model Not Found

If the model doesn’t appear in the list:
  1. Make sure you’ve pulled the model using ollama pull <model-name>
  2. Click the refresh button in AI Providers settings
  3. Verify Ollama is running by visiting http://localhost:11434 in your browser
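Visiting http://localhost:11434/api/tags returns a JSON object listing every model Ollama has pulled, which is what the refresh button consumes. A sketch of extracting the model names from such a response — the sample payload below mirrors the shape of /api/tags output, but the model entries are placeholders:

```python
import json

def model_names(tags_json: str) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]

# Sample body shaped like a real /api/tags response (models are illustrative)
sample = '{"models": [{"name": "gemma2:latest"}, {"name": "llama3.2:latest"}]}'
print(model_names(sample))
# ['gemma2:latest', 'llama3.2:latest']
```

If the name you expect is missing from this list, the pull did not complete; re-run ollama pull and refresh again.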

Open WebUI Integration

Ollama works seamlessly with Open WebUI. If you’re using Open WebUI, select Ollama (Open WebUI) as your provider type instead and configure it with your Open WebUI URL.

Advanced Configuration

You can pass custom options to Ollama models by using the options parameter in API calls. Common options include:
  • temperature - Controls randomness (0.0 to 2.0)
  • num_ctx - Context window size (automatically optimized by AI Providers)
  • top_p - Nucleus sampling parameter
  • repeat_penalty - Penalize repetition
AI Providers automatically optimizes the context window (num_ctx) based on your input length, so you typically don’t need to set this manually.
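In a raw request against Ollama’s /api/generate endpoint, these parameters travel in a nested options object. A sketch of such a request body — the model, prompt, and values are illustrative, and num_ctx is deliberately left out since AI Providers sets it for you:

```python
import json

# Illustrative request body for Ollama's /api/generate endpoint;
# the "options" object carries the tuning parameters listed above.
payload = {
    "model": "gemma2",
    "prompt": "Why is the sky blue?",
    "stream": False,
    "options": {
        "temperature": 0.7,     # randomness (0.0 to 2.0)
        "top_p": 0.9,           # nucleus sampling
        "repeat_penalty": 1.1,  # penalize repetition
        # "num_ctx" omitted: AI Providers sizes the context window itself
    },
}
body = json.dumps(payload)
print(body)
```

Sending it is a plain POST of this JSON body to http://localhost:11434/api/generate.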
