Overview
- Type: Local provider
- Cost: Free
- API Key Required: No
- Installation Required: Yes
- Official Website: https://ollama.com/
Prerequisites
Install Ollama
Download and install Ollama from ollama.com.
Pull a model
Open your terminal and pull a model from the Ollama library, e.g. `gemma2`. Other popular models:
- `llama3.2` - Fast and capable
- `phi3` - Smaller, efficient model
- `mistral` - High-quality general purpose
- `qwen2.5` - Excellent multilingual support
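A minimal sketch of pulling and verifying a model from the terminal, assuming a standard Ollama install on the default PATH (`gemma2` is used here because the setup steps below reference it):

```shell
# Download the gemma2 model from the Ollama library
ollama pull gemma2

# List locally available models to confirm the download
ollama list
```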
Setup in AI Providers
Select Ollama provider
In the AI Providers settings, click Create AI provider and select
Ollama as the provider type.
Configure URL (optional)
The default URL is
`http://localhost:11434`. Only change this if you’re running Ollama on a different host or port.
Select model
Click the refresh button next to the model field to fetch available models, then select the model you pulled (e.g.,
`gemma2`).
Recommended Models
| Model | Size | Best For |
|---|---|---|
| `gemma2:2b` | ~1.6GB | Fast responses, lower RAM usage |
| `llama3.2` | ~2GB | Balanced performance |
| `qwen2.5:7b` | ~4.7GB | High quality, multilingual |
| `mistral` | ~4.1GB | General purpose tasks |
| `llama3.1:70b` | ~40GB | Best quality (requires powerful hardware) |
Troubleshooting
Streaming Issues
If you experience problems with streaming completions, set the `OLLAMA_ORIGINS` environment variable to allow cross-origin requests:
macOS:
Set the variable in your shell before starting Ollama, and add it to ~/.bashrc or ~/.zshrc to make it permanent.
Windows:
Set `OLLAMA_ORIGINS` as a user environment variable, then restart Ollama.
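As a sketch covering both platforms (assuming the permissive value `"*"` is acceptable; you may want to restrict it to your app’s origin instead):

```shell
# macOS/Linux: allow cross-origin requests, then start the server
export OLLAMA_ORIGINS="*"
ollama serve

# Windows (Command Prompt): persist the variable for the current user,
# then restart Ollama so it takes effect
setx OLLAMA_ORIGINS "*"
```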
Connection Refused
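If requests to Ollama are refused, a quick first check (a sketch assuming the default port 11434) is to probe the server directly:

```shell
# Probe the default Ollama endpoint; a running server replies "Ollama is running"
curl http://localhost:11434

# If the connection is refused, start the server manually
ollama serve
```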
Model Not Found
If the model doesn’t appear in the list:
- Make sure you’ve pulled the model using `ollama pull <model-name>`
- Click the refresh button in AI Providers settings
- Verify Ollama is running by visiting `http://localhost:11434` in your browser
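The same check can be scripted. Below is a minimal sketch using only the Python standard library against the default endpoint (`/api/tags` is Ollama’s model-listing route; the function names are illustrative):

```python
import json
import urllib.request

def model_names(tags_json: str) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]

def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Fetch the locally pulled models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(resp.read().decode("utf-8"))

if __name__ == "__main__":
    # Prints something like ['gemma2:latest', 'llama3.2:latest']
    print(list_local_models())
```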
Open WebUI Integration
Ollama works seamlessly with Open WebUI. If you’re using Open WebUI, select Ollama (Open WebUI) as your provider type instead and configure it with your Open WebUI URL.
Advanced Configuration
You can pass custom options to Ollama models by using the options parameter in API calls. Common options include:
- `temperature` - Controls randomness (0.0 to 2.0)
- `num_ctx` - Context window size (automatically optimized by AI Providers)
- `top_p` - Nucleus sampling parameter
- `repeat_penalty` - Penalize repetition
AI Providers automatically optimizes the context window (`num_ctx`) based on your input length, so you typically don’t need to set this manually.
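As an illustration, these options map onto the `options` object of Ollama’s `/api/generate` request body. The helper below is a hypothetical sketch (the function name is not part of AI Providers); only `temperature`, `top_p`, `repeat_penalty`, and `num_ctx` come from the list above:

```python
import json

def build_generate_request(model: str, prompt: str, **options) -> dict:
    """Assemble an Ollama /api/generate request body with custom options.

    `options` accepts keys such as temperature, top_p, repeat_penalty,
    and num_ctx (which AI Providers normally tunes for you automatically).
    """
    return {"model": model, "prompt": prompt, "options": options}

payload = build_generate_request(
    "gemma2",
    "Summarize this note.",
    temperature=0.7,    # randomness, 0.0 to 2.0
    top_p=0.9,          # nucleus sampling
    repeat_penalty=1.1, # discourage repetition
)
print(json.dumps(payload, indent=2))
```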