The OllamaEmbeddings class provides integration with locally hosted Ollama models for privacy-focused, offline embedding generation.
Installation
Install Ollama
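On Linux, Ollama can typically be installed with the official install script; macOS and Windows installers are available from ollama.com. A sketch (check the Ollama docs for your platform):

```shell
# Install Ollama on Linux (macOS/Windows: download the installer from https://ollama.com)
curl -fsSL https://ollama.com/install.sh | sh
```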
Install and set up Ollama on your system; it must be running before you can generate embeddings.

Pull an embedding model
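An embedding model has to be pulled before it can be used; nomic-embed-text below is just one common choice:

```shell
# Download an embedding model into the local Ollama model store
ollama pull nomic-embed-text
```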
Install LangChain integration
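The integration lives in the langchain-ollama package:

```shell
pip install -U langchain-ollama
```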
Usage
Start Ollama server
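If Ollama is not already running as a background service, the server can be started manually; by default it listens on http://localhost:11434:

```shell
# Start the Ollama server in the foreground
ollama serve
```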
Basic usage
Embed single text
Embed multiple texts
Async usage
Configuration
Supported models
Popular embedding models on Ollama:

- llama3: Meta's Llama 3 model
- nomic-embed-text: Nomic's text embedding model
- mxbai-embed-large: MixedBread.ai's large embedding model
- all-minilm: Sentence-transformers MiniLM model
Custom base URL
Use base_url to connect to Ollama running on a different host.

Authentication
For Ollama behind an authenticating proxy, pass the required headers through to the underlying HTTP client.

Model parameters
Use model parameters to configure sampling behavior and performance.

Keep-alive
The keep_alive setting controls how long the model stays loaded in memory after a request.

Validate model on init
Enable validation to check that the model exists locally before it is used.

Parameters
- model: Name of the Ollama model to use.
- base_url: Base URL where Ollama is hosted. Defaults to the Ollama client default (usually http://localhost:11434).
- validate_model_on_init: Whether to validate that the model exists locally in Ollama on initialization.
- temperature: Sampling temperature. Higher values make output more creative.
- num_ctx: Size of the context window.
- num_gpu: Number of GPUs to use. Defaults to 1 on macOS (for Metal support), 0 to disable.
- num_thread: Number of threads for computation. Defaults to a value chosen for optimal performance on the system.
- keep_alive: How long (in seconds) the model stays loaded in memory. Defaults to 300 seconds (5 minutes).
- client_kwargs: Additional kwargs passed to the httpx client (e.g., headers).