Overview
Adist integrates with Ollama to provide AI-driven code analysis using locally run language models. This option is completely free, private, and doesn't require an internet connection for inference.

Benefits
- Free: No API costs; run unlimited queries
- Private: Your code never leaves your machine
- Offline: Works without internet (after initial setup)
Setup
Install Ollama
Download and install Ollama from ollama.com/download. Installers are available for macOS, Linux, and Windows. On macOS, download the installer from the website and run it.
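On Linux, Ollama publishes an install script on its site; a typical install looks like the one-liner below (review any script before piping it to a shell):

```shell
# Official Linux install script from ollama.com
curl -fsSL https://ollama.com/install.sh | sh
```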
Pull a Model
Download a language model. Popular options include:
- Llama 3 (recommended): offers excellent code understanding and generation
- CodeLlama
- Mistral
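Models are downloaded with Ollama's pull command, for example:

```shell
ollama pull llama3      # recommended general-purpose model
ollama pull codellama   # code-specialized
ollama pull mistral     # general purpose
```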
Start Ollama Service
Ensure Ollama is running. On most systems, Ollama runs as a background service automatically after installation.
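If the service is not running, you can start the server manually and confirm it responds on the default port (the root endpoint replies "Ollama is running" when the server is up):

```shell
ollama serve &                      # start the server in the background
sleep 1
curl -sf http://localhost:11434/    # exits non-zero if the server is down
```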
Configure Adist
Run the LLM configuration command, adist llm-config, and select:
- Ollama as your provider
- Your preferred model from the list of installed models
- Optionally customize the API URL (default: http://localhost:11434)
Features
Local Model Support
The Ollama service can use any locally installed model.
Context Caching
The Ollama service includes intelligent context caching:
- Topic Identification: Automatically identifies query topics
- Cache Duration: Contexts are cached for 30 minutes
- Cache Cleanup: Old entries are automatically removed
Context merging is simpler in Ollama compared to cloud providers due to smaller context windows.
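The caching behavior described above can be sketched as a topic-keyed cache with a 30-minute expiry. This is an illustration only, not adist's actual implementation (which lives in its TypeScript source):

```shell
#!/bin/sh
# Minimal sketch of a topic-keyed context cache: entries live in a
# temp directory, expire after 30 minutes, and stale files are removed.
CACHE_DIR=$(mktemp -d)
TTL_MINUTES=30

cache_put() {                 # cache_put <topic> <context>
  printf '%s' "$2" > "$CACHE_DIR/$1"
}

cache_get() {                 # prints nothing if the entry is missing/expired
  f="$CACHE_DIR/$1"
  [ -f "$f" ] || return 1
  # find -mmin -30 matches files modified within the last 30 minutes
  if [ -n "$(find "$f" -mmin "-$TTL_MINUTES")" ]; then
    cat "$f"                  # fresh: return the cached context
  else
    rm -f "$f"                # stale: remove the entry (cache cleanup)
    return 1
  fi
}

cache_put auth "files and summaries related to authentication"
cache_get auth                # prints the cached context (entry is fresh)
```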
Query Complexity Estimation
Queries are analyzed and categorized as:
- Low Complexity: Simple questions (< 8 words, no technical terms)
- Medium Complexity: Standard questions (8-15 words or basic technical terms)
- High Complexity: Complex questions (> 15 words, code snippets, comparisons)
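The word-count half of these thresholds can be sketched as below; this is hypothetical illustrative code, and the real heuristic also weighs technical terms and code snippets, which this omits:

```shell
# Classify a query as low/medium/high complexity by word count alone,
# using the thresholds listed above (< 8, 8-15, > 15 words).
estimate_complexity() {
  words=$(( $(printf '%s\n' "$1" | wc -w) ))
  if [ "$words" -lt 8 ]; then
    echo low
  elif [ "$words" -le 15 ]; then
    echo medium
  else
    echo high
  fi
}

estimate_complexity "What does this project do"   # low (5 words)
estimate_complexity "How does the indexer handle large binary files inside nested directories"   # medium (11 words)
```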
Streaming Support
Ollama supports real-time streaming responses.
Code Reference
The Ollama service is implemented in src/utils/ollama.ts (line 20).
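The availability and model-listing checks in this service presumably map onto Ollama's documented HTTP API; with a running server, you can exercise the same endpoints directly with curl:

```shell
# Reachability check: the root endpoint answers "Ollama is running";
# -f makes curl exit non-zero when the server is down.
curl -sf http://localhost:11434/

# Model listing: /api/tags returns the installed models as JSON.
curl -s http://localhost:11434/api/tags
```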
Key Methods
- isAvailable: Checks if Ollama is running
- listModels: Returns all locally installed models
- summarizeFile: Generates summaries of individual files
- generateOverallSummary: Creates a project overview from file summaries
- queryProject: Answers questions about your project
- chatWithProject: Enables conversational interactions

Configuration Options
Context Limits
- Maximum Context Length: 30,000 characters (lower than cloud providers)
- Cache Timeout: 30 minutes
- Dynamic Adjustment: Context size varies based on query complexity
Custom API URL
If you're running Ollama on a different host or port, point adist at it by updating the API URL in adist llm-config.
Model Selection
Different models have different characteristics, ranging from small models (< 10GB) and medium models (10-40GB) to code-specialized variants. Among the smaller options:
- phi3: Fast, good for simple queries
- llama3:8b: Balanced performance
- mistral: General purpose
Performance Optimization
Hardware Requirements
Requirements scale from minimum through recommended to optimal setups. At minimum:
- RAM: 8GB
- GPU: Optional (CPU-only works)
- Storage: 5GB for small models
GPU Acceleration
Ollama automatically uses GPU acceleration when available:
- NVIDIA GPUs: CUDA support (recommended)
- Apple Silicon: Metal support
- AMD GPUs: ROCm support (Linux)
GPU acceleration can be 10-100x faster than CPU-only inference.
Cost Comparison
Ollama is completely free:
- API Costs: $0 (no API calls)
- Inference: Free unlimited usage
- Storage: Only disk space for models

For comparison with cloud providers, consider 1000 queries:
- Ollama: $0
- Anthropic (Claude Sonnet): ~$3-10
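A back-of-envelope check of a figure in that range, assuming ~1000 input and ~200 output tokens per query at Claude Sonnet list prices of $3 / $15 per million input / output tokens (these per-query token counts are illustrative assumptions, not from the adist docs):

```shell
# Estimate the cloud cost of 1000 queries under the assumptions above.
queries=1000; in_tok=1000; out_tok=200
cost=$(awk -v q="$queries" -v i="$in_tok" -v o="$out_tok" \
  'BEGIN { printf "%.2f", q * (i*3 + o*15) / 1e6 }')
echo "$cost"   # 6.00 (dollars), inside the ~$3-10 range quoted above
```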
Best Practices
Model Selection
- Start with llama3 for balanced performance
- Use codellama for code-heavy projects
- Try smaller models first if hardware is limited
- Experiment with different models for your use case
Troubleshooting
Ollama Not Running
If you see connection errors, make sure the Ollama service is installed and started, then retry.
No Models Available
If no models appear during configuration, pull at least one model first, then rerun the configuration.
Slow Responses
- Use a smaller model (e.g., llama3:8b instead of llama3:70b)
- Enable GPU acceleration
- Reduce context complexity
- Close other applications
Out of Memory
- Switch to a smaller model
- Reduce the number of concurrent queries
- Increase system swap space
- Use CPU instead of GPU if VRAM is limited
Poor Response Quality
- Try a larger or specialized model
- Ensure project is properly indexed
- Use more specific queries
- Generate file summaries for better context
Privacy and Security
Privacy Benefits:
- Code never sent to external APIs
- No data collection or telemetry
- Complete control over your data
- Suitable for sensitive or proprietary code
Advanced Configuration
Custom Model Parameters
You can customize model behavior by creating a Modelfile, then building a custom model (e.g., my-code-assistant) and selecting it in adist llm-config.
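A sketch of what that might look like; the model name my-code-assistant comes from the text above, while the base model and parameter values here are illustrative:

```shell
# Write a Modelfile, then build a custom model from it with Ollama.
cat > Modelfile <<'EOF'
FROM llama3
PARAMETER temperature 0.2
SYSTEM You are a concise assistant for answering questions about code.
EOF

ollama create my-code-assistant -f Modelfile
ollama list    # the new model should now appear in the list
```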
Running on Remote Server
To use Ollama running on another machine:
- Configure Ollama to accept remote connections
- Update the API URL in adist llm-config
- Ensure proper network security (VPN, firewall, etc.)
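On the remote machine, Ollama's OLLAMA_HOST environment variable (an Ollama setting, not part of adist) controls the bind address, so accepting remote connections can look like:

```shell
# Listen on all interfaces instead of loopback only
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```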
Next Steps
- Start Querying: Ask questions about your codebase
- Start Chatting: Have conversations about your project