Every provider in MoFA implements a common `LLMProvider` trait. This guide shows you how to integrate and configure different providers.
## Supported Providers
MoFA natively supports:

- OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5)
- Anthropic (Claude 3.5 Sonnet, Claude 3 Opus, Claude 3 Haiku)
- Ollama (Local models: Llama 3, Mistral, etc.)
- Google Gemini (Gemini Pro, Gemini Flash)
- Any OpenAI-compatible API (vLLM, LocalAI, OpenRouter)
## OpenAI Provider

The most common provider for production use.

### Basic Setup

### Environment Variables

`.env`
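A minimal `.env` for the OpenAI provider; the key value below is a placeholder, and any variable beyond the standard `OPENAI_API_KEY` would be an assumption:

```bash
OPENAI_API_KEY=sk-your-key-here
```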
### Available Models

| Model | Context Window | Best For |
|---|---|---|
| `gpt-4o` | 128K tokens | Most capable, vision support |
| `gpt-4o-mini` | 128K tokens | Faster, cost-effective |
| `gpt-4-turbo` | 128K tokens | High quality with vision |
| `gpt-3.5-turbo` | 16K tokens | Fast and economical |
### Usage Example
## Anthropic Provider

Claude models excel at long-form reasoning and analysis.

### Basic Setup

### Environment Variables

`.env`
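A minimal `.env` for the Anthropic provider (the key value is a placeholder):

```bash
ANTHROPIC_API_KEY=sk-ant-your-key-here
```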
### Available Models

| Model | Context Window | Best For |
|---|---|---|
| `claude-3.5-sonnet-20241022` | 200K tokens | Best overall performance |
| `claude-3-opus-20240229` | 200K tokens | Complex reasoning tasks |
| `claude-3-haiku-20240307` | 200K tokens | Fast, cost-effective |
### Helper Function
## Ollama Provider

Run models locally without API costs.

### Prerequisites

- Install Ollama: https://ollama.ai
- Pull a model, e.g. `ollama pull llama3.2`
### Basic Setup

### Environment Variables

`.env`
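A minimal `.env` for Ollama. `OLLAMA_HOST` is the variable the Ollama CLI itself honors, and `11434` is Ollama's default port; no API key is needed for a local server:

```bash
OLLAMA_HOST=http://localhost:11434
```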
### Popular Models

| Model | Size | Best For |
|---|---|---|
| `llama3.2` | 3B/1B | Fast local inference |
| `mistral` | 7B | General purpose |
| `codellama` | 7B-34B | Code generation |
| `qwen2.5` | 0.5B-72B | Multilingual tasks |
### Helper Function
## Google Gemini Provider

Access Google’s Gemini models.

### Basic Setup

### Environment Variables

`.env`
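A minimal `.env` for Gemini. `GEMINI_API_KEY` is the variable Google's own SDKs commonly read; the value is a placeholder:

```bash
GEMINI_API_KEY=your-key-here
```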
### Available Models

| Model | Context Window | Best For |
|---|---|---|
| `gemini-1.5-pro-latest` | 1M tokens | Long context tasks |
| `gemini-1.5-flash-latest` | 1M tokens | Fast, cost-effective |
### Helper Function
## OpenAI-Compatible Providers

Use any OpenAI-compatible API (vLLM, LocalAI, OpenRouter).

### vLLM Example

### OpenRouter Example

`.env`
## Advanced LLM Client Usage
### Streaming Responses
### Multi-turn Conversation
### Tool Calling
### JSON Mode
## Provider Comparison
| Feature | OpenAI | Anthropic | Ollama | Gemini |
|---|---|---|---|---|
| Streaming | ✅ | ✅ | ✅ | ✅ |
| Tools | ✅ | ✅ | ✅ | ⚠️ Limited |
| Vision | ✅ | ✅ | ✅ | ✅ |
| Cost | $$$ | $$$ | Free | $$ |
| Privacy | Cloud | Cloud | Local | Cloud |
| Max Context | 128K | 200K | Varies | 1M |
## Best Practices
### Use environment variables for API keys
Never hardcode API keys in your source code. Always use environment variables or a secure secret management system.
### Handle errors gracefully
LLM calls can fail for many reasons (network issues, rate limits, invalid requests). Always handle errors properly.
### Choose the right model for your use case
- GPT-4o: Best for complex reasoning, vision tasks
- GPT-4o-mini: Fast, cost-effective for simple tasks
- Claude 3.5 Sonnet: Excellent for long-form content, analysis
- Ollama: Local inference, no API costs, privacy-focused
- Gemini Flash: Very long context windows (1M tokens)
### Monitor token usage and costs
Track token consumption to optimize costs:
## Next Steps

- **Agent Lifecycle**: Learn about agent state management and lifecycle hooks
- **Capabilities & State**: Master agent capabilities and state patterns