Overview
The LLMProvider trait defines the standard interface for integrating Large Language Model APIs into the MoFA framework. All LLM providers (OpenAI, Anthropic, Ollama, etc.) implement this trait to provide a unified API surface.
Trait Definition
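The full trait definition lives in the MoFA source. As a rough, self-contained sketch reconstructed from the method descriptions below (type names, field names, and signatures are assumptions, and the real core operations are async):

```rust
// Simplified sketch of the LLMProvider trait, reconstructed from the method
// descriptions on this page. The real MoFA definition is async (the core
// operations return futures/streams) and its exact names may differ.

// Placeholder request/response types, assumed for illustration only.
pub struct ChatCompletionRequest { pub model: String }
pub struct ChatCompletionResponse { pub content: String }
pub struct EmbeddingRequest { pub model: String, pub input: Vec<String> }
pub struct EmbeddingResponse { pub embeddings: Vec<Vec<f32>> }
pub struct ModelInfo { pub id: String, pub context_window: u32 }
#[derive(Debug)]
pub struct LLMError(pub String);

pub trait LLMProvider: Send + Sync {
    // Provider metadata
    fn name(&self) -> &str;
    fn default_model(&self) -> &str;
    fn supported_models(&self) -> Vec<String>;

    // Capability detection
    fn supports_streaming(&self) -> bool;
    fn supports_tools(&self) -> bool;
    fn supports_vision(&self) -> bool;
    fn supports_embeddings(&self) -> bool;

    // Core operations (async and streaming in the real trait; shown
    // synchronously here so the sketch compiles without an async runtime)
    fn chat_completion(&self, request: ChatCompletionRequest)
        -> Result<ChatCompletionResponse, LLMError>;
    fn embeddings(&self, request: EmbeddingRequest)
        -> Result<EmbeddingResponse, LLMError>;
    fn health_check(&self) -> bool;
    fn model_info(&self, model: &str) -> Result<ModelInfo, LLMError>;
}
```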
Methods
Provider Metadata
Returns the provider name identifier (e.g., “openai”, “anthropic”, “ollama”)
Returns the default model identifier used when no model is specified
Returns a list of model identifiers supported by this provider
Capability Detection
Returns true if the provider supports streaming responses
Returns true if the provider supports function/tool calling
Returns true if the provider supports vision/image inputs
Returns true if the provider supports text embeddings
Core Operations
Sends a chat completion request and returns the complete response
Parameters:
request: Chat completion request with messages, model, and parameters
Returns:
ChatCompletionResponse: Complete response with choices and usage data
Sends a chat completion request and returns a stream of response chunks
Parameters:
request: Chat completion request with stream enabled
Returns:
ChatStream: Stream of ChatCompletionChunk items
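Consuming the stream usually means accumulating the content delta from each chunk. The sketch below models the stream as a plain iterator so it stays dependency-free; the real ChatStream is asynchronous (consumed with `.next().await`), and the chunk's `delta` field is an assumed shape:

```rust
// Accumulate streamed content deltas into the full response text.
// ChatCompletionChunk and its `delta` field are assumptions; the real MoFA
// ChatStream is an async stream, not an iterator.
#[derive(Debug)]
pub struct ChatCompletionChunk { pub delta: Option<String> }
#[derive(Debug)]
pub struct LLMError(pub String);

pub fn collect_stream<I>(chunks: I) -> Result<String, LLMError>
where
    I: IntoIterator<Item = Result<ChatCompletionChunk, LLMError>>,
{
    let mut text = String::new();
    for chunk in chunks {
        // Propagate the first transport/API error; otherwise append the delta.
        if let Some(delta) = chunk?.delta {
            text.push_str(&delta);
        }
    }
    Ok(text)
}
```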
Generates embeddings for input text(s)
Parameters:
request: Embedding request with model and input text(s)
Returns:
EmbeddingResponse: Vector embeddings and usage data
Checks if the provider API is accessible and responding
Returns:
bool: true if healthy, false otherwise
Retrieves metadata about a specific model
Parameters:
model: Model identifier
Returns:
ModelInfo: Model capabilities, context window, and metadata
Type Definitions
ModelInfo
ModelCapabilities
ChatStream
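Plausible shapes for these types, inferred from the method descriptions above; the field names and the iterator-based ChatStream alias are illustrative assumptions, not the actual MoFA definitions:

```rust
// Assumed sketches of the supporting types; the real MoFA definitions may
// differ in fields and names.
#[derive(Debug, Clone)]
pub struct ModelCapabilities {
    pub streaming: bool,
    pub tools: bool,
    pub vision: bool,
    pub embeddings: bool,
}

#[derive(Debug, Clone)]
pub struct ModelInfo {
    pub id: String,
    pub context_window: u32, // maximum tokens in the model's context
    pub capabilities: ModelCapabilities,
}

// In the real trait this is an async stream of chunks (e.g. a boxed
// futures::Stream); sketched here as a boxed iterator to stay
// dependency-free.
pub struct ChatCompletionChunk { pub delta: Option<String> }
pub struct LLMError(pub String);
pub type ChatStream =
    Box<dyn Iterator<Item = Result<ChatCompletionChunk, LLMError>> + Send>;
```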
Implementing a Custom Provider
Basic Implementation
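A minimal provider implementation might look like the following. The trait subset and request/response types are simplified stand-ins (synchronous, fewer methods) so the example compiles on its own, and MyProvider is a hypothetical name:

```rust
// Minimal stand-in for the real trait so this example is self-contained;
// the actual MoFA trait is async and has more methods.
pub struct ChatCompletionRequest { pub model: String, pub prompt: String }
pub struct ChatCompletionResponse { pub content: String }
#[derive(Debug)]
pub struct LLMError(pub String);

pub trait LLMProvider: Send + Sync {
    fn name(&self) -> &str;
    fn default_model(&self) -> &str;
    fn supports_streaming(&self) -> bool;
    fn chat_completion(&self, request: ChatCompletionRequest)
        -> Result<ChatCompletionResponse, LLMError>;
}

// Hypothetical custom provider backed by some in-house endpoint.
pub struct MyProvider {
    pub api_key: String,
}

impl LLMProvider for MyProvider {
    fn name(&self) -> &str { "my-provider" }
    fn default_model(&self) -> &str { "my-model-v1" }
    fn supports_streaming(&self) -> bool { false } // report capabilities accurately
    fn chat_completion(&self, request: ChatCompletionRequest)
        -> Result<ChatCompletionResponse, LLMError> {
        // A real implementation would call the provider's API here and map
        // transport/API failures into LLMError.
        Ok(ChatCompletionResponse {
            content: format!("[{}] echo: {}", request.model, request.prompt),
        })
    }
}
```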
Using the Custom Provider
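Once implemented, a provider is typically used behind a trait object so the rest of the framework stays provider-agnostic. A small sketch, with minimal stand-in types so it compiles alone:

```rust
// Minimal stand-ins; the real MoFA trait and provider have more methods.
pub trait LLMProvider: Send + Sync {
    fn name(&self) -> &str;
}

pub struct MyProvider; // hypothetical custom provider

impl LLMProvider for MyProvider {
    fn name(&self) -> &str { "my-provider" }
}

// Code that accepts &dyn LLMProvider (or Box<dyn LLMProvider>) works with
// any provider implementation without knowing its concrete type.
pub fn describe(provider: &dyn LLMProvider) -> String {
    format!("using provider: {}", provider.name())
}
```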
Built-in Providers
MoFA includes several built-in provider implementations:
- OpenAI: GPT-4, GPT-3.5, with vision and tools support
- Anthropic: Claude 3 models with streaming
- Ollama: Local models via OpenAI-compatible API
- Google Gemini: Google’s models (when enabled)
Error Handling
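Providers surface failures through an LLMError type. The variant names below are assumptions meant to illustrate the categories a provider usually maps to (authentication, rate limiting, bad requests, transport and upstream failures):

```rust
// Assumed LLMError shape for illustration; the real MoFA enum's variants
// may differ.
#[derive(Debug)]
pub enum LLMError {
    Authentication(String),
    RateLimited { retry_after_secs: Option<u64> },
    InvalidRequest(String),
    Network(String),
    Provider { status: u16, message: String },
}

// Map an HTTP status from a provider API into an error category.
pub fn categorize(status: u16, body: &str) -> Option<LLMError> {
    match status {
        200..=299 => None,
        401 | 403 => Some(LLMError::Authentication(body.to_string())),
        429 => Some(LLMError::RateLimited { retry_after_secs: None }),
        400 | 422 => Some(LLMError::InvalidRequest(body.to_string())),
        s => Some(LLMError::Provider { status: s, message: body.to_string() }),
    }
}
```

Mapping errors at the HTTP boundary like this keeps retry logic (e.g. on a rate-limit variant) independent of any one provider's error format.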
Best Practices
- Thread Safety: All providers must be Send + Sync for concurrent usage
- Error Categorization: Map provider-specific errors to appropriate LLMError variants
- Timeout Handling: Implement reasonable timeouts for API calls
- Retry Logic: Consider implementing retry logic for transient failures
- Capability Detection: Accurately report supported capabilities
- Resource Cleanup: Properly handle connection pooling and cleanup
Related Types
- LLMClient - High-level client for using providers
- ChatCompletionRequest - Request format
- ChatCompletionResponse - Response format
- Tool - Tool/function calling definitions