Overview
The AnthropicProvider implements the LLMProvider trait for Anthropic’s Claude models using the Messages API. It supports the Claude 3.5 and Claude 3 series (plus legacy Claude 2 models) and provides streaming responses.
Configuration
AnthropicConfig
pub struct AnthropicConfig {
pub api_key: String,
pub base_url: String,
pub version: String,
pub default_model: String,
pub default_max_tokens: u32,
pub default_temperature: f32,
pub timeout_secs: u64,
}
| Field | Type | Default | Description |
|---|---|---|---|
| base_url | String | "https://api.anthropic.com" | API base URL |
| version | String | "2023-06-01" | API version header value |
| default_model | String | "claude-3.5-sonnet-20241022" | Default model to use |
| default_max_tokens | u32 | 4096 | Default maximum output tokens (required by the Anthropic API) |
| default_temperature | f32 | | Default sampling temperature |
| timeout_secs | u64 | | Request timeout in seconds |
Creating a Provider
Basic Usage
use mofa_foundation::llm::AnthropicProvider;
let provider = AnthropicProvider::new("sk-ant-xxx");
From Environment
// Reads ANTHROPIC_API_KEY, ANTHROPIC_MODEL, ANTHROPIC_BASE_URL, ANTHROPIC_VERSION
let provider = AnthropicProvider::from_env();
With Configuration
use mofa_foundation::llm::{AnthropicProvider, AnthropicConfig};
let config = AnthropicConfig::new("sk-ant-xxx")
.with_model("claude-3-opus-20240229")
.with_temperature(0.8)
.with_max_tokens(8192)
.with_timeout(120);
let provider = AnthropicProvider::with_config(config);
Supported Models
Claude 3.5 Sonnet
- claude-3.5-sonnet-20241022: Latest and most capable Claude 3.5 model
- claude-3.5-sonnet-20240620: Previous version
Claude 3 Series
- claude-3-opus-20240229: Most capable Claude 3 model
- claude-3-sonnet-20240229: Balanced performance and speed
- claude-3-haiku-20240307: Fastest and most compact
Claude 2 Series (Legacy)
- claude-2.1: Extended context window
- claude-2.0: Original Claude 2
Model Capabilities
The current implementation supports streaming, but tool calling and vision are not yet exposed. These capabilities are planned for future releases.
provider.supports_streaming(); // true
provider.supports_tools(); // false (coming soon)
provider.supports_vision(); // false (coming soon)
provider.supports_embedding(); // false
Features
Streaming Support
The provider implements full SSE (Server-Sent Events) streaming:
use futures::StreamExt;
use mofa_foundation::llm::{LLMClient, AnthropicProvider};
use std::sync::Arc;
let provider = Arc::new(AnthropicProvider::new("sk-ant-xxx"));
let client = LLMClient::new(provider);
let mut stream = client.chat()
.system("You are a creative storyteller.")
.user("Tell me a short story about Rust")
.send_stream()
.await?;
while let Some(chunk) = stream.next().await {
match chunk {
Ok(chunk) => {
if let Some(content) = chunk.content() {
print!("{}", content);
}
if let Some(usage) = chunk.usage {
println!("\nTokens: {} prompt + {} completion",
usage.prompt_tokens, usage.completion_tokens);
}
}
Err(e) => eprintln!("Stream error: {}", e),
}
}
SSE Event Types
The provider correctly parses all Anthropic SSE events:
- message_start: Initial message metadata and role
- content_block_delta: Incremental text content
- message_delta: Final metadata and stop reason
- message_stop: End of stream
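To make the event flow concrete, the sketch below models these events as a simplified Rust enum. It is purely illustrative of the protocol shape, not the provider's internal types.
// Illustrative only: a simplified model of the SSE events listed above,
// not the provider's actual internal representation.
enum AnthropicSseEvent {
    // message_start: initial message metadata and role
    MessageStart { role: String },
    // content_block_delta: incremental text content
    ContentBlockDelta { text: String },
    // message_delta: final metadata and stop reason
    MessageDelta { stop_reason: Option<String> },
    // message_stop: end of stream
    MessageStop,
}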
System Prompts
Claude uses a separate system parameter:
let response = client.chat()
.system("You are a helpful assistant who speaks like a pirate.")
.user("Hello, how are you?")
.send()
.await?;
Context Windows
Different models have different context windows:
- Claude 3 Opus/Sonnet: 200K tokens
- Claude 3 Haiku: 200K tokens
- Claude 2.1: 200K tokens
- Claude 2.0: 100K tokens
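When budgeting prompts programmatically, the figures above can be encoded in a small lookup. The helper below is a hypothetical convenience, not part of the provider API:
// Hypothetical helper (not part of the provider API): context window sizes in tokens
// for the model families listed above.
fn context_window_tokens(model: &str) -> u32 {
    if model.starts_with("claude-3") {
        200_000 // Claude 3 and 3.5: Opus, Sonnet, Haiku
    } else if model.starts_with("claude-2.1") {
        200_000 // Claude 2.1 extended context
    } else if model.starts_with("claude-2") {
        100_000 // Claude 2.0
    } else {
        100_000 // conservative fallback for unknown models
    }
}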
Temperature Control
let response = client.chat()
.user("Write a creative poem")
.temperature(0.9) // More creative
.send()
.await?;
let response = client.chat()
.user("What is 2+2?")
.temperature(0.1) // More deterministic
.send()
.await?;
Stop Sequences
let response = client.chat()
.user("Count to 10")
.stop(vec!["5".to_string()])
.send()
.await?;
Error Handling
The provider maps HTTP errors to LLMError:
use mofa_foundation::llm::LLMError;
match client.chat().user("Test").send().await {
Ok(response) => println!("Success: {}", response.content().unwrap()),
Err(LLMError::RateLimited(msg)) => {
println!("Rate limited: {}", msg);
// Implement backoff
}
Err(LLMError::ApiError { code, message }) => {
println!("API error {}: {}", code.unwrap_or_default(), message);
}
Err(LLMError::Timeout(msg)) => {
println!("Timeout: {}", msg);
// Retry
}
Err(e) => println!("Error: {}", e),
}
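For rate limits, a simple exponential backoff loop is usually sufficient. The sketch below reuses the builder calls from the example above; the retry policy (five attempts, doubling the delay) is illustrative, not part of the library:
use mofa_foundation::llm::LLMError;
use std::time::Duration;
// Illustrative backoff policy: retry up to five times, doubling the delay each time.
let mut delay = Duration::from_millis(500);
for attempt in 0..5 {
    match client.chat().user("Test").send().await {
        Ok(response) => {
            println!("Success: {}", response.content().unwrap());
            break;
        }
        Err(LLMError::RateLimited(msg)) => {
            eprintln!("Rate limited ({}), attempt {}: retrying in {:?}", msg, attempt + 1, delay);
            tokio::time::sleep(delay).await;
            delay *= 2;
        }
        Err(e) => {
            eprintln!("Error: {}", e);
            break;
        }
    }
}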
Multi-Modal Support (Coming Soon)
While the provider has message conversion logic for images, audio, and video, these features are not yet fully exposed. Future versions will support:
// Vision (coming soon)
let content = MessageContent::Parts(vec![
ContentPart::Text { text: "What's in this image?".to_string() },
ContentPart::Image {
image_url: ImageUrl {
url: "data:image/jpeg;base64,...".to_string(),
detail: None,
},
},
]);
Complete Example
use mofa_foundation::llm::*;
use std::sync::Arc;
use futures::StreamExt;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create provider from environment
let provider = Arc::new(AnthropicProvider::from_env());
let client = LLMClient::new(provider.clone());
println!("Provider: {}", provider.name());
println!("Default model: {}", provider.default_model());
// Simple question
let answer = client.ask_with_system(
"You are a Rust expert.",
"Explain ownership in one sentence."
).await?;
println!("\nAnswer: {}", answer);
// Streaming response
println!("\nStreaming response:");
let mut stream = client.chat()
.system("You are a creative writer.")
.user("Write a haiku about Rust programming.")
.temperature(0.9)
.max_tokens(200)
.send_stream()
.await?;
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
if let Some(content) = chunk.content() {
print!("{}", content);
}
}
println!("\n");
// Multi-turn conversation
let mut session = ChatSession::new(client)
.with_system("You are a helpful assistant with perfect memory.");
let r1 = session.send("My favorite color is blue.").await?;
println!("Bot: {}", r1);
let r2 = session.send("What's my favorite color?").await?;
println!("Bot: {}", r2);
// Model information
let info = provider.get_model_info("claude-3.5-sonnet-20241022").await?;
println!("\nModel info:");
println!(" Name: {}", info.name);
println!(" Max tokens: {:?}", info.max_output_tokens);
println!(" Streaming: {}", info.capabilities.streaming);
Ok(())
}
Token Usage Tracking
Anthropic returns detailed usage information:
let response = client.chat()
.user("Hello, Claude!")
.send()
.await?;
if let Some(usage) = response.usage {
println!("Prompt tokens: {}", usage.prompt_tokens);
println!("Completion tokens: {}", usage.completion_tokens);
println!("Total tokens: {}", usage.total_tokens);
}
Best Practices
- Max Tokens Required: Always set max_tokens, as the Anthropic API requires it
- System Prompts: Use detailed system prompts for better results
- Temperature: Use 0.0-0.3 for factual, 0.7-1.0 for creative
- Streaming: Use streaming for long responses to improve user experience
- Error Handling: Handle rate limits with exponential backoff
Environment Variables
- ANTHROPIC_API_KEY: API key (required)
- ANTHROPIC_MODEL: Default model name
- ANTHROPIC_BASE_URL: Custom base URL
- ANTHROPIC_VERSION: API version
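For local testing you can also set these variables programmatically before calling from_env(); the std::env::set_var calls below are for illustration only, since in deployment the values normally come from the shell or process environment:
use mofa_foundation::llm::AnthropicProvider;
// Illustration only: in production, export these in the shell instead of hard-coding them.
std::env::set_var("ANTHROPIC_API_KEY", "sk-ant-xxx");
std::env::set_var("ANTHROPIC_MODEL", "claude-3-opus-20240229");
std::env::set_var("ANTHROPIC_BASE_URL", "https://api.anthropic.com");
std::env::set_var("ANTHROPIC_VERSION", "2023-06-01");
let provider = AnthropicProvider::from_env();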