
Overview

The AnthropicProvider implements the LLMProvider trait for Anthropic’s Claude models using the Messages API. It supports the Claude 2, Claude 3, and Claude 3.5 model families with SSE streaming.

Configuration

AnthropicConfig

pub struct AnthropicConfig {
    pub api_key: String,
    pub base_url: String,
    pub version: String,
    pub default_model: String,
    pub default_max_tokens: u32,
    pub default_temperature: f32,
    pub timeout_secs: u64,
}
  • api_key (String, required): Anthropic API key
  • base_url (String, default "https://api.anthropic.com"): API base URL
  • version (String, default "2023-06-01"): API version header value
  • default_model (String, default "claude-3-5-sonnet-20241022"): Default model to use
  • default_max_tokens (u32, default 4096): Default maximum output tokens (the Anthropic API requires max_tokens on every request)
  • default_temperature (f32, default 0.7): Default sampling temperature
  • timeout_secs (u64, default 60): Request timeout in seconds
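
Assuming the struct fields are public as declared above, a config can also be constructed directly. The values below are the documented defaults; the builder methods shown later are usually more convenient.

use mofa_foundation::llm::AnthropicConfig;

// Direct construction using the documented defaults.
let config = AnthropicConfig {
    api_key: "sk-ant-xxx".to_string(),
    base_url: "https://api.anthropic.com".to_string(),
    version: "2023-06-01".to_string(),
    default_model: "claude-3-5-sonnet-20241022".to_string(),
    default_max_tokens: 4096,
    default_temperature: 0.7,
    timeout_secs: 60,
};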

Creating a Provider

Basic Usage

use mofa_foundation::llm::AnthropicProvider;

let provider = AnthropicProvider::new("sk-ant-xxx");

From Environment

// Reads ANTHROPIC_API_KEY, ANTHROPIC_MODEL, ANTHROPIC_BASE_URL, ANTHROPIC_VERSION
let provider = AnthropicProvider::from_env();
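
For reference, a rough sketch of the equivalent manual setup, assuming AnthropicConfig::new and the builder methods behave as shown in the configuration section (the real from_env implementation may differ):

use std::env;
use mofa_foundation::llm::{AnthropicProvider, AnthropicConfig};

// Hypothetical equivalent of from_env(): read the key, then apply
// optional overrides from the environment.
let key = env::var("ANTHROPIC_API_KEY").expect("ANTHROPIC_API_KEY not set");
let mut config = AnthropicConfig::new(&key);
if let Ok(model) = env::var("ANTHROPIC_MODEL") {
    config = config.with_model(&model);
}
let provider = AnthropicProvider::with_config(config);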

With Configuration

use mofa_foundation::llm::{AnthropicProvider, AnthropicConfig};

let config = AnthropicConfig::new("sk-ant-xxx")
    .with_model("claude-3-opus-20240229")
    .with_temperature(0.8)
    .with_max_tokens(8192)
    .with_timeout(120);

let provider = AnthropicProvider::with_config(config);

Supported Models

Claude 3.5 Sonnet

  • claude-3-5-sonnet-20241022: Latest and most capable Claude 3.5 model
  • claude-3-5-sonnet-20240620: Previous version

Claude 3 Series

  • claude-3-opus-20240229: Most capable Claude 3 model
  • claude-3-sonnet-20240229: Balanced performance and speed
  • claude-3-haiku-20240307: Fastest and most compact

Claude 2 Series (Legacy)

  • claude-2.1: Extended context window
  • claude-2.0: Original Claude 2

Model Capabilities

The current implementation supports streaming but does not yet expose tool calling or vision features. These capabilities are planned for future releases.
provider.supports_streaming(); // true
provider.supports_tools();     // false (coming soon)
provider.supports_vision();    // false (coming soon)
provider.supports_embedding(); // false

Features

Streaming Support

The provider implements full SSE (Server-Sent Events) streaming:
use futures::StreamExt;
use mofa_foundation::llm::{LLMClient, AnthropicProvider};
use std::sync::Arc;

let provider = Arc::new(AnthropicProvider::new("sk-ant-xxx"));
let client = LLMClient::new(provider);

let mut stream = client.chat()
    .system("You are a creative storyteller.")
    .user("Tell me a short story about Rust")
    .send_stream()
    .await?;

while let Some(chunk) = stream.next().await {
    match chunk {
        Ok(chunk) => {
            if let Some(content) = chunk.content() {
                print!("{}", content);
            }
            if let Some(usage) = chunk.usage {
                println!("\nTokens: {} prompt + {} completion", 
                    usage.prompt_tokens, usage.completion_tokens);
            }
        }
        Err(e) => eprintln!("Stream error: {}", e),
    }
}

SSE Event Types

The provider parses the standard Anthropic SSE event types (a simplified dispatch sketch follows the list):
  • message_start: Initial message metadata and role
  • content_block_delta: Incremental text content
  • message_delta: Final metadata and stop reason
  • message_stop: End of stream
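
A simplified sketch of how these events map to output, assuming serde_json for parsing; the provider's actual parser also tracks content block indices and usage:

use serde_json::Value;

// Dispatch on the SSE event name. Only content_block_delta carries
// incremental text (as delta.text for text deltas); the other events
// carry metadata such as the role, stop reason, and usage counts.
fn handle_event(event: &str, data: &str) -> Option<String> {
    let json: Value = serde_json::from_str(data).ok()?;
    match event {
        "content_block_delta" => json["delta"]["text"].as_str().map(String::from),
        // message_start, message_delta, message_stop: metadata only
        _ => None,
    }
}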

System Prompts

Unlike OpenAI-style chat APIs, Claude accepts the system prompt as a separate top-level system parameter rather than as a message in the conversation:
let response = client.chat()
    .system("You are a helpful assistant who speaks like a pirate.")
    .user("Hello, how are you?")
    .send()
    .await?;
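
On the wire, this maps to a top-level system field in the Messages API request body (shown with serde_json purely for illustration):

use serde_json::json;

// Illustrative request body: the system prompt is a top-level field,
// not a message with role "system" in the messages array.
let body = json!({
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 4096,
    "system": "You are a helpful assistant who speaks like a pirate.",
    "messages": [
        { "role": "user", "content": "Hello, how are you?" }
    ]
});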

Context Windows

Different models have different context windows:
  • Claude 3 Opus/Sonnet: 200K tokens
  • Claude 3 Haiku: 200K tokens
  • Claude 2.1: 200K tokens
  • Claude 2.0: 100K tokens
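
If you need these limits at runtime, a hypothetical helper (not part of the crate) could encode the list above:

// Hypothetical lookup matching the list above; not provided by the crate.
fn context_window(model: &str) -> Option<u32> {
    match model {
        "claude-2.0" => Some(100_000),
        // Claude 2.1 and all Claude 3 / 3.5 models share a 200K window
        m if m.starts_with("claude-2.1") || m.starts_with("claude-3") => Some(200_000),
        _ => None,
    }
}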

Temperature Control

let response = client.chat()
    .user("Write a creative poem")
    .temperature(0.9)  // More creative
    .send()
    .await?;

let response = client.chat()
    .user("What is 2+2?")
    .temperature(0.1)  // More deterministic
    .send()
    .await?;

Stop Sequences

let response = client.chat()
    .user("Count to 10")
    .stop(vec!["5".to_string()])
    .send()
    .await?;

Error Handling

The provider maps HTTP errors to LLMError:
use mofa_foundation::llm::LLMError;

match client.chat().user("Test").send().await {
    Ok(response) => println!("Success: {}", response.content().unwrap()),
    Err(LLMError::RateLimited(msg)) => {
        println!("Rate limited: {}", msg);
        // Implement backoff
    }
    Err(LLMError::ApiError { code, message }) => {
        println!("API error {}: {}", code.unwrap_or_default(), message);
    }
    Err(LLMError::Timeout(msg)) => {
        println!("Timeout: {}", msg);
        // Retry
    }
    Err(e) => println!("Error: {}", e),
}
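
For rate limits, a minimal retry loop with exponential backoff might look like this (illustrative; tune the attempt cap and delays for your workload):

use std::time::Duration;

let mut delay = Duration::from_secs(1);
for attempt in 0..5 {
    match client.chat().user("Test").send().await {
        Ok(response) => {
            println!("Success: {}", response.content().unwrap_or_default());
            break;
        }
        // Back off and retry only on rate limits, up to the attempt cap
        Err(LLMError::RateLimited(_)) if attempt < 4 => {
            tokio::time::sleep(delay).await;
            delay *= 2;
        }
        Err(e) => {
            eprintln!("Giving up: {}", e);
            break;
        }
    }
}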

Multi-Modal Support (Coming Soon)

While the provider has message conversion logic for images, audio, and video, these features are not yet fully exposed. Future versions will support:
// Vision (coming soon)
let content = MessageContent::Parts(vec![
    ContentPart::Text { text: "What's in this image?".to_string() },
    ContentPart::Image {
        image_url: ImageUrl {
            url: "data:image/jpeg;base64,...".to_string(),
            detail: None,
        },
    },
]);

Complete Example

use mofa_foundation::llm::*;
use std::sync::Arc;
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create provider from environment
    let provider = Arc::new(AnthropicProvider::from_env());
    let client = LLMClient::new(provider.clone());

    println!("Provider: {}", provider.name());
    println!("Default model: {}", provider.default_model());

    // Simple question
    let answer = client.ask_with_system(
        "You are a Rust expert.",
        "Explain ownership in one sentence."
    ).await?;
    println!("\nAnswer: {}", answer);

    // Streaming response
    println!("\nStreaming response:");
    let mut stream = client.chat()
        .system("You are a creative writer.")
        .user("Write a haiku about Rust programming.")
        .temperature(0.9)
        .max_tokens(200)
        .send_stream()
        .await?;

    while let Some(chunk) = stream.next().await {
        let chunk = chunk?;
        if let Some(content) = chunk.content() {
            print!("{}", content);
        }
    }
    println!("\n");

    // Multi-turn conversation
    let mut session = ChatSession::new(client)
        .with_system("You are a helpful assistant with perfect memory.");

    let r1 = session.send("My favorite color is blue.").await?;
    println!("Bot: {}", r1);

    let r2 = session.send("What's my favorite color?").await?;
    println!("Bot: {}", r2);

    // Model information
    let info = provider.get_model_info("claude-3-5-sonnet-20241022").await?;
    println!("\nModel info:");
    println!("  Name: {}", info.name);
    println!("  Max tokens: {:?}", info.max_output_tokens);
    println!("  Streaming: {}", info.capabilities.streaming);

    Ok(())
}

Token Usage Tracking

Anthropic returns detailed usage information:
let response = client.chat()
    .user("Hello, Claude!")
    .send()
    .await?;

if let Some(usage) = response.usage {
    println!("Prompt tokens: {}", usage.prompt_tokens);
    println!("Completion tokens: {}", usage.completion_tokens);
    println!("Total tokens: {}", usage.total_tokens);
}

Best Practices

  1. Max Tokens Required: Always set max_tokens as it’s required by the API
  2. System Prompts: Use detailed system prompts for better results
  3. Temperature: Use 0.0-0.3 for factual, 0.7-1.0 for creative
  4. Streaming: Use streaming for long responses to improve user experience
  5. Error Handling: Handle rate limits with exponential backoff

Environment Variables

  • ANTHROPIC_API_KEY: API key (required)
  • ANTHROPIC_MODEL: Default model name
  • ANTHROPIC_BASE_URL: Custom base URL
  • ANTHROPIC_VERSION: API version
