Overview
The Anthropic provider (AnthropicLlm) provides access to Claude 3 and Claude 4 models with advanced features including prompt caching, extended context windows (up to 200K tokens), and sophisticated streaming.
Source: packages/adk/src/models/anthropic-llm.ts:34
Supported Models
The Anthropic provider matches these model patterns:
// From anthropic-llm.ts:45-47
static override supportedModels (): string [] {
return [ "claude-3-.*" , "claude-.*-4.*" ];
}
Model Examples
Claude 3.5 Sonnet
Claude 3 Opus
Claude 3 Haiku
import { AgentBuilder } from '@iqai/adk' ;
const agent = AgentBuilder
. withModel ( 'claude-3-5-sonnet-20241022' )
. build ();
Best for: Most intelligent model, best for complex tasks
Context: 200K tokens
Output: 8K tokens
Prompt caching: Yes
const agent = AgentBuilder
. withModel ( 'claude-3-opus-20240229' )
. build ();
Best for: Highest intelligence, complex reasoning
Context: 200K tokens
Output: 4K tokens
const agent = AgentBuilder
. withModel ( 'claude-3-haiku-20240307' )
. build ();
Best for: Fast responses, cost-effective
Context: 200K tokens
Output: 4K tokens
Configuration
API Key Setup
Set your Anthropic API key:
ANTHROPIC_API_KEY = sk-ant-...
The provider automatically reads from environment:
// From anthropic-llm.ts:500-510
private get client (): Anthropic {
if ( ! this . _client ) {
const apiKey = process . env . ANTHROPIC_API_KEY ;
if ( ! apiKey )
throw new Error (
"ANTHROPIC_API_KEY environment variable is required for Anthropic models"
);
this . _client = new Anthropic ({ apiKey });
}
return this . _client ;
}
Basic Usage
import { AgentBuilder } from '@iqai/adk' ;
const agent = AgentBuilder
. withModel ( 'claude-3-5-sonnet-20241022' )
. withInstruction ( 'You are a helpful assistant' )
. build ();
const response = await agent . ask ( 'Explain quantum computing' );
console . log ( response . text );
Configuration Options
model
string
default: "claude-3-5-sonnet-20241022"
The Anthropic model to use
Maximum tokens to generate (required by Anthropic)
Controls randomness (0.0 - 1.0). Lower is more focused
Nucleus sampling parameter (0.0 - 1.0)
const agent = AgentBuilder
. withModel ( 'claude-3-5-sonnet-20241022' )
. withConfig ({
maxOutputTokens: 4096 ,
temperature: 0.7 ,
topP: 0.9
})
. build ();
Prompt Caching
Anthropic’s prompt caching can reduce costs by up to 90% and latency by up to 85% for requests with repeated context.
Enable Caching
import { AgentBuilder } from '@iqai/adk' ;
const agent = AgentBuilder
. withModel ( 'claude-3-5-sonnet-20241022' )
. withInstruction ( 'You are a code review assistant...' )
. withCacheConfig ({
ttlSeconds: 3600 // 1 hour
})
. build ();
Cache TTL Options
Anthropic supports two TTL durations:
5 Minutes (Default)
1 Hour
. withCacheConfig ({ ttlSeconds: 300 })
Use for: Short-lived conversations, testing
Cost: Lower cache creation cost
Best for: Quick interactions
. withCacheConfig ({ ttlSeconds: 3600 })
Use for: Long sessions, repeated queries
Cost: Higher cache creation, bigger savings
Best for: Extended conversations, batch processing
The provider automatically selects the TTL:
// From anthropic-llm.ts:309-315
private getCacheTTL ( llmRequest : LlmRequest ): "5m" | "1h" {
if ( ! llmRequest . cacheConfig ) return "5m" ;
return llmRequest . cacheConfig . ttlSeconds >
ANTHROPIC_CACHE_LONG_TTL_THRESHOLD // 1800 seconds (30 min)
? "1h"
: "5m" ;
}
The provider logs cache hits and creation:
// From anthropic-llm.ts:328-345
private logCachePerformance ( usage : Anthropic . Messages . Usage ): void {
const cacheRead = usage . cache_read_input_tokens || 0 ;
const cacheCreation = usage . cache_creation_input_tokens || 0 ;
if ( cacheRead > 0 ) {
this . logger . info ( `Cache HIT: ${ cacheRead } tokens read from cache` );
}
if ( cacheCreation > 0 ) {
this . logger . info (
`Cache CREATED: ${ cacheCreation } tokens written to cache` ,
);
}
if ( cacheRead === 0 && cacheCreation === 0 ) {
this . logger . debug ( "No cache hits or creation" );
}
}
What Gets Cached?
With cacheConfig enabled, these are cached:
System instructions
Tools (function declarations)
User messages
// From anthropic-llm.ts:79-90
let system : Anthropic . Messages . MessageCreateParams [ "system" ];
if ( systemInstructionText ) {
system = shouldCache
? [
{
type: "text" ,
text: systemInstructionText ,
cache_control: this . createCacheControl ( cacheTTL ),
},
]
: systemInstructionText ;
}
Streaming
Basic Streaming
const agent = AgentBuilder
. withModel ( 'claude-3-5-sonnet-20241022' )
. build ();
for await ( const chunk of agent . run ( 'Write a story' , { stream: true })) {
process . stdout . write ( chunk . text || '' );
}
Streaming Architecture
Anthropic uses event-based streaming (different from OpenAI):
// From anthropic-llm.ts:204-262
for await ( const event of streamResponse ) {
switch ( event . type ) {
case "message_start" :
// Input tokens reported here
inputTokens = event . message . usage . input_tokens ;
break ;
case "content_block_start" :
// New content block (text or tool_use)
contentBlocks . set ( event . index , {
type: event . content_block . type ,
... ( event . content_block . type === "tool_use" && {
id: event . content_block . id ,
name: event . content_block . name ,
inputJson: "" ,
}),
});
break ;
case "content_block_delta" :
// Incremental text or tool input JSON
if ( event . delta . type === "text_delta" ) {
const deltaText = event . delta . text ;
// Handle thought mode and regular text
} else if ( event . delta . type === "input_json_delta" ) {
// Accumulate tool input JSON
}
break ;
case "message_delta" :
// Final usage and stop_reason
outputTokens = event . usage . output_tokens ;
break ;
}
}
Thought Detection
The provider detects and tags thinking content:
// From anthropic-llm.ts:13-24
const THOUGHT_OPEN_TAGS = [
"<thinking>" ,
"[thinking]" ,
"<thought>" ,
"[thought]" ,
];
const THOUGHT_CLOSE_TAGS = [
"</thinking>" ,
"[/thinking]" ,
"</thought>" ,
"[/thought]" ,
];
Function Calling
import { AgentBuilder , BaseTool } from '@iqai/adk' ;
import { z } from 'zod/v4' ;
class DatabaseTool extends BaseTool {
name = 'query_database' ;
description = 'Query the user database' ;
inputSchema = z . object ({
query: z . string (). describe ( 'SQL query to execute' ),
limit: z . number (). default ( 10 )
});
async execute ( input : { query : string ; limit : number }) {
// Execute query
return {
rows: [
{ id: 1 , name: 'Alice' },
{ id: 2 , name: 'Bob' }
],
count: 2
};
}
}
const agent = AgentBuilder
. withModel ( 'claude-3-5-sonnet-20241022' )
. withTools ( new DatabaseTool ())
. withCacheConfig ({ ttlSeconds: 3600 }) // Cache tools!
. build ();
const response = await agent . ask ( 'Find all users named Alice' );
The provider converts ADK tools to Anthropic format:
// From anthropic-llm.ts:437-456
private functionDeclarationToAnthropicTool (
functionDeclaration : any
): Anthropic . Tool {
const properties : Record < string , any > = {};
if ( functionDeclaration . parameters ?. properties ) {
for ( const [ key , value ] of Object . entries (
functionDeclaration . parameters . properties ,
)) {
const valueDict = { ... ( value as any ) };
this . updateTypeString ( valueDict );
properties [ key ] = valueDict ;
}
}
return {
name: functionDeclaration . name ,
description: functionDeclaration . description || "" ,
input_schema: { type: "object" , properties },
};
}
Role Mapping
// From anthropic-llm.ts:461-464
private toAnthropicRole ( role ?: string ): AnthropicRole {
if ( role === "model" || role === "assistant" ) return "assistant" ;
return "user" ;
}
Content Blocks
Anthropic uses structured content blocks:
// From anthropic-llm.ts:401-420
private partToAnthropicBlock (
part : any
): Anthropic . MessageParam [ "content" ][ 0 ] {
if ( part . text ) return { type: "text" , text: part . text };
if ( part . function_call )
return {
type: "tool_use" ,
id: part . function_call . id || "" ,
name: part . function_call . name ,
input: part . function_call . args || {},
};
if ( part . function_response )
return {
type: "tool_result" ,
tool_use_id: part . function_response . id || "" ,
content: String ( part . function_response . response ?. result || "" ),
is_error: false ,
};
throw new Error ( "Unsupported part type for Anthropic conversion" );
}
Error Handling
Rate Limit Errors
import { RateLimitError } from '@iqai/adk' ;
try {
const response = await agent . ask ( 'Hello' );
} catch ( error ) {
if ( error instanceof RateLimitError ) {
console . log ( 'Rate limited!' );
console . log ( 'Provider:' , error . provider ); // 'anthropic'
console . log ( 'Model:' , error . model );
console . log ( 'Retry after:' , error . retryAfter );
}
}
Usage Tracking
Anthropic provides detailed usage metadata including cache metrics:
const response = await agent . ask ( 'Hello' );
if ( response . usageMetadata ) {
console . log ( 'Input tokens:' , response . usageMetadata . promptTokenCount );
console . log ( 'Output tokens:' , response . usageMetadata . candidatesTokenCount );
console . log ( 'Total tokens:' , response . usageMetadata . totalTokenCount );
// Cache metrics (if caching enabled)
if ( response . usageMetadata . cacheReadInputTokens ) {
console . log ( 'Cache read:' , response . usageMetadata . cacheReadInputTokens );
}
if ( response . usageMetadata . cacheCreationInputTokens ) {
console . log ( 'Cache created:' , response . usageMetadata . cacheCreationInputTokens );
}
}
From the source:
// From anthropic-llm.ts:361-374
const usage = message . usage as any ;
if ( usage . cache_read_input_tokens !== undefined ) {
usageMetadata . cacheReadInputTokens = usage . cache_read_input_tokens ;
}
if ( usage . cache_creation_input_tokens !== undefined ) {
usageMetadata . cacheCreationInputTokens =
usage . cache_creation_input_tokens ;
}
Best Practices
Use Claude 3.5 Sonnet for most applications (best balance)
Use Claude 3 Opus for complex reasoning requiring highest intelligence
Use Claude 3 Haiku for fast, cost-effective responses
Enable caching for conversations with >1 repeated query
Use 1-hour TTL for extended sessions
Cache system instructions, tools, and large context
Monitor cacheReadInputTokens to verify cache hits
Claude supports 200K context (much larger than GPT-4)
Use for large document analysis, codebases, conversations
Combine with caching for cost-effective large context
Use streaming for long responses (>500 tokens)
Handle thought/thinking tags for reasoning transparency
Process message_delta event for final usage stats
Advanced Features
System Messages with Caching
const largeSystemPrompt = `
You are an expert code reviewer with knowledge of:
- TypeScript, React, Node.js
- Security best practices
- Performance optimization
- [... large prompt continues ...]
` ;
const agent = AgentBuilder
. withModel ( 'claude-3-5-sonnet-20241022' )
. withInstruction ( largeSystemPrompt )
. withCacheConfig ({ ttlSeconds: 3600 })
. build ();
// First request: pays for prompt, creates cache
const review1 = await agent . ask ( 'Review this TypeScript code: ...' );
// Subsequent requests: cache hit, 90% cheaper!
const review2 = await agent . ask ( 'Review this React component: ...' );
import { AgentBuilder } from '@iqai/adk' ;
const agent = AgentBuilder
. withModel ( 'claude-3-5-sonnet-20241022' )
. withTools ( databaseTool , apiTool )
. withCacheConfig ({ ttlSeconds: 3600 })
. withQuickSession ()
. build ();
// Multi-turn conversation with tool use
const response1 = await agent . ask ( 'Find user Alice' );
const response2 = await agent . ask ( 'Now check her orders' );
const response3 = await agent . ask ( 'What was her most recent order?' );
Limitations
No Live Connections : Anthropic models do not support live/bidirectional connections. The connect() method will throw an error.
Max Output Tokens : Anthropic requires maxOutputTokens to be set. The default is 1024. Increase for longer responses.
Next Steps
OpenAI Provider Compare with GPT models
Google Provider Explore Gemini’s context caching
Tools & Function Calling Build custom tools
Sessions & State Manage conversation state