Skip to main content

Overview

The Anthropic provider (AnthropicLlm) provides access to Claude 3 and Claude 4 models with advanced features including prompt caching, extended context windows (up to 200K tokens), and sophisticated streaming. Source: packages/adk/src/models/anthropic-llm.ts:34

Supported Models

The Anthropic provider matches these model patterns:
// From anthropic-llm.ts:45-47
static override supportedModels(): string[] {
  return ["claude-3-.*", "claude-.*-4.*"];
}

Model Examples

import { AgentBuilder } from '@iqai/adk';

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .build();
Best for: Most intelligent model, best for complex tasks
  • Context: 200K tokens
  • Output: 8K tokens
  • Prompt caching: Yes

Configuration

API Key Setup

Set your Anthropic API key:
.env
ANTHROPIC_API_KEY=sk-ant-...
The provider automatically reads from environment:
// From anthropic-llm.ts:500-510
private get client(): Anthropic {
  if (!this._client) {
    const apiKey = process.env.ANTHROPIC_API_KEY;
    if (!apiKey)
      throw new Error(
        "ANTHROPIC_API_KEY environment variable is required for Anthropic models"
      );
    this._client = new Anthropic({ apiKey });
  }
  return this._client;
}

Basic Usage

import { AgentBuilder } from '@iqai/adk';

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withInstruction('You are a helpful assistant')
  .build();

const response = await agent.ask('Explain quantum computing');
console.log(response.text);

Configuration Options

model
string
default:"claude-3-5-sonnet-20241022"
The Anthropic model to use
maxOutputTokens
number
default:"1024"
Maximum tokens to generate (required by Anthropic)
temperature
number
default:"1.0"
Controls randomness (0.0 - 1.0). Lower is more focused
topP
number
default:"1.0"
Nucleus sampling parameter (0.0 - 1.0)
const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withConfig({
    maxOutputTokens: 4096,
    temperature: 0.7,
    topP: 0.9
  })
  .build();

Prompt Caching

Anthropic’s prompt caching can reduce costs by up to 90% and latency by up to 85% for requests with repeated context.

Enable Caching

import { AgentBuilder } from '@iqai/adk';

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withInstruction('You are a code review assistant...')
  .withCacheConfig({
    ttlSeconds: 3600 // 1 hour
  })
  .build();

Cache TTL Options

Anthropic supports two TTL durations:
.withCacheConfig({ ttlSeconds: 300 })
Use for: Short-lived conversations, testing
  • Cost: Lower cache creation cost
  • Best for: Quick interactions
The provider automatically selects the TTL:
// From anthropic-llm.ts:309-315
private getCacheTTL(llmRequest: LlmRequest): "5m" | "1h" {
  if (!llmRequest.cacheConfig) return "5m";
  return llmRequest.cacheConfig.ttlSeconds >
    ANTHROPIC_CACHE_LONG_TTL_THRESHOLD // 1800 seconds (30 min)
    ? "1h"
    : "5m";
}

Cache Performance Logging

The provider logs cache hits and creation:
// From anthropic-llm.ts:328-345
private logCachePerformance(usage: Anthropic.Messages.Usage): void {
  const cacheRead = usage.cache_read_input_tokens || 0;
  const cacheCreation = usage.cache_creation_input_tokens || 0;

  if (cacheRead > 0) {
    this.logger.info(`Cache HIT: ${cacheRead} tokens read from cache`);
  }

  if (cacheCreation > 0) {
    this.logger.info(
      `Cache CREATED: ${cacheCreation} tokens written to cache`,
    );
  }

  if (cacheRead === 0 && cacheCreation === 0) {
    this.logger.debug("No cache hits or creation");
  }
}

What Gets Cached?

With cacheConfig enabled, these are cached:
  1. System instructions
  2. Tools (function declarations)
  3. User messages
// From anthropic-llm.ts:79-90
let system: Anthropic.Messages.MessageCreateParams["system"];
if (systemInstructionText) {
  system = shouldCache
    ? [
        {
          type: "text",
          text: systemInstructionText,
          cache_control: this.createCacheControl(cacheTTL),
        },
      ]
    : systemInstructionText;
}

Streaming

Basic Streaming

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .build();

for await (const chunk of agent.run('Write a story', { stream: true })) {
  process.stdout.write(chunk.text || '');
}

Streaming Architecture

Anthropic uses event-based streaming (different from OpenAI):
// From anthropic-llm.ts:204-262
for await (const event of streamResponse) {
  switch (event.type) {
    case "message_start":
      // Input tokens reported here
      inputTokens = event.message.usage.input_tokens;
      break;

    case "content_block_start":
      // New content block (text or tool_use)
      contentBlocks.set(event.index, {
        type: event.content_block.type,
        ...(event.content_block.type === "tool_use" && {
          id: event.content_block.id,
          name: event.content_block.name,
          inputJson: "",
        }),
      });
      break;

    case "content_block_delta":
      // Incremental text or tool input JSON
      if (event.delta.type === "text_delta") {
        const deltaText = event.delta.text;
        // Handle thought mode and regular text
      } else if (event.delta.type === "input_json_delta") {
        // Accumulate tool input JSON
      }
      break;

    case "message_delta":
      // Final usage and stop_reason
      outputTokens = event.usage.output_tokens;
      break;
  }
}

Thought Detection

The provider detects and tags thinking content:
// From anthropic-llm.ts:13-24
const THOUGHT_OPEN_TAGS = [
  "<thinking>",
  "[thinking]",
  "<thought>",
  "[thought]",
];
const THOUGHT_CLOSE_TAGS = [
  "</thinking>",
  "[/thinking]",
  "</thought>",
  "[/thought]",
];

Function Calling

With ADK Tools

import { AgentBuilder, BaseTool } from '@iqai/adk';
import { z } from 'zod/v4';

class DatabaseTool extends BaseTool {
  name = 'query_database';
  description = 'Query the user database';
  inputSchema = z.object({
    query: z.string().describe('SQL query to execute'),
    limit: z.number().default(10)
  });

  async execute(input: { query: string; limit: number }) {
    // Execute query
    return {
      rows: [
        { id: 1, name: 'Alice' },
        { id: 2, name: 'Bob' }
      ],
      count: 2
    };
  }
}

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withTools(new DatabaseTool())
  .withCacheConfig({ ttlSeconds: 3600 }) // Cache tools!
  .build();

const response = await agent.ask('Find all users named Alice');

Tool Conversion

The provider converts ADK tools to Anthropic format:
// From anthropic-llm.ts:437-456
private functionDeclarationToAnthropicTool(
  functionDeclaration: any
): Anthropic.Tool {
  const properties: Record<string, any> = {};
  if (functionDeclaration.parameters?.properties) {
    for (const [key, value] of Object.entries(
      functionDeclaration.parameters.properties,
    )) {
      const valueDict = { ...(value as any) };
      this.updateTypeString(valueDict);
      properties[key] = valueDict;
    }
  }

  return {
    name: functionDeclaration.name,
    description: functionDeclaration.description || "",
    input_schema: { type: "object", properties },
  };
}

Message Format

Role Mapping

// From anthropic-llm.ts:461-464
private toAnthropicRole(role?: string): AnthropicRole {
  if (role === "model" || role === "assistant") return "assistant";
  return "user";
}

Content Blocks

Anthropic uses structured content blocks:
// From anthropic-llm.ts:401-420
private partToAnthropicBlock(
  part: any
): Anthropic.MessageParam["content"][0] {
  if (part.text) return { type: "text", text: part.text };
  if (part.function_call)
    return {
      type: "tool_use",
      id: part.function_call.id || "",
      name: part.function_call.name,
      input: part.function_call.args || {},
    };
  if (part.function_response)
    return {
      type: "tool_result",
      tool_use_id: part.function_response.id || "",
      content: String(part.function_response.response?.result || ""),
      is_error: false,
    };
  throw new Error("Unsupported part type for Anthropic conversion");
}

Error Handling

Rate Limit Errors

import { RateLimitError } from '@iqai/adk';

try {
  const response = await agent.ask('Hello');
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log('Rate limited!');
    console.log('Provider:', error.provider); // 'anthropic'
    console.log('Model:', error.model);
    console.log('Retry after:', error.retryAfter);
  }
}

Usage Tracking

Anthropic provides detailed usage metadata including cache metrics:
const response = await agent.ask('Hello');

if (response.usageMetadata) {
  console.log('Input tokens:', response.usageMetadata.promptTokenCount);
  console.log('Output tokens:', response.usageMetadata.candidatesTokenCount);
  console.log('Total tokens:', response.usageMetadata.totalTokenCount);
  
  // Cache metrics (if caching enabled)
  if (response.usageMetadata.cacheReadInputTokens) {
    console.log('Cache read:', response.usageMetadata.cacheReadInputTokens);
  }
  if (response.usageMetadata.cacheCreationInputTokens) {
    console.log('Cache created:', response.usageMetadata.cacheCreationInputTokens);
  }
}
From the source:
// From anthropic-llm.ts:361-374
const usage = message.usage as any;
if (usage.cache_read_input_tokens !== undefined) {
  usageMetadata.cacheReadInputTokens = usage.cache_read_input_tokens;
}
if (usage.cache_creation_input_tokens !== undefined) {
  usageMetadata.cacheCreationInputTokens =
    usage.cache_creation_input_tokens;
}

Best Practices

  • Use Claude 3.5 Sonnet for most applications (best balance)
  • Use Claude 3 Opus for complex reasoning requiring highest intelligence
  • Use Claude 3 Haiku for fast, cost-effective responses
  • Enable caching for conversations with >1 repeated query
  • Use 1-hour TTL for extended sessions
  • Cache system instructions, tools, and large context
  • Monitor cacheReadInputTokens to verify cache hits
  • Claude supports 200K context (much larger than GPT-4)
  • Use for large document analysis, codebases, conversations
  • Combine with caching for cost-effective large context
  • Use streaming for long responses (>500 tokens)
  • Handle thought/thinking tags for reasoning transparency
  • Process message_delta event for final usage stats

Advanced Features

System Messages with Caching

const largeSystemPrompt = `
You are an expert code reviewer with knowledge of:
- TypeScript, React, Node.js
- Security best practices
- Performance optimization
- [... large prompt continues ...]
`;

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withInstruction(largeSystemPrompt)
  .withCacheConfig({ ttlSeconds: 3600 })
  .build();

// First request: pays for prompt, creates cache
const review1 = await agent.ask('Review this TypeScript code: ...');

// Subsequent requests: cache hit, 90% cheaper!
const review2 = await agent.ask('Review this React component: ...');

Multi-turn with Tools

import { AgentBuilder } from '@iqai/adk';

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withTools(databaseTool, apiTool)
  .withCacheConfig({ ttlSeconds: 3600 })
  .withQuickSession()
  .build();

// Multi-turn conversation with tool use
const response1 = await agent.ask('Find user Alice');
const response2 = await agent.ask('Now check her orders');
const response3 = await agent.ask('What was her most recent order?');

Limitations

No Live Connections: Anthropic models do not support live/bidirectional connections. The connect() method will throw an error.
Max Output Tokens: Anthropic requires maxOutputTokens to be set. The default is 1024. Increase for longer responses.

Next Steps

OpenAI Provider

Compare with GPT models

Google Provider

Explore Gemini’s context caching

Tools & Function Calling

Build custom tools

Sessions & State

Manage conversation state

Build docs developers (and LLMs) love