Anthropic Provider

Overview

The Anthropic provider (AnthropicLlm) provides access to Claude 3 and Claude 4 models with advanced features including prompt caching, extended context windows (up to 200K tokens), and sophisticated streaming. Source: packages/adk/src/models/anthropic-llm.ts:34

Supported Models

The Anthropic provider matches these model patterns:

// From anthropic-llm.ts:45-47
static override supportedModels(): string[] {
  return ["claude-3-.*", "claude-.*-4.*"];
}

Model Examples

Claude 3.5 Sonnet
Claude 3 Opus
Claude 3 Haiku

import { AgentBuilder } from '@iqai/adk';

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .build();

Best for: Most intelligent model, best for complex tasks

Context: 200K tokens
Output: 8K tokens
Prompt caching: Yes

const agent = AgentBuilder
  .withModel('claude-3-opus-20240229')
  .build();

Best for: Highest intelligence, complex reasoning

Context: 200K tokens
Output: 4K tokens

const agent = AgentBuilder
  .withModel('claude-3-haiku-20240307')
  .build();

Best for: Fast responses, cost-effective

Context: 200K tokens
Output: 4K tokens

Configuration

API Key Setup

Set your Anthropic API key:

.env

ANTHROPIC_API_KEY=sk-ant-...

The provider automatically reads from environment:

// From anthropic-llm.ts:500-510
private get client(): Anthropic {
  if (!this._client) {
    const apiKey = process.env.ANTHROPIC_API_KEY;
    if (!apiKey)
      throw new Error(
        "ANTHROPIC_API_KEY environment variable is required for Anthropic models"
      );
    this._client = new Anthropic({ apiKey });
  }
  return this._client;
}

Basic Usage

import { AgentBuilder } from '@iqai/adk';

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withInstruction('You are a helpful assistant')
  .build();

const response = await agent.ask('Explain quantum computing');
console.log(response.text);

Configuration Options

model

string

default:"claude-3-5-sonnet-20241022"

The Anthropic model to use

maxOutputTokens

number

default:"1024"

Maximum tokens to generate (required by Anthropic)

temperature

number

default:"1.0"

Controls randomness (0.0 - 1.0). Lower is more focused

topP

number

default:"1.0"

Nucleus sampling parameter (0.0 - 1.0)

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withConfig({
    maxOutputTokens: 4096,
    temperature: 0.7,
    topP: 0.9
  })
  .build();

Prompt Caching

Anthropic’s prompt caching can reduce costs by up to 90% and latency by up to 85% for requests with repeated context.

Enable Caching

import { AgentBuilder } from '@iqai/adk';

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withInstruction('You are a code review assistant...')
  .withCacheConfig({
    ttlSeconds: 3600 // 1 hour
  })
  .build();

Cache TTL Options

Anthropic supports two TTL durations:

5 Minutes (Default)
1 Hour

.withCacheConfig({ ttlSeconds: 300 })

Use for: Short-lived conversations, testing

Cost: Lower cache creation cost
Best for: Quick interactions

.withCacheConfig({ ttlSeconds: 3600 })

Use for: Long sessions, repeated queries

Cost: Higher cache creation, bigger savings
Best for: Extended conversations, batch processing

The provider automatically selects the TTL:

// From anthropic-llm.ts:309-315
private getCacheTTL(llmRequest: LlmRequest): "5m" | "1h" {
  if (!llmRequest.cacheConfig) return "5m";
  return llmRequest.cacheConfig.ttlSeconds >
    ANTHROPIC_CACHE_LONG_TTL_THRESHOLD // 1800 seconds (30 min)
    ? "1h"
    : "5m";
}

Cache Performance Logging

The provider logs cache hits and creation:

// From anthropic-llm.ts:328-345
private logCachePerformance(usage: Anthropic.Messages.Usage): void {
  const cacheRead = usage.cache_read_input_tokens || 0;
  const cacheCreation = usage.cache_creation_input_tokens || 0;

  if (cacheRead > 0) {
    this.logger.info(`Cache HIT: ${cacheRead} tokens read from cache`);
  }

  if (cacheCreation > 0) {
    this.logger.info(
      `Cache CREATED: ${cacheCreation} tokens written to cache`,
    );
  }

  if (cacheRead === 0 && cacheCreation === 0) {
    this.logger.debug("No cache hits or creation");
  }
}

What Gets Cached?

With cacheConfig enabled, these are cached:

System instructions
Tools (function declarations)
User messages

// From anthropic-llm.ts:79-90
let system: Anthropic.Messages.MessageCreateParams["system"];
if (systemInstructionText) {
  system = shouldCache
    ? [
        {
          type: "text",
          text: systemInstructionText,
          cache_control: this.createCacheControl(cacheTTL),
        },
      ]
    : systemInstructionText;
}

Streaming

Basic Streaming

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .build();

for await (const chunk of agent.run('Write a story', { stream: true })) {
  process.stdout.write(chunk.text || '');
}

Streaming Architecture

Anthropic uses event-based streaming (different from OpenAI):

// From anthropic-llm.ts:204-262
for await (const event of streamResponse) {
  switch (event.type) {
    case "message_start":
      // Input tokens reported here
      inputTokens = event.message.usage.input_tokens;
      break;

    case "content_block_start":
      // New content block (text or tool_use)
      contentBlocks.set(event.index, {
        type: event.content_block.type,
        ...(event.content_block.type === "tool_use" && {
          id: event.content_block.id,
          name: event.content_block.name,
          inputJson: "",
        }),
      });
      break;

    case "content_block_delta":
      // Incremental text or tool input JSON
      if (event.delta.type === "text_delta") {
        const deltaText = event.delta.text;
        // Handle thought mode and regular text
      } else if (event.delta.type === "input_json_delta") {
        // Accumulate tool input JSON
      }
      break;

    case "message_delta":
      // Final usage and stop_reason
      outputTokens = event.usage.output_tokens;
      break;
  }
}

Thought Detection

The provider detects and tags thinking content:

// From anthropic-llm.ts:13-24
const THOUGHT_OPEN_TAGS = [
  "<thinking>",
  "[thinking]",
  "<thought>",
  "[thought]",
];
const THOUGHT_CLOSE_TAGS = [
  "</thinking>",
  "[/thinking]",
  "</thought>",
  "[/thought]",
];

Function Calling

With ADK Tools

import { AgentBuilder, BaseTool } from '@iqai/adk';
import { z } from 'zod/v4';

class DatabaseTool extends BaseTool {
  name = 'query_database';
  description = 'Query the user database';
  inputSchema = z.object({
    query: z.string().describe('SQL query to execute'),
    limit: z.number().default(10)
  });

  async execute(input: { query: string; limit: number }) {
    // Execute query
    return {
      rows: [
        { id: 1, name: 'Alice' },
        { id: 2, name: 'Bob' }
      ],
      count: 2
    };
  }
}

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withTools(new DatabaseTool())
  .withCacheConfig({ ttlSeconds: 3600 }) // Cache tools!
  .build();

const response = await agent.ask('Find all users named Alice');

Tool Conversion

The provider converts ADK tools to Anthropic format:

// From anthropic-llm.ts:437-456
private functionDeclarationToAnthropicTool(
  functionDeclaration: any
): Anthropic.Tool {
  const properties: Record<string, any> = {};
  if (functionDeclaration.parameters?.properties) {
    for (const [key, value] of Object.entries(
      functionDeclaration.parameters.properties,
    )) {
      const valueDict = { ...(value as any) };
      this.updateTypeString(valueDict);
      properties[key] = valueDict;
    }
  }

  return {
    name: functionDeclaration.name,
    description: functionDeclaration.description || "",
    input_schema: { type: "object", properties },
  };
}

Message Format

Role Mapping

// From anthropic-llm.ts:461-464
private toAnthropicRole(role?: string): AnthropicRole {
  if (role === "model" || role === "assistant") return "assistant";
  return "user";
}

Content Blocks

Anthropic uses structured content blocks:

// From anthropic-llm.ts:401-420
private partToAnthropicBlock(
  part: any
): Anthropic.MessageParam["content"][0] {
  if (part.text) return { type: "text", text: part.text };
  if (part.function_call)
    return {
      type: "tool_use",
      id: part.function_call.id || "",
      name: part.function_call.name,
      input: part.function_call.args || {},
    };
  if (part.function_response)
    return {
      type: "tool_result",
      tool_use_id: part.function_response.id || "",
      content: String(part.function_response.response?.result || ""),
      is_error: false,
    };
  throw new Error("Unsupported part type for Anthropic conversion");
}

Error Handling

Rate Limit Errors

import { RateLimitError } from '@iqai/adk';

try {
  const response = await agent.ask('Hello');
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log('Rate limited!');
    console.log('Provider:', error.provider); // 'anthropic'
    console.log('Model:', error.model);
    console.log('Retry after:', error.retryAfter);
  }
}

Usage Tracking

Anthropic provides detailed usage metadata including cache metrics:

const response = await agent.ask('Hello');

if (response.usageMetadata) {
  console.log('Input tokens:', response.usageMetadata.promptTokenCount);
  console.log('Output tokens:', response.usageMetadata.candidatesTokenCount);
  console.log('Total tokens:', response.usageMetadata.totalTokenCount);
  
  // Cache metrics (if caching enabled)
  if (response.usageMetadata.cacheReadInputTokens) {
    console.log('Cache read:', response.usageMetadata.cacheReadInputTokens);
  }
  if (response.usageMetadata.cacheCreationInputTokens) {
    console.log('Cache created:', response.usageMetadata.cacheCreationInputTokens);
  }
}

From the source:

// From anthropic-llm.ts:361-374
const usage = message.usage as any;
if (usage.cache_read_input_tokens !== undefined) {
  usageMetadata.cacheReadInputTokens = usage.cache_read_input_tokens;
}
if (usage.cache_creation_input_tokens !== undefined) {
  usageMetadata.cacheCreationInputTokens =
    usage.cache_creation_input_tokens;
}

Best Practices

Model Selection

Use Claude 3.5 Sonnet for most applications (best balance)
Use Claude 3 Opus for complex reasoning requiring highest intelligence
Use Claude 3 Haiku for fast, cost-effective responses

Prompt Caching

Enable caching for conversations with >1 repeated query
Use 1-hour TTL for extended sessions
Cache system instructions, tools, and large context
Monitor cacheReadInputTokens to verify cache hits

Context Windows

Claude supports 200K context (much larger than GPT-4)
Use for large document analysis, codebases, conversations
Combine with caching for cost-effective large context

Streaming

Use streaming for long responses (>500 tokens)
Handle thought/thinking tags for reasoning transparency
Process message_delta event for final usage stats

Advanced Features

System Messages with Caching

const largeSystemPrompt = `
You are an expert code reviewer with knowledge of:
- TypeScript, React, Node.js
- Security best practices
- Performance optimization
- [... large prompt continues ...]
`;

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withInstruction(largeSystemPrompt)
  .withCacheConfig({ ttlSeconds: 3600 })
  .build();

// First request: pays for prompt, creates cache
const review1 = await agent.ask('Review this TypeScript code: ...');

// Subsequent requests: cache hit, 90% cheaper!
const review2 = await agent.ask('Review this React component: ...');

Multi-turn with Tools

import { AgentBuilder } from '@iqai/adk';

const agent = AgentBuilder
  .withModel('claude-3-5-sonnet-20241022')
  .withTools(databaseTool, apiTool)
  .withCacheConfig({ ttlSeconds: 3600 })
  .withQuickSession()
  .build();

// Multi-turn conversation with tool use
const response1 = await agent.ask('Find user Alice');
const response2 = await agent.ask('Now check her orders');
const response3 = await agent.ask('What was her most recent order?');

Limitations

No Live Connections: Anthropic models do not support live/bidirectional connections. The connect() method will throw an error.

Max Output Tokens: Anthropic requires maxOutputTokens to be set. The default is 1024. Increase for longer responses.

Next Steps

OpenAI Provider

Compare with GPT models

Google Provider

Explore Gemini’s context caching

Tools & Function Calling

Build custom tools

Sessions & State

Manage conversation state

Getting Started

Core Concepts

Agents

Models & Providers

Tools

Memory & State

Advanced Features

CLI Tool

Examples

​Overview

​Supported Models

​Model Examples

​Configuration

​API Key Setup

​Basic Usage

​Configuration Options

​Prompt Caching

​Enable Caching

​Cache TTL Options

​Cache Performance Logging

​What Gets Cached?

​Streaming

​Basic Streaming

​Streaming Architecture

​Thought Detection

​Function Calling

​With ADK Tools

​Tool Conversion

​Message Format

​Role Mapping

​Content Blocks

​Error Handling

​Rate Limit Errors

​Usage Tracking

​Best Practices

​Advanced Features

​System Messages with Caching

​Multi-turn with Tools

​Limitations

​Next Steps

OpenAI Provider

Google Provider

Tools & Function Calling

Sessions & State

Build docs developers (and LLMs) love

Overview

Supported Models

Model Examples

Configuration

API Key Setup

Basic Usage

Configuration Options

Prompt Caching

Enable Caching

Cache TTL Options

Cache Performance Logging

What Gets Cached?

Streaming

Basic Streaming

Streaming Architecture

Thought Detection

Function Calling

With ADK Tools

Tool Conversion

Message Format

Role Mapping

Content Blocks

Error Handling

Rate Limit Errors

Usage Tracking

Best Practices

Advanced Features

System Messages with Caching

Multi-turn with Tools

Limitations

Next Steps