Overview

The Anthropic provider integrates Claude models with LlamaIndex.TS, supporting all Claude 3, 3.5, 3.7, and 4 series models.

Installation

npm install @llamaindex/anthropic

Basic Usage

import { Anthropic } from "@llamaindex/anthropic";

const llm = new Anthropic({
  model: "claude-3-5-sonnet-20241022",
  temperature: 0.7
});

const response = await llm.chat({
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum computing" }
  ]
});

console.log(response.message.content);

Constructor Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `model` | `string` | `"claude-3-opus"` | Claude model name |
| `temperature` | `number` | `1` | Sampling temperature (0–1) |
| `topP` | `number` | – | Nucleus sampling parameter |
| `maxTokens` | `number` | – | Maximum tokens in the response |
| `apiKey` | `string` | `ANTHROPIC_API_KEY` env variable | Anthropic API key |
| `maxRetries` | `number` | `10` | Maximum request retries |
| `timeout` | `number` | `60000` | Request timeout in milliseconds |
| `additionalChatOptions` | `AnthropicAdditionalChatOptions` | – | Additional options such as thinking mode |

Supported Models

Claude 4 (Latest)

  • claude-4-5-sonnet / claude-sonnet-4-5-20250929
  • claude-4-1-opus / claude-opus-4-1-20250805
  • claude-4-0-opus / claude-opus-4-20250514
  • claude-4-0-sonnet / claude-sonnet-4-20250514

Claude 3.7

  • claude-3-7-sonnet / claude-3-7-sonnet-20250219

Claude 3.5

  • claude-3-5-sonnet-20241022
  • claude-3-5-sonnet-20240620
  • claude-3-5-haiku-20241022

Claude 3

  • claude-3-opus-20240229
  • claude-3-sonnet-20240229
  • claude-3-haiku-20240307

All models support a 200K context window.
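As a rough pre-flight guard, you can estimate prompt size before sending. The 4-characters-per-token ratio below is a heuristic, not Claude's real tokenizer, and `fitsContextWindow` is an illustrative helper, not part of the provider:

```typescript
// Rough token estimate: ~4 characters per token (heuristic only;
// real tokenization varies by content and language).
const CONTEXT_WINDOW = 200_000;

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Check that the prompt plus the response budget fits in the window.
function fitsContextWindow(messages: ChatMessage[], maxTokens = 4096): boolean {
  const promptTokens = messages.reduce(
    (sum, m) => sum + Math.ceil(m.content.length / 4),
    0
  );
  return promptTokens + maxTokens <= CONTEXT_WINDOW;
}
```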

Streaming

const stream = await llm.chat({
  messages: [{ role: "user", content: "Write a short story" }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.delta);
}
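If you need the full text in addition to incremental output, you can accumulate the deltas as they arrive. `collectStream` and the stand-in generator below are illustrative helpers, but `chunk.delta` matches the shape shown above:

```typescript
// Accumulate streamed deltas into the complete response text.
async function collectStream(
  stream: AsyncIterable<{ delta: string }>
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.delta; // same shape as in the streaming example above
  }
  return text;
}

// Stand-in stream for illustration; a real one comes from
// llm.chat({ ..., stream: true }).
async function* fakeStream() {
  yield { delta: "Once " };
  yield { delta: "upon " };
  yield { delta: "a time." };
}
```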

Function Calling

Claude 3+ models support tool use:
import { tool } from "@llamaindex/core/tools";
import { z } from "zod";

const weatherTool = tool({
  name: "get_weather",
  description: "Get weather for a location",
  parameters: z.object({
    location: z.string(),
    units: z.enum(["celsius", "fahrenheit"]).optional()
  }),
  execute: async ({ location, units = "celsius" }) => {
    return `Weather in ${location}: 22°${units === "celsius" ? "C" : "F"}`;
  }
});

const response = await llm.chat({
  messages: [{ role: "user", content: "What's the weather in London?" }],
  tools: [weatherTool]
});

Extended Thinking

Claude 3.7 and later models support extended thinking blocks:
const llm = new Anthropic({
  model: "claude-4-5-sonnet",
  additionalChatOptions: {
    thinking: { type: "enabled", budget_tokens: 10000 }  // budget_tokens is required when thinking is enabled
  }
});

const response = await llm.chat({
  messages: [{ role: "user", content: "Solve this complex problem..." }]
});

// Access thinking process
if (response.message.options?.thinking) {
  console.log("Thinking:", response.message.options.thinking);
  console.log("Signature:", response.message.options.thinking_signature);
}

Multi-modal Input

Images

const response = await llm.chat({
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this image" },
        {
          type: "image",
          data: imageBase64,
          mimeType: "image/jpeg"  // or png, gif, webp
        }
      ]
    }
  ]
});
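A small helper can build the image content part from raw bytes. `imagePart` is an illustrative function, not part of the provider's API, but the object it returns matches the shape above:

```typescript
// Build an image content part from raw bytes
// (e.g. the result of fs.readFileSync(path)).
function imagePart(
  bytes: Uint8Array,
  mimeType: "image/jpeg" | "image/png" | "image/gif" | "image/webp"
) {
  return {
    type: "image" as const,
    data: Buffer.from(bytes).toString("base64"), // Claude expects base64 data
    mimeType,
  };
}
```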

PDFs

const response = await llm.chat({
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Summarize this PDF" },
        {
          type: "file",
          data: pdfBase64,
          mimeType: "application/pdf"
        }
      ]
    }
  ]
});

Prompt Caching (Beta)

Use cache control to reduce costs for repeated content:
const response = await llm.chat({
  messages: [
    {
      role: "system",
      content: "You are an expert on this large document...",
      options: {
        cache_control: { type: "ephemeral" }
      }
    },
    { role: "user", content: "What does it say about topic X?" }
  ]
});

Configuration

Environment Variables

ANTHROPIC_API_KEY=sk-ant-...
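The lookup order (explicit `apiKey` option first, then the environment variable) can be sketched as below; `resolveApiKey` is an illustrative helper, not an export of the package:

```typescript
// Explicit apiKey option wins; otherwise fall back to ANTHROPIC_API_KEY.
function resolveApiKey(explicit?: string): string {
  const key = explicit ?? process.env.ANTHROPIC_API_KEY;
  if (!key) {
    throw new Error("Set ANTHROPIC_API_KEY or pass apiKey to the constructor");
  }
  return key;
}
```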

Global Settings

import { Settings } from "llamaindex";
import { Anthropic } from "@llamaindex/anthropic";

Settings.llm = new Anthropic({
  model: "claude-3-5-sonnet-20241022"
});

Response Format

Note: Anthropic does not support native structured output. Use tool calling for structured data:
const extractionTool = tool({
  name: "extract_info",
  description: "Extract structured information",
  parameters: z.object({
    name: z.string(),
    age: z.number(),
    email: z.string().email()
  }),
  execute: async (data) => data
});

const response = await llm.chat({
  messages: [{ role: "user", content: "Extract: John is 30, [email protected]" }],
  tools: [extractionTool]
});

Message Handling

Claude has specific requirements:

System Messages

System messages are handled separately:
const response = await llm.chat({
  messages: [
    { role: "system", content: "System prompt" },  // Extracted automatically
    { role: "user", content: "User message" },
    { role: "assistant", content: "Assistant response" },
    { role: "user", content: "Follow-up" }
  ]
});

Message Alternation

Messages must alternate between user and assistant:
// ✅ Correct
[
  { role: "user", content: "Message 1" },
  { role: "assistant", content: "Response 1" },
  { role: "user", content: "Message 2" }
]

// ❌ Incorrect - consecutive user messages
[
  { role: "user", content: "Message 1" },
  { role: "user", content: "Message 2" }  // Will be merged automatically
]
The provider automatically merges consecutive messages.
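The merge the provider performs can be sketched like this (joining contents with a newline is an assumption for illustration; the exact separator is an implementation detail):

```typescript
interface Msg {
  role: "user" | "assistant";
  content: string;
}

// Merge consecutive same-role messages so roles strictly alternate.
function mergeConsecutive(messages: Msg[]): Msg[] {
  const merged: Msg[] = [];
  for (const message of messages) {
    const last = merged[merged.length - 1];
    if (last && last.role === message.role) {
      last.content += "\n" + message.content; // separator is illustrative
    } else {
      merged.push({ ...message });
    }
  }
  return merged;
}
```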

Model Capabilities

| Model      | Tool Use | Vision | PDFs | Thinking |
| ---------- | -------- | ------ | ---- | -------- |
| Claude 4   | ✅       | ✅     | ✅   | ✅       |
| Claude 3.7 | ✅       | ✅     | ✅   | ✅       |
| Claude 3.5 | ✅       | ✅     | ✅   | ❌       |
| Claude 3   | ✅       | ✅     | ❌   | ❌       |
| Claude 2   | ❌       | ❌     | ❌   | ❌       |

Error Handling

try {
  const response = await llm.chat({ messages });
} catch (error) {
  if (error.status === 429) {
    console.error("Rate limit exceeded");
  } else if (error.status === 401) {
    console.error("Invalid API key");
  } else {
    console.error("Error:", error.message);
  }
}
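For rate limits specifically, the maxRetries option already retries inside the provider; if you want your own retry layer on top, an exponential backoff wrapper might look like this (illustrative helper, not provider API):

```typescript
// Retry a request on 429 errors with exponential backoff.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries = 10, // mirrors the provider's maxRetries default
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      const status = (error as { status?: number }).status;
      if (status !== 429 || attempt >= maxRetries) throw error;
      // Wait 500ms, 1s, 2s, ... before the next attempt.
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```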

Best Practices

  1. Use the latest models: Claude 4.5 Sonnet offers the best performance
  2. Enable thinking for complex tasks: Improves reasoning quality
  3. Use cache control: Reduce costs for repeated content
  4. Mind message alternation: The provider merges consecutive same-role messages automatically
  5. Leverage vision: Claude excels at image and PDF analysis
  6. Set appropriate maxTokens: Default is 4096, adjust as needed
