Overview

Anthropic develops Claude, a family of highly capable AI assistants known for their strong performance, safety features, and long context windows. Portkey provides full support for all Claude models and features.

Base URL: https://api.anthropic.com/v1

Supported Features

  • ✅ Messages API (Chat Completions)
  • ✅ Streaming
  • ✅ Tool Use (Function Calling)
  • ✅ Vision (Image inputs)
  • ✅ System Prompts
  • ✅ Token Counting
  • ✅ Batch API
  • ✅ Prompt Caching
  • ❌ Embeddings (not available)
  • ❌ Fine-tuning (not available)

Quick Start

Chat Completions

from portkey_ai import Portkey

client = Portkey(
    provider="anthropic",
    Authorization="sk-ant-***"  # Your Anthropic API key
)

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
    max_tokens=1024
)

print(response.choices[0].message.content)

Streaming Responses

stream = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Write a haiku about programming"}],
    max_tokens=100,
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
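
When collecting a full reply from a stream, the loop above typically accumulates the deltas into one string. A minimal local sketch with stubbed chunk objects (no API call; the chunk shape mirrors the loop above):

```python
from types import SimpleNamespace

def make_chunk(text):
    # Mimic a streaming chunk's shape: chunk.choices[0].delta.content
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

def collect_stream(stream):
    """Accumulate delta text from streaming chunks into one string."""
    parts = []
    for chunk in stream:
        if chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

# Some chunks (e.g. the final one) may carry no content; those are skipped.
fake_stream = [make_chunk("Code flows "), make_chunk(None), make_chunk("like water")]
print(collect_stream(fake_stream))  # Code flows like water
```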

Available Models

| Model | Context Window | Description | Best For |
|-------|----------------|-------------|----------|
| claude-3-5-sonnet-20241022 | 200K tokens | Latest, most capable model | Complex tasks, coding, analysis |
| claude-3-5-haiku-20241022 | 200K tokens | Fastest Claude 3.5 model | Quick responses, high throughput |
| claude-3-opus-20240229 | 200K tokens | Most powerful Claude 3 | Highly complex tasks |
| claude-3-sonnet-20240229 | 200K tokens | Balanced performance | General purpose |
| claude-3-haiku-20240307 | 200K tokens | Fastest, most compact | Simple tasks, cost-effective |

Claude models excel at:
  • Long document analysis (200K context)
  • Coding and technical tasks
  • Thoughtful, nuanced responses
  • Following complex instructions
  • Refusing unsafe requests

Configuration Options

Headers

client = Portkey(
    provider="anthropic",
    Authorization="sk-ant-***",
    anthropic_version="2023-06-01",           # API version
    anthropic_beta="prompt-caching-2024-07-31" # Beta features
)

| Header | Description | Default | Required |
|--------|-------------|---------|----------|
| Authorization | Anthropic API key | - | Yes |
| anthropic_version | API version | 2023-06-01 | No |
| anthropic_beta | Beta feature flags | messages-2023-12-15 | No |

Body Parameters

You can also pass these in the request body:
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=1024,
    anthropic_version="2023-06-01",  # Can be in body
    anthropic_beta="prompt-caching-2024-07-31"  # Can be in body
)

Advanced Features

System Prompts

Claude supports powerful system prompts:
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful AI assistant specialized in Python programming. Provide clear, concise code examples."
        },
        {
            "role": "user",
            "content": "How do I read a CSV file in Python?"
        }
    ],
    max_tokens=500
)

Tool Use (Function Calling)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    max_tokens=1024
)

if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
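
The arguments arrive as a JSON string, so the usual next step is to parse them and dispatch to your own function. A minimal local sketch with a stubbed get_weather (the stub and registry are illustrative, not part of the Portkey SDK):

```python
import json

def get_weather(location, unit="celsius"):
    # Stub: a real implementation would call a weather service.
    return {"location": location, "temperature": 18, "unit": unit}

def run_tool_call(name, arguments_json):
    """Parse the JSON argument string and dispatch to a local function."""
    args = json.loads(arguments_json)
    registry = {"get_weather": get_weather}
    return registry[name](**args)

result = run_tool_call("get_weather", '{"location": "Paris"}')
print(result)  # {'location': 'Paris', 'temperature': 18, 'unit': 'celsius'}
```

The result is then typically sent back in a follow-up request as a "tool" role message so the model can compose its final answer.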

Vision (Image Analysis)

Claude 3 models support image inputs:
response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's in this image? Describe it in detail."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://example.com/image.jpg"
                }
            }
        ]
    }],
    max_tokens=1024
)
You can also use base64-encoded images:
import base64

with open("image.jpg", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode('utf-8')

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image"},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/jpeg;base64,{image_data}"
                }
            }
        ]
    }],
    max_tokens=1024
)
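
The base64 steps above can be wrapped in a small helper that returns a ready-to-use data URL (the helper name is illustrative):

```python
import base64

def image_to_data_url(path, mime="image/jpeg"):
    """Read an image file and return a base64 data URL for the image_url field."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"
```

Pass the result directly as `{"type": "image_url", "image_url": {"url": image_to_data_url("image.jpg")}}`.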

Prompt Caching

Reduce costs by caching frequently used prompts:
client = Portkey(
    provider="anthropic",
    Authorization="sk-ant-***",
    anthropic_beta="prompt-caching-2024-07-31"
)

# Large system prompt that will be cached
large_context = """[Your large context here - documentation, examples, etc.]"""

response = client.chat.completions.create(
    model="claude-3-5-sonnet-20241022",
    messages=[
        {"role": "system", "content": large_context},
        {"role": "user", "content": "Question about the context"}
    ],
    max_tokens=1024
)

Token Counting

Count tokens before making a request:
# Using the native Anthropic API through Portkey
response = client.messages.count_tokens(
    model="claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello, Claude!"}]
)

print(f"Input tokens: {response.input_tokens}")
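
If you only need a rough pre-flight estimate without a network call, a common rule of thumb is roughly 4 characters per token for English text. This is an approximation only, not Anthropic's tokenizer; the count_tokens endpoint above is authoritative:

```python
def rough_token_estimate(text):
    """Very rough token estimate (~4 chars/token for English); not Anthropic's tokenizer."""
    return max(1, len(text) // 4)

print(rough_token_estimate("Hello, Claude!"))  # 3
```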

Fallback Configuration

Use an OpenAI model (gpt-4o in this example) as a fallback for Claude:
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "anthropic",
            "api_key": "sk-ant-***",
            "override_params": {"model": "claude-3-5-sonnet-20241022"}
        },
        {
            "provider": "openai",
            "api_key": "sk-***",
            "override_params": {"model": "gpt-4o"}
        }
    ]
}

client = Portkey().with_options(config=config)

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=100
)
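
Conceptually, fallback mode tries each target in order and moves on when one fails. A simplified local sketch of that routing (not Portkey's implementation; the stub functions are illustrative):

```python
def call_with_fallback(targets):
    """Try each target in order; return the first successful result."""
    errors = []
    for target in targets:
        try:
            return target()
        except Exception as exc:  # a real router would only catch retryable errors
            errors.append(exc)
    raise RuntimeError(f"All targets failed: {errors}")

def primary():
    raise TimeoutError("anthropic timed out")

def secondary():
    return "response from gpt-4o"

print(call_with_fallback([primary, secondary]))  # response from gpt-4o
```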

Load Balancing

Distribute load across different Claude models:
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "provider": "anthropic",
            "api_key": "sk-ant-***",
            "override_params": {"model": "claude-3-5-sonnet-20241022"},
            "weight": 0.7
        },
        {
            "provider": "anthropic",
            "api_key": "sk-ant-***",
            "override_params": {"model": "claude-3-5-haiku-20241022"},
            "weight": 0.3
        }
    ]
}

client = Portkey().with_options(config=config)
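
The weights behave like weighted random selection: roughly 70% of requests go to Sonnet and 30% to Haiku. A local sketch of that selection (not Portkey's implementation):

```python
import random

def pick_target(targets, rng=random):
    """Choose one target at random, proportional to its weight."""
    weights = [t["weight"] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]

targets = [
    {"model": "claude-3-5-sonnet-20241022", "weight": 0.7},
    {"model": "claude-3-5-haiku-20241022", "weight": 0.3},
]

rng = random.Random(0)  # seeded for reproducibility
counts = {"claude-3-5-sonnet-20241022": 0, "claude-3-5-haiku-20241022": 0}
for _ in range(10_000):
    counts[pick_target(targets, rng)["model"]] += 1
print(counts)  # roughly 7000 / 3000
```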

Error Handling

from portkey_ai.exceptions import (
    RateLimitError,
    APIError,
    AuthenticationError
)

try:
    response = client.chat.completions.create(
        model="claude-3-5-sonnet-20241022",
        messages=[{"role": "user", "content": "Hello"}],
        max_tokens=1024
    )
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except AuthenticationError as e:
    print(f"Invalid API key: {e}")
except APIError as e:
    print(f"API error: {e}")

Request & Response Format

Request

{
  "model": "claude-3-5-sonnet-20241022",
  "messages": [
    {"role": "user", "content": "Hello, Claude!"}
  ],
  "max_tokens": 1024,
  "temperature": 1.0,
  "top_p": 1.0,
  "top_k": 5
}

Response

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{
    "type": "text",
    "text": "Hello! How can I assist you today?"
  }],
  "model": "claude-3-5-sonnet-20241022",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 15
  }
}
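
In the native Anthropic format above, content is a list of blocks rather than a single string. To extract the reply text, join the text blocks:

```python
# The response shape shown above, as a Python dict
response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I assist you today?"}],
    "model": "claude-3-5-sonnet-20241022",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 10, "output_tokens": 15},
}

# Concatenate all text blocks; non-text blocks (e.g. tool use) carry no "text" field
text = "".join(block["text"] for block in response["content"] if block["type"] == "text")
print(text)  # Hello! How can I assist you today?
```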

Best Practices

  1. Always set max_tokens - Required parameter for Claude
  2. Use system prompts - Claude responds well to detailed system instructions
  3. Leverage long context - Claude handles 200K tokens effectively
  4. Enable prompt caching - Save costs on repeated large contexts
  5. Use Haiku for speed - When fast responses matter more than complexity
  6. Implement streaming - For better user experience with long responses
  7. Add retry logic - Handle rate limits gracefully
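
Point 7 can be implemented with exponential backoff. A minimal sketch (the retry count, delays, and flaky stub are illustrative; in real usage you would pass your API call as fn and catch RateLimitError):

```python
import time

def with_retries(fn, max_attempts=3, base_delay=1.0, retry_on=(Exception,)):
    """Call fn, retrying with exponential backoff on the given exception types."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

attempts = {"n": 0}

def flaky():
    # Fails twice, then succeeds — stands in for a rate-limited API call.
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("rate limited")
    return "ok"

print(with_retries(flaky, base_delay=0.01))  # ok
```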

Important Differences from OpenAI

| Feature | OpenAI | Anthropic |
|---------|--------|-----------|
| max_tokens | Optional | Required |
| System messages | In messages array | In messages array |
| Context window | Up to 128K | Up to 200K |
| Embeddings | ✅ Available | ❌ Not available |
| Image generation | ✅ DALL-E | ❌ Not available |
| Audio | ✅ TTS, STT | ❌ Not available |

Pricing

For up-to-date Anthropic pricing:

  • Anthropic Pricing — view detailed pricing for all Claude models

Related Guides

  • AWS Bedrock — use Claude through AWS Bedrock
  • Fallback Routing — set up fallbacks from Anthropic
  • Prompt Caching — reduce costs with caching
  • Tool Use — advanced tool use guide