
Overview

Amazon Bedrock provides access to foundation models from leading AI companies, including Anthropic, Meta, Mistral, Cohere, and Amazon, through a unified API backed by AWS security, compliance, and infrastructure. Two AWS services are involved: bedrock (the control plane, for model management) and bedrock-runtime (the data plane, for inference).

Supported Features

  • ✅ Chat Completions (Converse API)
  • ✅ Streaming
  • ✅ Embeddings
  • ✅ Image Generation (Stable Diffusion, Titan)
  • ✅ Function Calling (via Converse API)
  • ✅ Batch Inference
  • ✅ Model Customization (Fine-tuning)
  • ✅ Guardrails
  • ✅ Multiple Authentication Methods

Quick Start

Basic Configuration

from portkey_ai import Portkey

client = Portkey(
    provider="bedrock",
    aws_access_key_id="AKIA***",
    aws_secret_access_key="***",
    aws_region="us-east-1"
)

response = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {"role": "user", "content": "Explain AWS Bedrock in simple terms"}
    ]
)

print(response.choices[0].message.content)

Available Models

Anthropic Claude

| Model ID | Model | Context | Best For |
|---|---|---|---|
| anthropic.claude-3-5-sonnet-20241022-v2:0 | Claude 3.5 Sonnet | 200K | Most capable |
| anthropic.claude-3-5-haiku-20241022-v1:0 | Claude 3.5 Haiku | 200K | Fast, efficient |
| anthropic.claude-3-opus-20240229-v1:0 | Claude 3 Opus | 200K | Complex tasks |
| anthropic.claude-3-sonnet-20240229-v1:0 | Claude 3 Sonnet | 200K | Balanced |
| anthropic.claude-3-haiku-20240307-v1:0 | Claude 3 Haiku | 200K | Speed |

Meta Llama

| Model ID | Context | Description |
|---|---|---|
| meta.llama3-3-70b-instruct-v1:0 | 128K | Latest Llama 3.3 |
| meta.llama3-1-405b-instruct-v1:0 | 128K | Largest Llama 3.1 |
| meta.llama3-1-70b-instruct-v1:0 | 128K | Efficient Llama 3.1 |
| meta.llama3-1-8b-instruct-v1:0 | 128K | Fast, compact |

Mistral AI

| Model ID | Context | Description |
|---|---|---|
| mistral.mistral-large-2407-v1:0 | 128K | Most capable |
| mistral.mistral-large-2402-v1:0 | 32K | Previous generation |
| mistral.mistral-small-2402-v1:0 | 32K | Cost-effective |

Amazon Titan

| Model ID | Type | Description |
|---|---|---|
| amazon.titan-text-premier-v1:0 | Text | Premier text model |
| amazon.titan-text-express-v1 | Text | Fast generation |
| amazon.titan-embed-text-v2:0 | Embeddings | Text embeddings |
| amazon.titan-image-generator-v2:0 | Image | Image generation |

Cohere

| Model ID | Type | Description |
|---|---|---|
| cohere.command-r-plus-v1:0 | Chat | Most capable |
| cohere.command-r-v1:0 | Chat | Balanced |
| cohere.embed-english-v3 | Embeddings | English embeddings |
| cohere.embed-multilingual-v3 | Embeddings | Multilingual |

AI21 Labs

| Model ID | Description |
|---|---|
| ai21.jamba-1-5-large-v1:0 | Latest Jamba |
| ai21.jamba-1-5-mini-v1:0 | Compact Jamba |

Stability AI

| Model ID | Type | Description |
|---|---|---|
| stability.stable-diffusion-xl-v1 | Image | SDXL 1.0 |
| stability.sd3-large-v1:0 | Image | Stable Diffusion 3 |

Authentication Methods

1. Access Keys (Default)

client = Portkey(
    provider="bedrock",
    aws_access_key_id="AKIA***",
    aws_secret_access_key="***",
    aws_session_token="***",  # Required only when using temporary credentials
    aws_region="us-east-1"
)

2. Assumed Role

client = Portkey(
    provider="bedrock",
    aws_auth_type="assumedRole",
    aws_role_arn="arn:aws:iam::123456789012:role/BedrockRole",
    aws_external_id="external-id",  # Optional
    aws_region="us-east-1"
)

3. IAM Role (EC2, ECS, Lambda)

# Automatically uses instance/container IAM role
client = Portkey(
    provider="bedrock",
    aws_region="us-east-1"
)

4. Environment Variables

export AWS_ACCESS_KEY_ID="AKIA***"
export AWS_SECRET_ACCESS_KEY="***"
export AWS_REGION="us-east-1"
client = Portkey(provider="bedrock")

Advanced Features

Streaming

stream = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "Count to 10"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Function Calling (Converse API)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)
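When the model decides to call a tool, the response carries tool calls that your code must execute and feed back as a follow-up message. The sketch below shows one way to do that with a mocked, OpenAI-style tool call rather than a live response; handle_tool_calls and the stand-in get_weather are illustrative names, not part of the Portkey SDK.

```python
import json

def get_weather(location: str) -> dict:
    # Stand-in implementation; a real tool would call a weather API.
    return {"location": location, "temp_c": 22, "condition": "clear"}

def handle_tool_calls(tool_calls, messages):
    """Dispatch each tool call and append a 'tool' message with the result."""
    registry = {"get_weather": get_weather}
    for call in tool_calls:
        fn_name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"])
        result = registry[fn_name](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(result),
        })
    return messages

# Mocked tool call, shaped like the tool_calls entries in a chat response:
mock_call = {
    "id": "call_1",
    "function": {"name": "get_weather", "arguments": '{"location": "Tokyo"}'},
}
messages = handle_tool_calls([mock_call], [])
print(messages[0]["content"])
```

The resulting messages list would then be sent back in a second chat.completions.create call so the model can compose its final answer.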

Embeddings

response = client.embeddings.create(
    model="amazon.titan-embed-text-v2:0",
    input="AWS Bedrock provides access to foundation models"
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
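A common next step is comparing embeddings with cosine similarity. A minimal, dependency-free sketch, using toy 3-dimensional vectors in place of real Titan v2 embeddings (which are 1024-dimensional by default):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for response.data[0].embedding values:
v1 = [0.1, 0.3, 0.5]
v2 = [0.2, 0.4, 0.4]
print(round(cosine_similarity(v1, v2), 4))
```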

Image Generation

response = client.images.generate(
    model="stability.sd3-large-v1:0",
    prompt="A serene mountain landscape at sunset",
    size="1024x1024"
)

image_url = response.data[0].url

Batch Inference

Create batch jobs for cost-effective inference:

# Create batch job
response = client.batches.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    input_file_id="s3://my-bucket/input.jsonl",
    output_data_config={
        "s3OutputDataConfig": {
            "s3Uri": "s3://my-bucket/output/"
        }
    }
)

batch_id = response.id

# Check status
batch = client.batches.retrieve(batch_id)
print(f"Status: {batch.status}")

Cross-Region Inference

Use inference profiles for cross-region routing:

response = client.chat.completions.create(
    model="us.anthropic.claude-3-5-sonnet-20241022-v2:0",  # Inference profile
    messages=[{"role": "user", "content": "Hello"}]
)

Multi-Region Configuration

Load balance across AWS regions:

config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-east-1",
            "weight": 0.5
        },
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-west-2",
            "weight": 0.5
        }
    ]
}

client = Portkey().with_options(config=config)

Fallback Configuration

Fallback from Bedrock Claude to Anthropic:

config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-east-1",
            "override_params": {"model": "anthropic.claude-3-5-sonnet-20241022-v2:0"}
        },
        {
            "provider": "anthropic",
            "api_key": "sk-ant-***",
            "override_params": {"model": "claude-3-5-sonnet-20241022"}
        }
    ]
}

client = Portkey().with_options(config=config)

Error Handling

from portkey_ai.exceptions import (
    RateLimitError,
    APIError,
    AuthenticationError
)

try:
    response = client.chat.completions.create(
        model="anthropic.claude-3-5-sonnet-20241022-v2:0",
        messages=[{"role": "user", "content": "Hello"}]
    )
except AuthenticationError as e:
    print(f"AWS credentials error: {e}")
except RateLimitError as e:
    print(f"Rate limit or quota exceeded: {e}")
except APIError as e:
    print(f"Bedrock API error: {e}")

Best Practices

  1. Use IAM roles - More secure than access keys
  2. Enable VPC endpoints - Private connectivity
  3. Request model access - Models require explicit access approval
  4. Use inference profiles - Better availability and routing
  5. Monitor with CloudWatch - Track usage and costs
  6. Set up guardrails - Content filtering and safety
  7. Use batch inference - Cost-effective for large workloads
  8. Implement retry logic - Handle throttling gracefully
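Point 8 above can be sketched as a small backoff helper. This is an illustrative sketch: with_retries and the simulated flaky call are not part of any SDK, and in practice you would retry only on throttling-type exceptions rather than all of them.

```python
import random
import time

def with_retries(fn, max_attempts=5, base_delay=0.5, retryable=(Exception,)):
    """Retry fn with exponential backoff plus jitter; re-raise after max_attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff (0.5s, 1s, 2s, ...) with random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)

# Simulated flaky call that fails twice before succeeding:
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise RuntimeError("ThrottlingException (simulated)")
    return "ok"

result = with_retries(flaky, base_delay=0.01)
print(result)
```

The same wrapper can be placed around client.chat.completions.create calls, narrowed to the RateLimitError shown in the error-handling example.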

Model Access

Before using models, request access in the AWS Console:
  1. Go to AWS Bedrock Console
  2. Navigate to Model access
  3. Click Manage model access
  4. Select models and request access
  5. Wait for approval (usually instant)

Models are region-specific. Request access in each region you plan to use.
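Once access is granted, you can confirm which models are visible to your credentials with the AWS CLI (this assumes the CLI is installed and configured for the target region):

```shell
# List model IDs available in a region
aws bedrock list-foundation-models \
    --region us-east-1 \
    --query "modelSummaries[].modelId" \
    --output table

# Filter to a single provider, e.g. Anthropic
aws bedrock list-foundation-models \
    --region us-east-1 \
    --by-provider anthropic \
    --query "modelSummaries[].modelId"
```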

Regional Availability

Bedrock is available in multiple AWS regions:
  • US: us-east-1, us-west-2
  • Europe: eu-central-1, eu-west-1, eu-west-3
  • Asia Pacific: ap-southeast-1, ap-northeast-1, ap-south-1

Model availability varies by region. Check the AWS Bedrock documentation for details.

Pricing

Bedrock pricing includes:
  • On-demand: Pay per request/token
  • Provisioned throughput: Reserved capacity
  • Model customization: Additional costs for fine-tuning

AWS Bedrock Pricing

View detailed Bedrock pricing

Anthropic

Direct Anthropic integration

Load Balancing

Multi-region load balancing

Guardrails

Content filtering

Batch Processing

Batch inference guide
