Overview
Amazon Bedrock provides access to foundation models from leading AI companies including Anthropic, Meta, Mistral, Cohere, and Amazon through a unified API with AWS security, compliance, and infrastructure.
Service: bedrock (control plane, model management) and bedrock-runtime (data plane, inference)
Supported Features
✅ Chat Completions (Converse API)
✅ Streaming
✅ Embeddings
✅ Image Generation (Stable Diffusion, Titan)
✅ Function Calling (via Converse API)
✅ Batch Inference
✅ Model Customization (Fine-tuning)
✅ Guardrails
✅ Multiple Authentication Methods
Quick Start
Basic Configuration
```python
from portkey_ai import Portkey

client = Portkey(
    provider="bedrock",
    aws_access_key_id="AKIA***",
    aws_secret_access_key="***",
    aws_region="us-east-1"
)

response = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {"role": "user", "content": "Explain AWS Bedrock in simple terms"}
    ]
)

print(response.choices[0].message.content)
```
Available Models
Anthropic Claude
| Model ID | Model | Context | Best For |
|---|---|---|---|
| anthropic.claude-3-5-sonnet-20241022-v2:0 | Claude 3.5 Sonnet | 200K | Most capable |
| anthropic.claude-3-5-haiku-20241022-v1:0 | Claude 3.5 Haiku | 200K | Fast, efficient |
| anthropic.claude-3-opus-20240229-v1:0 | Claude 3 Opus | 200K | Complex tasks |
| anthropic.claude-3-sonnet-20240229-v1:0 | Claude 3 Sonnet | 200K | Balanced |
| anthropic.claude-3-haiku-20240307-v1:0 | Claude 3 Haiku | 200K | Speed |
Meta Llama
| Model ID | Context | Description |
|---|---|---|
| meta.llama3-3-70b-instruct-v1:0 | 128K | Latest Llama 3.3 |
| meta.llama3-1-405b-instruct-v1:0 | 128K | Largest Llama 3.1 |
| meta.llama3-1-70b-instruct-v1:0 | 128K | Efficient Llama 3.1 |
| meta.llama3-1-8b-instruct-v1:0 | 128K | Fast, compact |
Mistral AI
| Model ID | Context | Description |
|---|---|---|
| mistral.mistral-large-2407-v1:0 | 128K | Most capable |
| mistral.mistral-large-2402-v1:0 | 32K | Previous generation |
| mistral.mistral-small-2402-v1:0 | 32K | Cost-effective |
Amazon Titan
| Model ID | Type | Description |
|---|---|---|
| amazon.titan-text-premier-v1:0 | Text | Premier text model |
| amazon.titan-text-express-v1 | Text | Fast generation |
| amazon.titan-embed-text-v2:0 | Embeddings | Text embeddings |
| amazon.titan-image-generator-v2:0 | Image | Image generation |
Cohere
| Model ID | Type | Description |
|---|---|---|
| cohere.command-r-plus-v1:0 | Chat | Most capable |
| cohere.command-r-v1:0 | Chat | Balanced |
| cohere.embed-english-v3 | Embeddings | English embeddings |
| cohere.embed-multilingual-v3 | Embeddings | Multilingual |
AI21 Labs
| Model ID | Description |
|---|---|
| ai21.jamba-1-5-large-v1:0 | Latest Jamba |
| ai21.jamba-1-5-mini-v1:0 | Compact Jamba |
Stability AI
| Model ID | Type | Description |
|---|---|---|
| stability.stable-diffusion-xl-v1 | Image | SDXL 1.0 |
| stability.sd3-large-v1:0 | Image | Stable Diffusion 3 |
Authentication Methods
1. Access Keys (Default)
```python
client = Portkey(
    provider="bedrock",
    aws_access_key_id="AKIA***",
    aws_secret_access_key="***",
    aws_session_token="***",  # Optional, for temporary credentials
    aws_region="us-east-1"
)
```
2. Assumed Role
```python
client = Portkey(
    provider="bedrock",
    aws_auth_type="assumedRole",
    aws_role_arn="arn:aws:iam::123456789012:role/BedrockRole",
    aws_external_id="external-id",  # Optional
    aws_region="us-east-1"
)
```
3. IAM Role (EC2, ECS, Lambda)
```python
# Automatically uses the instance/container IAM role
client = Portkey(
    provider="bedrock",
    aws_region="us-east-1"
)
```
4. Environment Variables
```shell
export AWS_ACCESS_KEY_ID="AKIA***"
export AWS_SECRET_ACCESS_KEY="***"
export AWS_REGION="us-east-1"
```

```python
client = Portkey(provider="bedrock")
```
Advanced Features
Streaming
```python
stream = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "Count to 10"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Function Calling (Converse API)
```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools
)
```
Embeddings
```python
response = client.embeddings.create(
    model="amazon.titan-embed-text-v2:0",
    input="AWS Bedrock provides access to foundation models"
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
```
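Embeddings are typically compared with cosine similarity (e.g. for semantic search over Titan embeddings). A dependency-free sketch:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Identical directions score 1.0 and orthogonal vectors score 0.0; for large corpora you would normally use NumPy and pre-normalized vectors instead.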
Image Generation
```python
response = client.images.generate(
    model="stability.sd3-large-v1:0",
    prompt="A serene mountain landscape at sunset",
    size="1024x1024"
)

image_url = response.data[0].url
```
Batch Inference
Create batch jobs for cost-effective inference:
```python
# Create a batch job
response = client.batches.create(
    model="anthropic.claude-3-5-sonnet-20241022-v2:0",
    input_file_id="s3://my-bucket/input.jsonl",
    output_data_config={
        "s3OutputDataConfig": {
            "s3Uri": "s3://my-bucket/output/"
        }
    }
)
batch_id = response.id

# Check status
batch = client.batches.retrieve(batch_id)
print(f"Status: {batch.status}")
```
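Batch jobs are asynchronous, so you usually poll until the job reaches a terminal state. A generic polling helper, sketched with the retrieve call injected as a parameter; the terminal status names are an assumption (OpenAI-style batch statuses), so check them against the statuses your jobs actually report:

```python
import time

def wait_for_batch(retrieve, batch_id, poll_seconds=30, timeout_seconds=3600):
    """Poll a batch job until it reaches a terminal state.

    `retrieve` is any callable returning an object with a `.status`
    attribute (e.g. client.batches.retrieve).
    """
    terminal = {"completed", "failed", "expired", "cancelled"}
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        batch = retrieve(batch_id)
        if batch.status in terminal:
            return batch
        time.sleep(poll_seconds)
    raise TimeoutError(f"Batch {batch_id} did not finish in {timeout_seconds}s")
```

Usage would be `batch = wait_for_batch(client.batches.retrieve, batch_id)`.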
Cross-Region Inference
Use inference profiles for cross-region routing:
```python
response = client.chat.completions.create(
    model="us.anthropic.claude-3-5-sonnet-20241022-v2:0",  # Inference profile
    messages=[{"role": "user", "content": "Hello"}]
)
```
Multi-Region Configuration
Load balance across AWS regions:
```python
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-east-1",
            "weight": 0.5
        },
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-west-2",
            "weight": 0.5
        }
    ]
}

client = Portkey().with_options(config=config)
```
Fallback Configuration
Fallback from Bedrock Claude to Anthropic:
```python
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "bedrock",
            "aws_access_key_id": "AKIA***",
            "aws_secret_access_key": "***",
            "aws_region": "us-east-1",
            "override_params": {"model": "anthropic.claude-3-5-sonnet-20241022-v2:0"}
        },
        {
            "provider": "anthropic",
            "api_key": "sk-ant-***",
            "override_params": {"model": "claude-3-5-sonnet-20241022"}
        }
    ]
}

client = Portkey().with_options(config=config)
```
Error Handling
```python
from portkey_ai.exceptions import (
    APIError,
    AuthenticationError,
    RateLimitError,
)

try:
    response = client.chat.completions.create(
        model="anthropic.claude-3-5-sonnet-20241022-v2:0",
        messages=[{"role": "user", "content": "Hello"}]
    )
except AuthenticationError as e:
    print(f"AWS credentials error: {e}")
except RateLimitError as e:
    print(f"Rate limit or quota exceeded: {e}")
except APIError as e:
    print(f"Bedrock API error: {e}")
```
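Bedrock on-demand quotas throttle bursty traffic, so rate-limit errors are often transient. One way to handle them is a small generic retry wrapper with exponential backoff and jitter; this is a sketch, not part of the Portkey SDK:

```python
import random
import time

def with_retries(fn, retry_on, max_attempts=5, base_delay=1.0):
    """Call fn(), retrying with exponential backoff and jitter
    on the given exception types (e.g. (RateLimitError,))."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # Out of attempts; surface the error
            # Exponential backoff (1s, 2s, 4s, ...) scaled by 0.5-1.0 jitter
            delay = base_delay * (2 ** attempt) * (0.5 + random.random() / 2)
            time.sleep(delay)
```

Usage might look like `with_retries(lambda: client.chat.completions.create(...), (RateLimitError,))`.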
Best Practices
- **Use IAM roles**: more secure than long-lived access keys
- **Enable VPC endpoints**: private connectivity to Bedrock
- **Request model access**: models require explicit access approval
- **Use inference profiles**: better availability and routing
- **Monitor with CloudWatch**: track usage and costs
- **Set up guardrails**: content filtering and safety
- **Use batch inference**: cost-effective for large workloads
- **Implement retry logic**: handle throttling gracefully
Model Access
Before using models, request access in the AWS Console:
1. Go to the AWS Bedrock Console
2. Navigate to **Model access**
3. Click **Manage model access**
4. Select the models you need and request access
5. Wait for approval (usually instant)
Models are region-specific. Request access in each region you plan to use.
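Which models exist (and are still ACTIVE) in a given region can be checked programmatically with the Bedrock control-plane `ListFoundationModels` API. A sketch assuming boto3 and AWS credentials with permission to call it; note this reflects regional availability, not necessarily your account's access grants:

```python
def active_model_ids(summaries, provider=None):
    """Filter `modelSummaries` entries (from list_foundation_models)
    down to ACTIVE models, optionally restricted to one provider."""
    return [
        m["modelId"]
        for m in summaries
        if m.get("modelLifecycle", {}).get("status") == "ACTIVE"
        and (provider is None or m.get("providerName") == provider)
    ]

def list_active_models(region, provider=None):
    import boto3  # assumed installed, with AWS credentials configured
    client = boto3.client("bedrock", region_name=region)
    summaries = client.list_foundation_models()["modelSummaries"]
    return active_model_ids(summaries, provider)
```

For example, `list_active_models("us-east-1", provider="Anthropic")` would return the active Anthropic model IDs in that region.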
Regional Availability
Bedrock is available in multiple AWS regions:
- **US**: us-east-1, us-west-2
- **Europe**: eu-central-1, eu-west-1, eu-west-3
- **Asia Pacific**: ap-southeast-1, ap-northeast-1, ap-south-1
Model availability varies by region. Check the AWS Bedrock documentation for details.
Pricing
Bedrock pricing includes:
- **On-demand**: pay per request/token
- **Provisioned throughput**: reserved capacity
- **Model customization**: additional costs for fine-tuning
- **AWS Bedrock Pricing**: view detailed Bedrock pricing
- **Anthropic**: direct Anthropic integration
- **Load Balancing**: multi-region load balancing
- **Guardrails**: content filtering
- **Batch Processing**: batch inference guide