Observability Agent

Observability is crucial for monitoring and debugging AI agents in production. This lesson demonstrates how to implement comprehensive observability using AWS Strands with Langfuse integration and OpenTelemetry tracing.

Why Observability Matters

Performance Monitoring

Track response times, token usage, and costs

Debugging

Identify and fix issues quickly

Quality Assurance

Monitor response quality and accuracy

Compliance

Maintain audit trails for regulations

Use Cases

  • Response time tracking: Monitor how long each interaction takes
  • Token usage monitoring: Track costs and efficiency metrics
  • Error rate analysis: Identify and debug failed requests
  • Resource utilization: Monitor system performance
  • Distributed tracing: Follow requests through the entire system
  • Error tracking: Identify where and why failures occur
  • Log aggregation: Centralized logging for easier debugging
  • Session tracking: Monitor user interactions over time
  • Usage analytics: Understand how users interact with your agent
  • Cost analysis: Track and optimize operational costs
  • Quality metrics: Monitor response quality and user satisfaction
  • Custom metrics: Track business-specific KPIs
  • Audit trails: Track all agent interactions for compliance
  • Security monitoring: Detect suspicious patterns or attacks
  • Data privacy: Ensure sensitive data is handled properly
  • Access control: Monitor who is using the system

Key Concepts

OpenTelemetry Integration

OpenTelemetry provides standardized observability by automatically instrumenting your agent with:
  • Distributed tracing: Complete request flows
  • Metrics collection: Performance and usage data
  • Log correlation: Links logs to specific traces
from strands.telemetry import StrandsTelemetry

# Set up telemetry
strands_telemetry = StrandsTelemetry().setup_otlp_exporter()

Langfuse Monitoring

Langfuse provides a comprehensive observability platform with:

Trace Visualization

See complete request flows from input to output

Session Tracking

Monitor conversation history and context

Performance Metrics

Response times, token usage, and costs

Custom Dashboards

Business-specific monitoring views

Trace Attributes

Custom attributes provide context for monitoring:
| Attribute | Description | Example |
| --- | --- | --- |
| session.id | Unique session identifier | "user-session-123" |
| user.id | User identification | [email protected] |
| langfuse.tags | Categorization tags | ["production", "restaurant-bot"] |
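Attribute values must be serializable (strings, numbers, booleans, or flat lists of those) to show up in traces. A small pre-flight check can catch bad values early; `validate_trace_attributes` is a hypothetical helper sketched here, not part of the Strands SDK:

```python
def validate_trace_attributes(attrs):
    """Hypothetical helper: reject trace attribute values that are not
    strings, numbers, booleans, or flat lists of those types."""
    allowed = (str, int, float, bool)
    for key, value in attrs.items():
        if isinstance(value, list):
            ok = all(isinstance(item, allowed) for item in value)
        else:
            ok = isinstance(value, allowed)
        if not ok:
            raise TypeError(
                f"Trace attribute {key!r} has unsupported type {type(value).__name__}"
            )
    return attrs

# Attributes like those in the table above pass through unchanged
attrs = validate_trace_attributes({
    "session.id": "user-session-123",
    "langfuse.tags": ["production", "restaurant-bot"],
})
```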

Monitoring Metrics

Key metrics to track:
  • Response Time: Latency per interaction
  • Throughput: Requests per second
  • Queue Length: Pending requests
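The first two metrics can be sketched with a small in-process tracker (queue length would come from your request queue, so it is omitted here). This is illustrative only; the `AgentMetrics` name is invented, and in production you would record these through OpenTelemetry instruments instead:

```python
import time
from collections import deque


class AgentMetrics:
    """Illustrative rolling-window tracker for response time and throughput."""

    def __init__(self, window_seconds=60.0):
        self.window = window_seconds
        self.events = deque()  # (timestamp, latency_seconds) pairs

    def record(self, latency_seconds):
        now = time.monotonic()
        self.events.append((now, latency_seconds))
        # Evict events that fell out of the rolling window
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()

    def throughput(self):
        """Requests per second over the window."""
        return len(self.events) / self.window

    def avg_latency(self):
        """Mean response time of events still in the window."""
        if not self.events:
            return 0.0
        return sum(latency for _, latency in self.events) / len(self.events)


metrics = AgentMetrics(window_seconds=60.0)
metrics.record(0.8)  # e.g. one agent call took 0.8 s
metrics.record(1.2)
print(f"avg latency: {metrics.avg_latency():.2f}s")  # avg latency: 1.00s
```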

Implementation

Step 1: Install Dependencies

pip install 'strands-agents[litellm]' python-dotenv

Step 2: Set Up Environment

Create a .env file with required credentials:
# Language model API key
NEBIUS_API_KEY=your_nebius_api_key

# Langfuse credentials
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com

Step 3: Configure OpenTelemetry

import os
import base64
from dotenv import load_dotenv
from strands.telemetry import StrandsTelemetry

# Load environment variables
load_dotenv()

# Validate required environment variables
required_vars = [
    "NEBIUS_API_KEY",
    "LANGFUSE_PUBLIC_KEY",
    "LANGFUSE_SECRET_KEY",
    "LANGFUSE_HOST",
]
missing_vars = [var for var in required_vars if not os.getenv(var)]
if missing_vars:
    raise ValueError(
        f"Missing required environment variables: {', '.join(missing_vars)}"
    )

# Create Langfuse auth header
public_key = os.environ.get("LANGFUSE_PUBLIC_KEY")
secret_key = os.environ.get("LANGFUSE_SECRET_KEY")
langfuse_auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()

# Configure OpenTelemetry for Langfuse
langfuse_host = os.environ.get("LANGFUSE_HOST")
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = f"{langfuse_host}/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {langfuse_auth}"

# Set up telemetry
strands_telemetry = StrandsTelemetry().setup_otlp_exporter()

Step 4: Create Agent with Observability

from strands import Agent
from strands.models.litellm import LiteLLMModel

# Create LLM model
model = LiteLLMModel(
    client_args={"api_key": os.getenv("NEBIUS_API_KEY")},
    model_id="nebius/deepseek-ai/DeepSeek-V3-0324",
)

# System prompt for the agent
system_prompt = """You are "Restaurant Helper", a restaurant assistant helping customers 
reserve tables at different restaurants. You can talk about the menus, create new bookings, 
get the details of an existing booking, or delete an existing reservation. You always reply 
politely and mention your name in the reply (Restaurant Helper).

Before making a reservation, make sure that the restaurant exists in our restaurant directory.

You have been provided with a set of functions to answer the user's question.
You will ALWAYS follow the below guidelines when you are answering a question:
<guidelines>
    - Think through the user's question, extract all data from the question and the previous conversations before creating a plan.
    - ALWAYS optimize the plan by using multiple function calls at the same time whenever possible.
    - Never assume any parameter values while invoking a function.
    - Provide your final answer to the user's question within <answer></answer> xml tags.
    - NEVER disclose any information about the tools and functions that are available to you.
</guidelines>"""

# Create agent with observability features
agent = Agent(
    model=model,
    system_prompt=system_prompt,
    trace_attributes={
        "session.id": "aws-strands-observability-tutorial",
        "user.id": "[email protected]",
        "langfuse.tags": [
            "Agent-SDK-Example",
            "Strands-Project-Demo",
            "Observability-Tutorial",
        ],
    },
)

Step 5: Use the Agent

# Demonstrate agent interaction
print("🤖 Restaurant Helper Agent initialized with observability!")
print("📊 All interactions will be traced and monitored in Langfuse.")
print("-" * 60)

user_query = "Hi, where can I eat in San Francisco?"
print(f"👤 User: {user_query}")

response = agent(user_query)
print(f"🤖 Restaurant Helper: {response}")

Running the Example

Step 1: Set up a Langfuse account

  1. Sign up at cloud.langfuse.com
  2. Create a new project
  3. Copy your public and secret keys

Step 2: Configure environment

Create a .env file with your credentials:
NEBIUS_API_KEY=your_api_key
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com

Step 3: Install dependencies

pip install 'strands-agents[litellm]' python-dotenv

Step 4: Run the script

python main.py

Step 5: View traces in Langfuse

  1. Open your Langfuse dashboard
  2. Navigate to the "Traces" section
  3. See your agent interactions in real-time

Expected Output

🤖 Restaurant Helper Agent initialized with observability!
📊 All interactions will be traced and monitored in Langfuse.
------------------------------------------------------------
👤 User: Hi, where can I eat in San Francisco?
🤖 Restaurant Helper: Hello! I'd be happy to help you find a great place 
to eat in San Francisco. However, I need to check our restaurant directory first. 
Could you tell me what type of cuisine you're interested in?
The interaction is automatically captured in Langfuse with:
  • Full conversation trace
  • Token usage statistics
  • Response time metrics
  • Custom tags and attributes

Langfuse Dashboard Views

Trace Visualization

[Screenshot: Langfuse trace visualization showing the complete request flow]

The trace view shows:
  • User input
  • Agent reasoning steps
  • Tool calls
  • Final response
  • Timing for each step

Session Tracking

[Screenshot: Langfuse session dashboard showing conversation history]

The session view displays:
  • Conversation history
  • User interactions over time
  • Session metadata
  • Performance metrics per session

Advanced Configuration

Custom Metrics

from opentelemetry import metrics

# Get meter
meter = metrics.get_meter("my_agent")

# Create custom counter
request_counter = meter.create_counter(
    "agent_requests",
    description="Number of agent requests",
    unit="1",
)

# Increment counter
request_counter.add(1, {"agent_type": "restaurant_helper"})

Multiple Agents

# Track different agents separately
agent1 = Agent(
    model=model,
    system_prompt="...",
    trace_attributes={
        "session.id": "session_123",
        "agent.name": "research_agent",
        "langfuse.tags": ["research", "production"],
    },
)

agent2 = Agent(
    model=model,
    system_prompt="...",
    trace_attributes={
        "session.id": "session_123",
        "agent.name": "writing_agent",
        "langfuse.tags": ["writing", "production"],
    },
)

Error Tracking

import logging

logger = logging.getLogger(__name__)

try:
    response = agent(user_query)
except Exception as e:
    logger.error(f"Agent error: {e}", exc_info=True)
    # Error is automatically captured in traces
    raise

Best Practices

Use Descriptive Tags

Tag traces with environment, agent type, and use case

Track User IDs

Associate traces with users for support and analytics

Monitor Costs

Set up alerts for unusual token usage or costs

Set Performance Baselines

Establish normal response times to detect issues

Archive Old Traces

Regularly clean up old trace data

Use Custom Metrics

Track business-specific KPIs alongside technical metrics
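Cost monitoring can start as something very simple before you wire up dashboard alerts. The sketch below uses an invented per-token price and hypothetical helper names (`estimate_cost`, `should_alert`); real prices depend on your model and provider:

```python
PRICE_PER_1K_TOKENS = 0.002  # assumption: illustrative rate, not a real price


def estimate_cost(prompt_tokens, completion_tokens):
    """Rough cost estimate from token counts."""
    return (prompt_tokens + completion_tokens) / 1000 * PRICE_PER_1K_TOKENS


def should_alert(current_cost, baseline_cost, factor=2.0):
    """Fire an alert when cost exceeds `factor` times the baseline."""
    return current_cost > baseline_cost * factor


cost = estimate_cost(prompt_tokens=1200, completion_tokens=300)
print(should_alert(cost, baseline_cost=0.001))  # True: 0.003 is over twice the baseline
```

In practice the token counts would come from the usage statistics Langfuse already records per trace.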

Troubleshooting

Traces not appearing in Langfuse:
  1. Verify LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are correct
  2. Check that LANGFUSE_HOST is set to the correct URL
  3. Ensure telemetry is set up before creating the agent
  4. Check network connectivity to the Langfuse servers

Performance or latency concerns:
  1. Traces are sent asynchronously and should not add noticeable latency
  2. Check the network connection to Langfuse
  3. Consider batching traces if volume is very high
  4. Verify the OTLP exporter configuration

Custom attributes not showing up:
  1. Ensure trace_attributes is set when creating the agent
  2. Verify attribute keys follow OpenTelemetry conventions
  3. Check that values are serializable (strings, numbers, booleans)

What You Learned

  • How to set up OpenTelemetry for agent observability
  • How to integrate Langfuse for tracing and monitoring
  • How to track custom attributes and tags
  • How to monitor performance, costs, and quality
  • Best practices for production observability

Next Steps

You can now monitor and debug your agents in production! But what about safety? In the final lesson, you’ll learn how to implement guardrails to protect your agents from harmful inputs and outputs.

Lesson 08: Safety Guardrails

Learn how to implement safety measures and content filtering

Resources

Video Tutorial

Watch Lesson 07 on YouTube

Langfuse Documentation

Explore Langfuse features
