What is Observability?
Observability provides insight into:
- Agent invocations - Track each agent run with traces
- Function calls - Monitor tool execution and performance
- Token usage - Measure costs and quota consumption
- Errors and failures - Debug issues with detailed stack traces
- Performance metrics - Identify bottlenecks and slow operations
- Conversation flows - Visualize multi-turn interactions
Quick Start
Traces
Traces provide a hierarchical view of agent execution.
Automatic Tracing
The framework automatically creates spans for:
- Agent runs - Each agent.run() call
- Chat requests - LLM API calls
- Function invocations - Tool executions
- HTTP requests - External API calls
Custom Spans
Add custom spans for application-specific operations.
Span Attributes
The framework automatically adds rich attributes to each span.
Metrics
Metrics provide quantitative measurements over time.
Built-in Metrics
The framework automatically emits:
| Metric | Type | Description |
|---|---|---|
| agent.invocation.duration | Histogram | Agent run duration (seconds) |
| function.invocation.duration | Histogram | Function execution time (seconds) |
| agent.token.usage | Counter | Token consumption by model |
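The two instrument types in the table aggregate data differently: a histogram sorts each observation into a bucket, while a counter keeps a running sum per attribute value. The sketch below illustrates that difference with the standard library only. The `Histogram` and `Counter` classes, the bucket bounds, and the model names are assumptions made for illustration, not the framework's or any metrics SDK's implementation.

```python
import bisect
from collections import defaultdict


class Histogram:
    """Counts observations into buckets defined by upper bounds (seconds)."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        # One bucket per bound, plus a final overflow bucket.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0.0

    def record(self, value):
        # bisect_left returns the index of the first bound >= value.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value


class Counter:
    """Monotonic sum, kept separately per attribute value (here: model)."""

    def __init__(self):
        self.values = defaultdict(int)

    def add(self, amount, model):
        if amount < 0:
            raise ValueError("counters only increase")
        self.values[model] += amount


# Assumed bucket bounds: <=0.5s, <=1s, <=5s, and everything slower.
run_duration = Histogram([0.5, 1.0, 5.0])
for seconds in (0.2, 0.7, 0.9, 6.0):
    run_duration.record(seconds)

# Hypothetical model names used as the counter attribute.
token_usage = Counter()
token_usage.add(1200, model="model-a")
token_usage.add(300, model="model-a")
token_usage.add(50, model="model-b")

print(run_duration.counts)      # [1, 2, 0, 1]
print(dict(token_usage.values)) # {'model-a': 1500, 'model-b': 50}
```

Bucketed durations are what make percentile and latency-threshold queries possible, whereas per-model token sums map directly onto cost tracking.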
Custom Metrics
Add application-specific metrics.
Logs
Visualization & Analysis
Aspire Dashboard
The .NET Aspire Dashboard provides a local development UI for viewing telemetry.
Azure Monitor (Application Insights)
Jaeger (Distributed Tracing)
Sensitive Data
Complete Example
Best Practices
Observability Tips
- Always Enable in Production: Observability is essential for debugging
- Use Sampling: Sample high-volume traces to reduce costs
- Add Custom Spans: Instrument critical business operations
- Set Alerts: Monitor error rates and latency thresholds
- Correlate Logs: Use trace IDs to correlate logs with traces
- Tag Resources: Add service name and version to all telemetry
- Monitor Costs: Track token usage metrics for cost optimization
Troubleshooting
No Telemetry Appearing
High Overhead
- Reduce sampling rate
- Disable sensitive data logging
- Use batch exporters
- Filter out low-value spans
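To show what reducing the sampling rate means in practice, here is a small sketch of ratio-based sampling keyed on the trace ID. It follows the general idea behind trace-ID ratio samplers (compare part of the ID against a threshold), but the function name `should_sample` and the choice of the low 64 bits are assumptions for illustration, not a specific SDK's implementation.

```python
import random


def should_sample(trace_id: int, ratio: float) -> bool:
    """Keep roughly `ratio` of traces, deciding deterministically per trace ID."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    bound = int(ratio * (1 << 64))
    # Compare the low 64 bits of the ID against the threshold.
    return (trace_id & ((1 << 64) - 1)) < bound


# Simulate 10,000 random 128-bit trace IDs with a fixed seed.
rng = random.Random(42)
trace_ids = [rng.getrandbits(128) for _ in range(10_000)]

kept = sum(should_sample(t, 0.1) for t in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

Because the decision depends only on the trace ID, every service that sees the same trace makes the same keep-or-drop choice, so sampled traces stay complete rather than losing spans at random.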
Missing Attributes
- Enable sensitive data (development only)
- Check span attribute limits in exporter
- Verify OpenTelemetry SDK version
Next Steps
- Agents: Learn about agent telemetry
- Middleware: Add custom telemetry with middleware
- Tools: Monitor tool execution
- Sessions: Track session lifecycle