Overview
Junkie includes built-in observability through Phoenix tracing and structured logging. This lets you monitor agent behavior, debug issues, and optimize performance in production.
Phoenix Tracing
Phoenix (Arize) provides distributed tracing for LLM applications, tracking agent runs, model calls, and tool usage.
Configuration
Phoenix tracing is configured via environment variables:
core/config.py:26-32
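The listing from core/config.py is not reproduced here. As a rough illustration only, the sketch below shows one way such settings could be read from the environment. `TRACING` and `PHOENIX_API_KEY` are named elsewhere on this page; `PHOENIX_PROJECT_NAME`, its default value, and the `TracingConfig` and `load_tracing_config` names are assumptions made for this example and may not match the actual code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class TracingConfig:
    enabled: bool
    api_key: Optional[str]
    project_name: str


def _as_bool(value: Optional[str]) -> bool:
    """Interpret common truthy strings; anything else (or unset) is False."""
    return (value or "").strip().lower() in {"1", "true", "yes", "on"}


def load_tracing_config(env: Mapping[str, str] = os.environ) -> TracingConfig:
    """Build the tracing settings from a mapping of environment variables."""
    return TracingConfig(
        enabled=_as_bool(env.get("TRACING")),
        api_key=env.get("PHOENIX_API_KEY") or None,
        # PHOENIX_PROJECT_NAME and its default are assumptions for illustration.
        project_name=env.get("PHOENIX_PROJECT_NAME", "junkie"),
    )
```

Passing the environment as a parameter keeps the function easy to test without mutating `os.environ`.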
Setup Implementation
The tracing setup is implemented in core/observability.py:
core/observability.py:7-55
Key Features
- Lazy Initialization: Phoenix is only imported when tracing is enabled
- Automatic Instrumentation: auto_instrument=True traces LLM calls automatically
- Batching: batch=True improves performance by batching traces
- Error Handling: Graceful fallback if Phoenix is unavailable
- Singleton Pattern: Tracing is initialized once and reused
Using Phoenix
Enable in Production
Access Phoenix Dashboard
- Go to app.phoenix.arize.com
- Navigate to your project (e.g., “junkie-prod”)
- View traces, spans, and metrics
What Gets Traced
With auto_instrument=True, Phoenix automatically traces:
- LLM Calls: Model requests/responses (OpenAI, Groq, etc.)
- Agent Runs: Full agent execution flows
- Tool Usage: Tool calls and results
- Embeddings: Vector operations (if used)
- Retrieval: RAG queries (if used)
Trace Data
Each trace includes:
- Span ID: Unique identifier
- Parent Span: Hierarchical relationships
- Duration: Execution time
- Attributes: Model name, temperature, tokens, etc.
- Events: Errors, warnings, custom events
- Status: Success/error status
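To make the fields above concrete, here is a simplified stand-in for a span record. It is not Phoenix's or OpenTelemetry's actual data model; the `SpanRecord` class, its field names, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SpanRecord:
    span_id: str
    name: str
    parent_span_id: Optional[str]  # None for the root span
    start_ms: float
    end_ms: float
    attributes: Dict[str, Any] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)
    status: str = "OK"

    @property
    def duration_ms(self) -> float:
        """Execution time derived from the start and end timestamps."""
        return self.end_ms - self.start_ms


def children_of(spans: List[SpanRecord], parent_id: str) -> List[SpanRecord]:
    """Return the direct children of a span, preserving input order."""
    return [s for s in spans if s.parent_span_id == parent_id]


# A made-up trace: one agent run with an LLM call and a failing tool call.
trace = [
    SpanRecord("a1", "agent_run", None, 0.0, 950.0),
    SpanRecord("b2", "llm_call", "a1", 10.0, 700.0,
               {"model": "example-model", "temperature": 0.2}),
    SpanRecord("c3", "tool_call", "a1", 710.0, 900.0,
               events=["timeout warning"], status="ERROR"),
]
```

The parent span ID is what lets a trace viewer rebuild the tree: `children_of(trace, "a1")` yields the LLM call and the tool call nested under the agent run.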
Performance Considerations
Batching
The batch=True setting groups traces before sending:
- Lower network overhead
- Reduced impact on application performance
- Better throughput for high-volume applications
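The idea behind batching can be shown with a toy buffer. This is a conceptual sketch only, not how Phoenix or OpenTelemetry implement their batch exporters (which also flush on a timer and run in the background); `BatchBuffer` and its parameters are hypothetical.

```python
from typing import Callable, List


class BatchBuffer:
    """Collects items and hands them to `send` in groups of `max_size`."""

    def __init__(self, send: Callable[[List[dict]], None], max_size: int = 3):
        self._send = send
        self._max_size = max_size
        self._items: List[dict] = []

    def add(self, item: dict) -> None:
        self._items.append(item)
        if len(self._items) >= self._max_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered and start a fresh batch."""
        if self._items:
            self._send(self._items)
            self._items = []


sent_batches: List[List[dict]] = []
buffer = BatchBuffer(send=sent_batches.append, max_size=3)
for i in range(7):
    buffer.add({"span": i})
buffer.flush()  # send the remainder, as an exporter would on shutdown
```

Seven spans result in three `send` calls instead of seven, which is where the lower network overhead comes from.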
Conditional Loading
Phoenix dependencies are only loaded when tracing is enabled:
Logging Configuration
Debug Mode
Control logging verbosity via environment variables:
core/config.py:21-22
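The referenced lines of core/config.py are not shown on this page. As a hedged sketch, verbosity could be derived from the environment along these lines; the variable names `DEBUG` and `LOG_LEVEL` and both function names are assumptions for illustration, not confirmed settings of the project.

```python
import logging
import os
from typing import Mapping


def resolve_log_level(env: Mapping[str, str] = os.environ) -> int:
    """Pick a logging level from the environment, defaulting to INFO."""
    # DEBUG and LOG_LEVEL are assumed variable names.
    if env.get("DEBUG", "").strip().lower() in {"1", "true", "yes"}:
        return logging.DEBUG
    name = env.get("LOG_LEVEL", "INFO").strip().upper()
    level = logging.getLevelName(name)  # returns an int for known level names
    return level if isinstance(level, int) else logging.INFO


def configure_logging(env: Mapping[str, str] = os.environ) -> None:
    """Configure the root logger once at application startup."""
    logging.basicConfig(
        level=resolve_log_level(env),
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
```

Unknown level names fall back to INFO rather than raising, so a typo in the environment does not prevent the application from starting.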
Production Logging
Recommended settings for production:
Log Levels
Junkie uses Python’s standard logging:
- DEBUG: Detailed information for debugging
- INFO: General informational messages
- WARNING: Warning messages (recoverable issues)
- ERROR: Error messages (with stack traces)
- CRITICAL: Critical failures
Viewing Logs
Railway
- Go to your project
- Click on the service
- Navigate to “Logs” tab
- Filter by level or search
Docker
Local Development
Logs are printed to stdout/stderr. Redirect to a file:
Observability Best Practices
1. Always Enable Tracing in Production
2. Use Structured Logging
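One common way to get structured logs with Python's standard logging is a JSON formatter, sketched below. This is an illustration under assumptions rather than Junkie's own logging setup: the `JsonFormatter` class, the `context` key passed through `extra`, and the logger name are all hypothetical.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # `context` is an assumed key passed via extra={"context": {...}}.
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("junkie.example")
log.addHandler(handler)
log.setLevel(logging.INFO)

log.info("agent run finished",
         extra={"context": {"agent": "demo", "duration_ms": 812}})
```

Keeping variable data in a separate field rather than interpolating it into the message makes logs easier to filter and aggregate.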
3. Monitor Key Metrics
Track:
- Agent success/failure rates
- Average execution time
- Token usage and costs
- API error rates
- Database query performance
4. Set Up Alerts
Configure Phoenix or logging alerts for:
- High error rates
- Slow agent runs (> threshold)
- API quota warnings
- Database connection failures
5. Disable Debug Logs in Production
6. Use Separate Projects per Environment
Troubleshooting
Phoenix Not Tracing
Check whether tracing is enabled. Possible causes:
- TRACING=false or not set
- Missing or invalid PHOENIX_API_KEY
- arize-phoenix package not installed (check requirements.txt)
- Network issues connecting to Phoenix endpoint
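The first three checks can be automated with a small diagnostic, sketched here. It assumes that `TRACING=true` is the enabling value and that the arize-phoenix package is importable as `phoenix`; the `tracing_problems` function is a hypothetical helper, and it does not test key validity or network connectivity.

```python
import importlib.util
import os
from typing import List, Mapping


def tracing_problems(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return human-readable reasons why tracing may be inactive."""
    problems = []
    # Assumption: the literal value "true" is what enables tracing.
    if env.get("TRACING", "").strip().lower() != "true":
        problems.append("TRACING is false or not set")
    if not env.get("PHOENIX_API_KEY"):
        problems.append("PHOENIX_API_KEY is missing")
    # Looks the module up without importing it.
    if importlib.util.find_spec("phoenix") is None:
        problems.append("arize-phoenix package is not installed")
    return problems


if __name__ == "__main__":
    for problem in tracing_problems():
        print(f"- {problem}")
```

An empty list means the basic prerequisites look fine, in which case the endpoint connection is the next thing to examine.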
High Overhead from Tracing
Ensure batching is enabled (see core/observability.py:38). Further options for reducing overhead:
- Sample traces (e.g., 10% of requests)
- Filter out low-value spans
- Adjust batch size/timeout
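Sampling a fixed share of requests can be done deterministically by hashing an identifier, as in the sketch below. This illustrates the general technique, not a Phoenix setting; `should_sample` and the request ID format are hypothetical. OpenTelemetry's SDK ships a ratio-based sampler built on the same idea, which is likely the better fit in a real setup.

```python
import hashlib


def should_sample(trace_id: str, rate: float = 0.10) -> bool:
    """Deterministically keep roughly `rate` of traces, keyed on the trace ID."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


# Count how many of 10,000 made-up request IDs would be traced at 10%.
kept = sum(should_sample(f"request-{i}") for i in range(10_000))
```

Because the decision depends only on the ID, every component that sees the same trace makes the same choice, so sampled traces stay complete.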
Missing Traces
Check initialization: look for the tracing initialization message in the logs. If it is missing, check:
- TRACING environment variable
- Import errors in logs
- Phoenix initialization exceptions
Also verify that auto_instrument=True is set in core/observability.py:37.
Next Steps
- Environment Setup - Configure tracing variables
- Troubleshooting - Debug tracing issues
- Phoenix Documentation - Learn more about Phoenix