From Development to Production
Moving an agent system to production requires careful planning around observability, evaluation, and scalability. The transition from local development to production involves:
- Trace Management: Systematically capturing and uploading agent execution traces
- Online Evaluation: Continuous monitoring of agent performance in production
- Scale Considerations: Handling increased volume and operational requirements
Production Readiness Checklist
Establish Observability
Configure LangSmith or equivalent tracing to capture all agent interactions. Ensure traces include:
- Input/output data
- Execution timing
- Error states
- Custom metadata (user IDs, session IDs, etc.)
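As one way to picture the fields listed above, the sketch below models a single trace as a plain Python dataclass and fills it in around a function call. The `TraceRecord` type and the `record_call` helper are hypothetical names used for illustration only; they are not part of LangSmith or any other tracing SDK.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class TraceRecord:
    """Hypothetical container for the fields a production trace should carry."""
    name: str
    inputs: dict
    outputs: Optional[dict] = None
    error: Optional[str] = None
    start_time: float = 0.0
    end_time: float = 0.0
    metadata: dict = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def record_call(name: str, fn: Callable[..., Any], inputs: dict, metadata: dict) -> TraceRecord:
    """Run fn(**inputs) and capture inputs, outputs, timing, and error state."""
    rec = TraceRecord(name=name, inputs=inputs, metadata=dict(metadata))
    rec.start_time = time.time()
    try:
        rec.outputs = {"result": fn(**inputs)}
    except Exception as exc:
        # Record the failure on the trace instead of losing it.
        rec.error = f"{type(exc).__name__}: {exc}"
    finally:
        rec.end_time = time.time()
    return rec
```

A call such as `record_call("echo", fn, {"text": "hi"}, {"user_id": "u1", "session_id": "s1"})` then yields one record carrying all four kinds of data.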
Set Up Evaluation Pipeline
Deploy online evaluators that run automatically on production traces:
- Response quality metrics
- Latency thresholds
- Error rate monitoring
- Custom business logic validators
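The simpler checks in the list above can be sketched as small functions applied to each production trace. This is an illustration under stated assumptions: a trace is taken to be a dict with `latency_s`, `error`, and `output` keys, the 2-second threshold is arbitrary, and `non_empty_output` stands in for a custom business-logic validator. Response quality scoring usually needs a model-based or human grader and is not shown.

```python
from typing import Callable, Dict, List

Trace = dict  # assumed shape: {"latency_s": float, "error": str or None, "output": str}


def latency_ok(trace: Trace, threshold_s: float = 2.0) -> bool:
    return trace["latency_s"] <= threshold_s


def no_error(trace: Trace) -> bool:
    return trace.get("error") is None


def non_empty_output(trace: Trace) -> bool:
    # Stand-in for a custom business-logic validator.
    return bool((trace.get("output") or "").strip())


EVALUATORS: Dict[str, Callable[[Trace], bool]] = {
    "latency": latency_ok,
    "no_error": no_error,
    "non_empty_output": non_empty_output,
}


def evaluate(trace: Trace) -> Dict[str, bool]:
    """Run every evaluator against one trace."""
    return {name: check(trace) for name, check in EVALUATORS.items()}


def pass_rates(traces: List[Trace]) -> Dict[str, float]:
    """Fraction of traces passing each evaluator; empty input gives an empty dict."""
    if not traces:
        return {}
    results = [evaluate(t) for t in traces]
    return {name: sum(r[name] for r in results) / len(results) for name in EVALUATORS}
```

Tracking `pass_rates` over a sliding window of recent traces gives a simple error-rate and latency monitor that can feed alerts.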
Implement Trace Upload
Build a robust system for uploading traces at scale. See Trace Upload for implementation details.
Key Production Patterns
Synthetic Data for Testing
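A minimal sketch of synthetic trace generation follows. Everything in it is an assumption made for illustration: the dict layout, the `agent_run` name, the 5% default error rate, and the log-normal latency distribution should all be replaced with values that match the traffic you actually expect.

```python
import random
import uuid
from datetime import datetime, timedelta, timezone


def make_synthetic_traces(n: int, error_rate: float = 0.05, seed: int = 0) -> list:
    """Generate n single-run traces with randomized latency and occasional errors."""
    rng = random.Random(seed)  # seeded so the generated data is reproducible
    base = datetime(2024, 1, 1, tzinfo=timezone.utc)
    traces = []
    for i in range(n):
        start = base + timedelta(seconds=rng.uniform(0, 3600))
        latency_s = rng.lognormvariate(0.0, 0.5)  # right-skewed latency
        failed = rng.random() < error_rate
        traces.append({
            "id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
            "parent_id": None,
            "name": "agent_run",
            "inputs": {"question": f"synthetic question {i}"},
            "outputs": None if failed else {"answer": f"synthetic answer {i}"},
            "error": "SyntheticError" if failed else None,
            "start_time": start,
            "end_time": start + timedelta(seconds=latency_s),
        })
    return traces
```

Seeding the generator keeps test runs comparable, since the same seed produces the same set of traces.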
Before deploying to production, generate synthetic traces that mirror expected production patterns.
Time-Shifted Trace Upload
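The timestamp shift this section covers can be sketched as below, assuming each trace is a dict with timezone-aware `start_time` and `end_time` values. The helper name `shift_traces_to_now` is hypothetical. A single common offset is applied so that durations and relative ordering stay intact.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def shift_traces_to_now(traces: list, now: Optional[datetime] = None) -> list:
    """Shift all timestamps by one common offset so the latest run ends at `now`."""
    if not traces:
        return []
    now = now or datetime.now(timezone.utc)
    offset = now - max(t["end_time"] for t in traces)
    # Build new dicts so the original traces are left untouched.
    return [
        {**t, "start_time": t["start_time"] + offset, "end_time": t["end_time"] + offset}
        for t in traces
    ]
```

Computing the offset once per batch, rather than per trace, is what keeps the gaps between runs the same as in the original data.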
When uploading historical or synthetic traces, shift every timestamp by the same offset so that the traces appear recent while their durations and ordering are preserved.
ID Regeneration
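One way to regenerate IDs without breaking the run tree is to build an old-to-new ID map first and then rewrite both the `id` and the `parent_id` of every run through it. The sketch assumes runs are dicts with `id` and `parent_id` keys; `regenerate_ids` is a hypothetical helper name.

```python
import uuid


def regenerate_ids(runs: list) -> list:
    """Return copies of `runs` with fresh IDs and parent links remapped to match."""
    id_map = {run["id"]: str(uuid.uuid4()) for run in runs}
    return [
        {
            **run,
            "id": id_map[run["id"]],
            # A parent outside this batch (or None for a root run) is left unchanged.
            "parent_id": id_map.get(run["parent_id"], run["parent_id"]),
        }
        for run in runs
    ]
```

Building the full map before rewriting any run means the order of runs in the list does not matter: a child can appear before its parent and still be linked correctly.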
When re-uploading traces, generate fresh IDs while preserving parent-child relationships.
Operational Excellence
Batching and Flushing
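A batching uploader with an explicit flush might look like the sketch below. `BatchingUploader` and its `send_batch` callback are hypothetical; in practice `send_batch` would wrap whatever bulk-upload call your tracing backend provides.

```python
from typing import Callable, List


class BatchingUploader:
    """Buffers traces and sends them in fixed-size batches via `send_batch`."""

    def __init__(self, send_batch: Callable[[List[dict]], None], batch_size: int = 100):
        self._send_batch = send_batch
        self._batch_size = batch_size
        self._buffer: List[dict] = []

    def add(self, trace: dict) -> None:
        self._buffer.append(trace)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered, even if the batch is not full."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send_batch(batch)

    def __enter__(self) -> "BatchingUploader":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.flush()  # flush on exit so the final partial batch is not lost
        return False
```

Used as a context manager, the uploader flushes automatically at the end of the block:

```python
sent = []
with BatchingUploader(sent.append, batch_size=3) as uploader:
    for i in range(7):
        uploader.add({"id": i})
# sent now holds batches of 3, 3, and 1 traces
```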
Always batch trace uploads, and explicitly flush once the upload is complete so that buffered traces are not lost.
Error Handling
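One common pattern for graceful failure handling is retry with exponential backoff, followed by a degraded fallback response rather than an unhandled exception. The sketch below is an assumption about what that could look like, not a prescribed approach; `call_with_retries`, its defaults, and the result-dict shape are all hypothetical.

```python
import time
from typing import Any, Callable


def call_with_retries(
    fn: Callable[[], Any],
    retries: int = 3,
    base_delay_s: float = 0.5,
    fallback: Any = None,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fn(); retry on exception with exponential backoff, then fall back."""
    last_error = None
    for attempt in range(retries):
        try:
            return {"ok": True, "value": fn(), "error": None}
        except Exception as exc:
            last_error = exc
            if attempt < retries - 1:
                sleep(base_delay_s * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    # All attempts failed: return a degraded response instead of raising.
    return {"ok": False, "value": fallback, "error": repr(last_error)}
```

The `sleep` parameter exists so tests can inject a no-op and avoid real delays. The `error` field should be attached to the trace so that failed requests remain visible to monitoring.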
Production agents must handle failures gracefully.
Production Deployment Strategies
Blue-Green Deployment
- Deploy new agent version alongside existing version
- Route small percentage of traffic to new version
- Monitor evaluation metrics for regressions
- Gradually increase traffic, or roll back if issues are detected
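The traffic-splitting step above can be sketched with a deterministic, hash-based router. This is one possible approach, not the only one; `choose_version` and the use of a user ID as the routing key are assumptions.

```python
import hashlib


def choose_version(user_id: str, new_version_percent: int) -> str:
    """Route a stable slice of users to the new version based on a hash of their ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # 0..99, stable per user
    return "new" if bucket < new_version_percent else "current"
```

Because each user's bucket is fixed, raising `new_version_percent` only adds users to the new version and never moves anyone back. Setting it to 0 is an immediate rollback.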
Canary Releases
- Deploy to single region or customer segment
- Run online evaluations continuously
- Compare metrics against baseline
- Full rollout only after validation period
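The baseline comparison above could be automated with a check like the following sketch. It assumes every metric is higher-is-better and that a 5% relative drop is the tolerance; both are illustrative choices, and `canary_passes` is a hypothetical helper.

```python
from typing import Dict, Tuple


def canary_passes(
    baseline: Dict[str, float],
    canary: Dict[str, float],
    max_relative_regression: float = 0.05,
) -> Tuple[bool, dict]:
    """Compare higher-is-better metrics; fail if any drops more than the tolerance."""
    regressions = {}
    for name, base_value in baseline.items():
        canary_value = canary.get(name)
        # A metric missing from the canary is treated as a regression.
        if canary_value is None or canary_value < base_value * (1 - max_relative_regression):
            regressions[name] = (base_value, canary_value)
    return (not regressions, regressions)
```

Lower-is-better metrics such as latency would need to be negated or inverted before being passed in.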
Shadow Mode
- Run new agent version in parallel without serving responses
- Compare outputs against production agent
- Evaluate differences and edge cases
- Promote once confidence is established
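A minimal sketch of the shadow-mode flow above: the production agent always serves the response, and the shadow agent's output is only logged for comparison. The names `handle_request` and `diff_log` are hypothetical, agents are assumed to be plain callables, and the exact-equality comparison is a placeholder for whatever similarity check suits your outputs.

```python
from typing import Any, Callable, List


def handle_request(
    request: Any,
    production_agent: Callable[[Any], Any],
    shadow_agent: Callable[[Any], Any],
    diff_log: List[dict],
) -> Any:
    """Serve the production response; run the shadow agent only for comparison."""
    response = production_agent(request)
    try:
        shadow_response = shadow_agent(request)
        if shadow_response != response:
            diff_log.append(
                {"request": request, "production": response, "shadow": shadow_response}
            )
    except Exception as exc:
        # A shadow failure must never affect the user-facing response.
        diff_log.append(
            {"request": request, "production": response, "shadow_error": repr(exc)}
        )
    return response
```

For simplicity the shadow call runs inline here. In a real deployment it would typically run in the background so that it adds no latency to the served response.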
Next Steps
- Trace Upload: Implement scalable trace upload systems
- Online Evaluation: Deploy continuous evaluation pipelines