Why orchestrate agents with ZenML?
AI agents introduce unique challenges that traditional MLOps tooling wasn’t designed to handle:

- Reproducibility: Agent behavior can vary dramatically between runs due to LLM non-determinism, tool usage, and dynamic decision-making. ZenML captures the complete execution context, including prompts, tool calls, and responses.
- Observability: Understanding what agents did and why requires tracking more than just inputs and outputs. ZenML logs intermediate steps, decision points, and artifact versions.
- Deployment: Agents need to run both as batch processes (analyzing datasets) and as real-time services (responding to user queries). ZenML supports both patterns with the same pipeline code.
- Evaluation: Comparing agent architectures requires systematic testing across diverse scenarios. ZenML pipelines enable reproducible agent comparisons with versioned datasets and metrics.

Quick start
Here’s a minimal example deploying a LangGraph agent with ZenML:
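As a rough, framework-free sketch of what such a pipeline can look like (the `run_agent` stub stands in for invoking a compiled LangGraph graph; in actual ZenML code each function would carry the `@step` decorator and the entry point the `@pipeline` decorator):

```python
# Sketch of a two-step agent pipeline. In real ZenML code, each function
# below would be decorated with @step and the pipeline function with
# @pipeline; run_agent would invoke a compiled LangGraph graph.

def load_query() -> str:
    # @step: fetch or receive the user query
    return "What is the refund policy?"

def run_agent(query: str) -> str:
    # @step: a real implementation would call graph.invoke({"query": query})
    # on a compiled LangGraph StateGraph; stubbed here for illustration.
    return f"Answer to: {query}"

def agent_pipeline() -> str:
    # @pipeline: ZenML versions the artifacts passed between steps
    query = load_query()
    return run_agent(query)

result = agent_pipeline()
```

Because every step input and output is stored as a versioned artifact, each run of the pipeline is reproducible and comparable with previous runs.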
Agent orchestration patterns

ZenML supports multiple agent orchestration patterns:

Batch processing
Process collections of queries for evaluation, data labeling, or batch inference:
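As an illustration of the pattern (the `answer` stub stands in for a real agent call; the decorator names in the comments refer to ZenML's `@step` and `@pipeline`):

```python
# Batch pattern: one step fans out over a collection of queries, and the
# collected results become a single versioned artifact. The agent is
# stubbed; a ZenML version would decorate these with @step / @pipeline.

def answer(query: str) -> dict:
    # Stand-in for an LLM/agent call; returns one result record per query.
    return {"query": query, "answer": f"echo: {query}", "tokens": len(query)}

def batch_inference(queries: list[str]) -> list[dict]:
    # In ZenML, the returned list would be stored and versioned automatically.
    return [answer(q) for q in queries]

results = batch_inference(["reset password", "cancel order"])
total_tokens = sum(r["tokens"] for r in results)
```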
Real-time serving

Deploy agents as HTTP endpoints for production applications:
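A minimal sketch of the serving shape using only the standard library (the `run_agent` stub and the request format are hypothetical; ZenML's own deployment tooling would expose the pipeline as an endpoint rather than this hand-rolled server):

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_agent(query: str) -> str:
    # Stub; a deployed pipeline would run the real agent steps here.
    return f"Answer to: {query}"

class AgentHandler(BaseHTTPRequestHandler):
    # Accepts POST bodies like {"query": "..."} and returns {"answer": "..."}.
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = json.loads(self.rfile.read(length))
        payload = json.dumps({"answer": run_agent(body["query"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        # Silence per-request logging for this sketch.
        pass

# To serve: HTTPServer(("0.0.0.0", 8000), AgentHandler).serve_forever()
```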
Multi-agent systems

Orchestrate multiple specialized agents working together:
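A sketch of the routing idea with stub specialists (all names are hypothetical; in a ZenML pipeline each specialist invocation could be its own step so that its inputs and outputs are tracked):

```python
# Router + specialists pattern: a dispatcher picks which specialized
# agent handles each query. Both specialists are stubs.

def billing_agent(query: str) -> str:
    return f"[billing] {query}"

def tech_agent(query: str) -> str:
    return f"[tech] {query}"

SPECIALISTS = {"billing": billing_agent, "tech": tech_agent}

def route(query: str) -> str:
    # Naive keyword router; a real system might use an intent classifier
    # (see the hybrid architectures capability below).
    return "billing" if "invoice" in query.lower() else "tech"

def handle(query: str) -> str:
    return SPECIALISTS[route(query)](query)

result = handle("Where is my invoice?")
```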
Agent evaluation

Systematically compare different agent configurations:
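The core loop can be sketched as follows (the stub agents and the toy exact-match accuracy metric are hypothetical; a ZenML evaluation pipeline would additionally version the dataset, the outputs, and the resulting metrics):

```python
# Evaluation pattern: run several agent configurations over the same
# dataset and collect per-configuration metrics.

def agent_a(q: str) -> str:
    return q.upper()

def agent_b(q: str) -> str:
    return q[::-1]

def evaluate(agents: dict, dataset: list[tuple[str, str]]) -> dict:
    # Exact-match accuracy per agent over a shared, fixed dataset.
    report = {}
    for name, agent in agents.items():
        correct = sum(agent(q) == expected for q, expected in dataset)
        report[name] = correct / len(dataset)
    return report

dataset = [("abc", "ABC"), ("hi", "HI")]
scores = evaluate({"upper": agent_a, "reverse": agent_b}, dataset)
```

Because every configuration sees the identical versioned dataset, the resulting scores are directly comparable across runs.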
Key capabilities

Framework agnostic
Works with LangGraph, CrewAI, LangChain, LlamaIndex, PydanticAI, and any Python-based agent framework
Production deployment
Deploy agents as HTTP APIs with a single command. Support for Docker, Kubernetes, and cloud platforms
Artifact management
Version and track all agent inputs, outputs, prompts, and intermediate results with automatic storage
Evaluation pipelines
Build reproducible evaluation workflows to compare agent architectures and configurations
Observability
Track agent executions, tool usage, token consumption, and costs with integration support for Langfuse
Hybrid architectures
Combine LLM agents with traditional ML models for cost-effective, specialized workflows
Framework support
ZenML integrates seamlessly with popular agent frameworks:

- LangGraph: Graph-based agent workflows with state management
- LangChain: Composable chains and ReAct agents
- CrewAI: Multi-agent crews with role-based collaboration
- LlamaIndex: Function-based agents with async support
- PydanticAI: Type-safe agents with structured outputs
- Haystack: RAG pipelines with retrieval components
- OpenAI Agents SDK: Official OpenAI agent implementation
- Semantic Kernel: Microsoft’s plugin-based architecture
- Autogen: Conversational multi-agent systems
- AWS Strands: Simple agent execution on AWS Bedrock
- Qwen-Agent: Function calling with Qwen models
- Google ADK: Gemini-powered agents
Real-world example: Customer support agent
The agent comparison example demonstrates a complete production workflow:

- Load test data: Real customer service queries
- Train intent classifier: Traditional ML model for routing
- Define agent architectures: Single RAG, multi-specialist, and LangGraph
- Run evaluation: Compare all architectures on the same dataset
- Generate report: HTML visualization with metrics and workflow diagrams
The evaluation produces:

- Performance metrics (latency, confidence, accuracy)
- Token usage and cost analysis
- Interactive Mermaid diagrams of each architecture
- Comprehensive HTML comparison report
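The workflow steps above can be sketched as plain functions (all names and return values are hypothetical; the real example implements them as ZenML steps inside a pipeline):

```python
# Skeleton of the comparison workflow: load data, train a routing model,
# evaluate each architecture on the same dataset, and render a report.
# Everything here is a stub for illustration.

def load_test_data() -> list[str]:
    return ["Where is my order?", "I want a refund"]

def train_intent_classifier(queries: list[str]):
    # Stand-in for training a routing model; returns a trivial classifier.
    return lambda q: "refund" if "refund" in q else "order"

def run_evaluation(queries, classifier, architectures) -> dict:
    # A real step would run each agent architecture and record metrics.
    return {name: [classifier(q) for q in queries] for name in architectures}

def generate_report(results: dict) -> str:
    # A real step renders an HTML comparison with metrics and diagrams.
    return f"<html>{len(results)} architectures compared</html>"

queries = load_test_data()
clf = train_intent_classifier(queries)
results = run_evaluation(
    queries, clf, ["single_rag", "multi_specialist", "langgraph"]
)
report = generate_report(results)
```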
Next steps
Orchestrating agents
Learn the patterns and best practices for orchestrating AI agents in production
Agent frameworks
Integration guides for LangGraph, CrewAI, LangChain, and 9 other frameworks
Agent evaluation
Build reproducible evaluation pipelines to compare agent architectures
Examples
Complete working examples with deployment configurations
