Overview
Middleware in the Microsoft Agent Framework .NET SDK lets you intercept and extend agent behavior. You can add logging, content filtering, function approval, context enrichment, and custom processing at different layers of the agent pipeline.
Middleware Layers
The framework supports middleware at three levels:
- Chat Client Level: Intercepts IChatClient calls before they reach the AI provider
- Agent Run Level: Intercepts agent run requests and responses
- Function Invocation Level: Intercepts individual function/tool calls
Agent Run Middleware
Basic Middleware
Agent run middleware intercepts the entire agent execution.
PII Filtering Middleware
Redact sensitive information from messages.
Content Guardrails Middleware
Enforce content policies.
Chaining Multiple Middleware
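As a rough illustration of chaining, the sketch below composes a redaction step and a guardrail step around a stand-in agent. It uses only the .NET base class library: the `Handler` alias, `Chain`, `redactDigits`, and `blockSecrets` are hypothetical names invented for this example and are not the SDK's middleware API.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Handler = System.Func<string, System.Threading.CancellationToken, System.Threading.Tasks.Task<string>>;

// Hypothetical stand-in for the agent: echoes the input it receives.
Handler agent = (input, ct) => Task.FromResult($"agent({input})");

// A middleware takes the next handler and returns a wrapped handler.
Func<Handler, Handler> redactDigits = next => async (input, ct) =>
{
    var cleaned = new string(input.Select(c => char.IsDigit(c) ? '#' : c).ToArray());
    return await next(cleaned, ct);
};

// Guardrail: short-circuits the pipeline instead of calling next.
Func<Handler, Handler> blockSecrets = next => async (input, ct) =>
    input.Contains("secret", StringComparison.OrdinalIgnoreCase)
        ? "blocked"
        : await next(input, ct);

// Compose so that the first middleware listed becomes the outermost wrapper.
Handler Chain(Handler inner, params Func<Handler, Handler>[] middleware)
{
    var handler = inner;
    for (var i = middleware.Length - 1; i >= 0; i--)
        handler = middleware[i](handler);
    return handler;
}

var pipeline = Chain(agent, redactDigits, blockSecrets);
var ok = await pipeline("call 555", CancellationToken.None);
var blocked = await pipeline("my secret", CancellationToken.None);

Console.WriteLine(ok);      // agent(call ###)
Console.WriteLine(blocked); // blocked
```

Because each middleware only sees the handler it wraps, a guardrail can stop a request without the inner agent ever being called.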
Function Invocation Middleware
Intercept individual function calls.
Function Result Override
Modify or override function results.
Function Approval (Human-in-the-Loop)
Require approval before executing sensitive functions.
Approval via Middleware
Implement an approval workflow using middleware.
Chat Client Middleware
Intercept low-level chat client calls.
Per-Request Middleware
Add middleware for specific requests.
AIContextProvider Middleware
Enrich agent context with additional information.
Structured Output Middleware
Add structured output support to agents that don’t natively support it.
Best Practices
Order Matters
Middleware executes in the order it was added, so consider the logical flow of the pipeline when registering it.
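To make the effect of ordering concrete, here is a minimal sketch in which the same two steps are wrapped in both orders. It assumes the first middleware added is the outermost wrapper; `Handler`, `redact`, and `logInput` are hypothetical names for this example, not SDK types.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Handler = System.Func<string, System.Threading.CancellationToken, System.Threading.Tasks.Task<string>>;

var log = new List<string>();

// Hypothetical stand-in for the agent.
Handler agent = (input, ct) => Task.FromResult("ok");

// Replaces one sample phone number with a placeholder before calling next.
Func<Handler, Handler> redact = next => (input, ct) =>
    next(input.Replace("555-0100", "[phone]"), ct);

// Records whatever input reaches this layer.
Func<Handler, Handler> logInput = next => (input, ct) =>
{
    log.Add(input);
    return next(input, ct);
};

// Redaction outermost: the logger only ever sees redacted text.
var redactThenLog = redact(logInput(agent));
// Logging outermost: the raw text is logged before redaction runs.
var logThenRedact = logInput(redact(agent));

await redactThenLog("call 555-0100", CancellationToken.None);
await logThenRedact("call 555-0100", CancellationToken.None);

Console.WriteLine(log[0]); // call [phone]
Console.WriteLine(log[1]); // call 555-0100
```

The two pipelines return the same response, but only the first keeps the phone number out of the log.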
Handle Both Sync and Streaming
When using .Use(middleware, null), the same middleware handles both regular and streaming requests, so implement it with both paths in mind.
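One way to picture a shared middleware is an adapter that derives the streaming call from the non-streaming pipeline. The sketch below assumes that model purely for illustration; the framework's actual behavior should be checked against its documentation. `Handler`, `StreamFrom`, and `upperCase` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Handler = System.Func<string, System.Threading.CancellationToken, System.Threading.Tasks.Task<string>>;

// Hypothetical stand-in for the agent.
Handler agent = (input, ct) => Task.FromResult($"echo {input}");

// Middleware that post-processes the complete response.
Func<Handler, Handler> upperCase = next => async (input, ct) =>
    (await next(input, ct)).ToUpperInvariant();

// Adapter: serves a streaming call from the non-streaming pipeline.
// The whole response is buffered before the first chunk is yielded.
async IAsyncEnumerable<string> StreamFrom(
    Handler handler, string input, [EnumeratorCancellation] CancellationToken ct = default)
{
    var full = await handler(input, ct);
    foreach (var word in full.Split(' '))
    {
        ct.ThrowIfCancellationRequested();
        yield return word;
    }
}

var pipeline = upperCase(agent);
var text = await pipeline("hello world", CancellationToken.None);

var chunks = new List<string>();
await foreach (var chunk in StreamFrom(pipeline, "hello world"))
    chunks.Add(chunk);

Console.WriteLine(text);                     // ECHO HELLO WORLD
Console.WriteLine(string.Join("|", chunks)); // ECHO|HELLO|WORLD
```

Under this model the caller still receives chunks, but nothing is emitted until the full response is available, which is worth keeping in mind for latency-sensitive streaming.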
Preserve Message Integrity
When modifying messages, preserve important metadata.
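The idea can be sketched with a copy-and-modify step that changes only the text and carries every other field over. The message shape here (`Role`, `Text`, `AuthorName`, `MessageId`) is a hypothetical stand-in modeled with an anonymous type, not the SDK's message class.

```csharp
using System;

// Hypothetical message shape used only for this illustration.
var original = new
{
    Role = "user",
    Text = "My SSN is 123-45-6789",
    AuthorName = "alice",
    MessageId = "msg-1"
};

// Copy with one property changed; every other field is carried over.
var redacted = original with { Text = "My SSN is [redacted]" };

Console.WriteLine(redacted.Text);       // My SSN is [redacted]
Console.WriteLine(redacted.AuthorName); // alice
Console.WriteLine(redacted.MessageId);  // msg-1
```

Building a fresh message from just the role and the new text would silently drop the author and ID, which downstream components may rely on.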
Use Cancellation Tokens
Always pass cancellation tokens through the middleware chain.
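A minimal sketch of token propagation follows: the middleware forwards the caller's token to the next handler, so a cancellation requested by the caller reaches a slow inner call. `Handler`, `slowAgent`, and `passThrough` are hypothetical names, not SDK types.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Handler = System.Func<string, System.Threading.CancellationToken, System.Threading.Tasks.Task<string>>;

// Hypothetical slow agent that honors the token it is given.
Handler slowAgent = async (input, ct) =>
{
    await Task.Delay(TimeSpan.FromSeconds(30), ct);
    return "done";
};

Func<Handler, Handler> passThrough = next => async (input, ct) =>
{
    ct.ThrowIfCancellationRequested();
    // Forward the caller's token rather than CancellationToken.None.
    return await next(input, ct);
};

// Cancel automatically after 50 ms.
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(50));
var cancelled = false;
try
{
    await passThrough(slowAgent)("hi", cts.Token);
}
catch (OperationCanceledException)
{
    cancelled = true;
}

Console.WriteLine(cancelled);
```

If the middleware passed `CancellationToken.None` instead, the call would run for the full 30 seconds regardless of what the caller requested.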
Log at Appropriate Level
Choose the right middleware layer:
- Chat Client: Low-level provider interactions, token usage
- Agent Run: High-level agent behavior, conversation flow
- Function: Individual tool execution, parameter validation
Avoid Heavy Processing
Middleware runs on every request. Keep processing lightweight or use async operations.
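One possible way to keep the request path light is to hand expensive work to a background consumer. The sketch below does this with `System.Threading.Channels`; `Handler` and `audit` are hypothetical names, and the uppercase conversion merely stands in for real heavy processing.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Handler = System.Func<string, System.Threading.CancellationToken, System.Threading.Tasks.Task<string>>;

var queue = Channel.CreateUnbounded<string>();
var processed = new List<string>();

// Background consumer: does the expensive work off the request path.
var consumer = Task.Run(async () =>
{
    await foreach (var item in queue.Reader.ReadAllAsync())
        processed.Add(item.ToUpperInvariant()); // stand-in for heavy processing
});

// Hypothetical stand-in for the agent.
Handler agent = (input, ct) => Task.FromResult($"reply to {input}");

// Middleware only enqueues a record and returns immediately.
Func<Handler, Handler> audit = next => async (input, ct) =>
{
    var response = await next(input, ct);
    queue.Writer.TryWrite($"{input} -> {response}");
    return response;
};

var pipeline = audit(agent);
await pipeline("a", CancellationToken.None);
await pipeline("b", CancellationToken.None);

// Drain the queue before reading results (for example, at shutdown).
queue.Writer.Complete();
await consumer;

Console.WriteLine(processed.Count); // 2
```

An unbounded channel is used here for brevity; a bounded channel would add back-pressure if the consumer falls behind.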
Common Middleware Patterns
Rate Limiting
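A minimal sketch of the pattern, assuming a fixed-window counter and an injectable clock so the behavior can be checked deterministically. `Handler` and `RateLimit` are hypothetical names; a production setup might prefer a dedicated rate-limiting library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Handler = System.Func<string, System.Threading.CancellationToken, System.Threading.Tasks.Task<string>>;

// Fixed-window limiter: at most maxCalls per window, measured by the supplied clock.
Func<Handler, Handler> RateLimit(int maxCalls, TimeSpan window, Func<DateTimeOffset> now)
{
    var gate = new object();
    var windowStart = now();
    var count = 0;
    return next => async (input, ct) =>
    {
        lock (gate)
        {
            var t = now();
            if (t - windowStart >= window) { windowStart = t; count = 0; }
            if (count >= maxCalls)
                throw new InvalidOperationException("Rate limit exceeded");
            count++;
        }
        return await next(input, ct);
    };
}

// Hypothetical stand-in for the agent.
Handler agent = (input, ct) => Task.FromResult($"ok:{input}");

var clock = DateTimeOffset.UnixEpoch; // fake clock advanced manually below
var limited = RateLimit(2, TimeSpan.FromSeconds(1), () => clock)(agent);

await limited("a", CancellationToken.None);
await limited("b", CancellationToken.None);

var rejected = false;
try { await limited("c", CancellationToken.None); }
catch (InvalidOperationException) { rejected = true; }

clock += TimeSpan.FromSeconds(1); // next window opens
var afterWindow = await limited("d", CancellationToken.None);

Console.WriteLine(rejected);    // True
Console.WriteLine(afterWindow); // ok:d
```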
Retry Logic
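The pattern can be sketched as a wrapper that retries transient failures with exponential backoff. Here `TimeoutException` stands in for whatever the provider throws on a transient error, and `Handler`, `Retry`, and `flaky` are hypothetical names for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Handler = System.Func<string, System.Threading.CancellationToken, System.Threading.Tasks.Task<string>>;

// Retries transient failures, doubling the delay after each attempt.
Func<Handler, Handler> Retry(int maxAttempts, TimeSpan baseDelay)
{
    return next => async (input, ct) =>
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await next(input, ct);
            }
            catch (TimeoutException) when (attempt < maxAttempts)
            {
                // The last attempt is not caught, so its exception propagates.
                await Task.Delay(baseDelay * Math.Pow(2, attempt - 1), ct);
            }
        }
    };
}

// Hypothetical agent that fails twice, then succeeds.
var calls = 0;
Handler flaky = (input, ct) =>
{
    calls++;
    return calls < 3
        ? Task.FromException<string>(new TimeoutException("transient"))
        : Task.FromResult("ok");
};

var pipeline = Retry(maxAttempts: 3, baseDelay: TimeSpan.FromMilliseconds(1))(flaky);
var result = await pipeline("hi", CancellationToken.None);

Console.WriteLine(result); // ok
Console.WriteLine(calls);  // 3
```

Retrying is only safe when repeating the call has no unwanted side effects, so it is usually applied around the model call rather than around tools that mutate state.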
Metrics Collection
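A minimal sketch of the pattern, assuming plain in-memory counters rather than a real metrics backend. `Handler`, `metrics`, and the counter variables are hypothetical names; an actual deployment would more likely publish these values through OpenTelemetry or `System.Diagnostics.Metrics`.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Handler = System.Func<string, System.Threading.CancellationToken, System.Threading.Tasks.Task<string>>;

int calls = 0, failures = 0;
long totalMs = 0;

// Counts calls and failures and accumulates elapsed time.
Func<Handler, Handler> metrics = next => async (input, ct) =>
{
    var stopwatch = Stopwatch.StartNew();
    try
    {
        return await next(input, ct);
    }
    catch
    {
        Interlocked.Increment(ref failures);
        throw; // never swallow the error, only record it
    }
    finally
    {
        Interlocked.Increment(ref calls);
        Interlocked.Add(ref totalMs, stopwatch.ElapsedMilliseconds);
    }
};

// Hypothetical agents: one succeeds, one fails.
Handler healthy = (input, ct) => Task.FromResult("ok");
Handler broken = (input, ct) =>
    Task.FromException<string>(new InvalidOperationException("boom"));

await metrics(healthy)("a", CancellationToken.None);
try { await metrics(broken)("b", CancellationToken.None); }
catch (InvalidOperationException) { /* expected for the demo */ }

Console.WriteLine($"calls={calls} failures={failures}"); // calls=2 failures=1
```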
Next Steps
- Observability: Monitor agents with OpenTelemetry
- Tools: Combine middleware with function tools
- Memory: Add memory with AIContextProvider
- Workflows: Use middleware in workflow executors