The autogen_ext package provides model clients, integrations, and extensions for AutoGen.
Model Clients
OpenAIChatCompletionClient
Client for OpenAI chat completion models.
Model identifier (e.g., "gpt-4o", "gpt-4-turbo", "gpt-3.5-turbo")
OpenAI API key (defaults to OPENAI_API_KEY env var)
Custom API base URL
Sampling temperature (0.0 to 2.0)
Maximum tokens in response
Nucleus sampling parameter
Request timeout in seconds
OpenAI organization ID
Methods
create(): Generate a chat completion. Returns: CreateResult
create_stream(): Stream chat completion chunks
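A minimal usage sketch (assumes the autogen-ext package with the openai extra is installed and OPENAI_API_KEY is set; the model name and parameter values are examples):

```python
import asyncio

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    # Construct the client; parameters mirror the fields listed above.
    client = OpenAIChatCompletionClient(
        model="gpt-4o",      # model identifier
        temperature=0.7,     # sampling temperature (0.0 to 2.0)
        max_tokens=256,      # maximum tokens in response
    )
    # create() performs one completion; create_stream() yields chunks instead.
    result = await client.create([UserMessage(content="Hello!", source="user")])
    print(result.content)
    await client.close()


asyncio.run(main())
```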
AzureOpenAIChatCompletionClient
Client for Azure OpenAI Service.
Model identifier
Azure OpenAI resource endpoint
Azure OpenAI API key (or use Azure AD auth)
Azure OpenAI API version
Deployment name in Azure
Azure Active Directory token
Function to provide Azure AD tokens
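A construction sketch using Azure AD authentication instead of an API key (assumes autogen-ext with the azure extra plus the azure-identity package; the endpoint, deployment name, and API version are placeholder values):

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient

# Token provider that fetches Azure AD tokens for the Cognitive Services scope.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAIChatCompletionClient(
    model="gpt-4o",                                          # model identifier
    azure_deployment="my-deployment",                        # deployment name (placeholder)
    azure_endpoint="https://my-resource.openai.azure.com/",  # resource endpoint (placeholder)
    api_version="2024-06-01",                                # API version (example)
    azure_ad_token_provider=token_provider,                  # AAD auth instead of api_key
)
```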
AnthropicChatCompletionClient
Client for Anthropic Claude models.
OllamaChatCompletionClient
Client for local Ollama models.
Ollama model name
Ollama server URL (default: "http://localhost:11434")
Sampling temperature
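A construction sketch for a local model (assumes autogen-ext with the ollama extra and a running Ollama server; the model name, and the exact parameter names, are examples rather than confirmed API):

```python
from autogen_ext.models.ollama import OllamaChatCompletionClient

# Connects to a locally running Ollama server; no API key is required.
client = OllamaChatCompletionClient(
    model="llama3.2",                # Ollama model name (example)
    host="http://localhost:11434",   # default server URL
)
```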
SemanticKernelChatCompletionClient
Client that bridges Semantic Kernel chat completion connectors into AutoGen.
LlamaCppChatCompletionClient
Client for models run locally via llama.cpp.
ReplayChatCompletionClient
Client that replays recorded responses for testing.
Pre-recorded responses to replay in order
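The replay idea can be sketched in plain Python: a fake client that returns pre-recorded responses in order, which makes tests deterministic without calling a real model. The names below are illustrative, not the actual autogen_ext API.

```python
class ReplayClient:
    """Returns pre-recorded responses in order (sketch of the replay pattern)."""

    def __init__(self, responses):
        self._responses = list(responses)
        self._index = 0

    def create(self, messages):
        # Return the next recorded response; fail loudly when exhausted.
        if self._index >= len(self._responses):
            raise RuntimeError("No more recorded responses to replay")
        response = self._responses[self._index]
        self._index += 1
        return response


client = ReplayClient(["Hello!", "Paris"])
print(client.create([{"role": "user", "content": "Hi"}]))                   # → Hello!
print(client.create([{"role": "user", "content": "Capital of France?"}]))  # → Paris
```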
Model Configuration
OpenAIClientConfiguration
Configuration dataclass for OpenAI clients.
AzureOpenAIClientConfiguration
Configuration for Azure OpenAI clients.
Caching
Code Execution
DockerCommandLineCodeExecutor
Executes code inside an isolated Docker container.
Tools & Extensions
WebSearchTool
Tool for web searching.
FileTools
Tools for file operations.
Memory Extensions
VectorMemory
Vector-based semantic memory.
RedisMemory
Redis-backed persistent memory.
Runtimes & Deployment
GrpcAgentRuntime
Distributed runtime using gRPC.
CloudRuntime
Cloud-based runtime for distributed agents.
Utilities
RateLimiter
Rate limiting for API calls.
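The idea behind rate limiting API calls can be sketched as a token bucket: requests spend tokens, tokens refill at a fixed rate, and bursts are capped by the bucket's capacity. This is an illustrative sketch, not the actual autogen_ext implementation.

```python
import time


class TokenBucket:
    """Token-bucket rate limiter sketch: allows bursts up to `capacity`,
    sustained throughput of `rate` requests per second."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def try_acquire(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=10.0, capacity=2)
print([bucket.try_acquire() for _ in range(3)])  # → [True, True, False]
```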
TokenCounter
Count tokens for cost estimation.
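Token counting for cost estimation can be approximated with the common rule of thumb of roughly four characters per token for English text. This heuristic is a sketch only; an accurate counter uses the model's own tokenizer (e.g., tiktoken for OpenAI models).

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)


def estimate_cost(text: str, usd_per_1k_tokens: float) -> float:
    """Estimated cost of sending `text`, given a per-1k-token price."""
    return estimate_tokens(text) / 1000 * usd_per_1k_tokens


prompt = "Summarize the quarterly report in three bullet points."
print(estimate_tokens(prompt))  # → 13
```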
Configuration Models
All model clients support declarative configuration through the component system.
See Also
- autogen_core - Core runtime and messaging
- autogen_agentchat - High-level agent framework
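As an example of the declarative configuration mentioned under Configuration Models, a component-style config typically pairs a provider path with keyword arguments. A hypothetical fragment (field values are illustrative):

```python
# Declarative component config for a model client (values are examples).
config = {
    "provider": "autogen_ext.models.openai.OpenAIChatCompletionClient",
    "config": {
        "model": "gpt-4o",       # model identifier
        "temperature": 0.7,      # sampling temperature
    },
}
print(config["provider"])
```

A client could then be rehydrated from such a dict via the component-loading machinery in autogen_core.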