Agents are the primary abstraction in Microsoft Agent Framework for building AI-powered conversational experiences. An agent combines a language model, optional tools (functions), instructions, and configuration into a reusable component that can process user messages and generate responses.

What is an Agent?

An agent encapsulates:
  • A language model client (e.g., Azure OpenAI, Anthropic, Ollama)
  • Instructions that define the agent’s behavior and personality
  • Tools (functions) the agent can call to access external data or perform actions
  • Middleware for intercepting and modifying requests/responses
  • Session management for multi-turn conversations
In Python, agents implement the SupportsAgentRun protocol and the framework provides:
  • Agent - Main agent class with middleware and telemetry
  • RawAgent - Core agent without built-in layers
  • BaseAgent - Minimal base class for custom implementations
The Agent class is the recommended starting point for most use cases.

Creating Your First Agent

import asyncio
import os

from agent_framework.azure import AzureOpenAIResponsesClient
from azure.identity import AzureCliCredential


async def main() -> None:
    # Create a client
    credential = AzureCliCredential()
    client = AzureOpenAIResponsesClient(
        project_endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"],
        deployment_name=os.environ["AZURE_OPENAI_RESPONSES_DEPLOYMENT_NAME"],
        credential=credential,
    )

    # Create an agent
    agent = client.as_agent(
        name="HelloAgent",
        instructions="You are a friendly assistant. Keep your answers brief.",
    )

    # Run the agent
    result = await agent.run("What is the capital of France?")
    print(result.text)


asyncio.run(main())

The remaining examples on this page assume you are already inside an async context with a client and agent available.

Agent Lifecycle

Non-Streaming Mode

# Get complete response at once
response = await agent.run("Hello, how are you?")
print(response.text)  # Full response text
print(response.messages)  # All messages in response
print(response.usage_details)  # Token usage information
The run() method returns an AgentResponse containing:
  • messages - List of response messages
  • text - Concatenated text from all messages
  • response_id - Unique identifier for this response
  • usage_details - Token usage and cost information
  • value - Structured output (if using response format)
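To make the relationship between messages and text concrete, here is a minimal sketch using a stand-in message type (not the framework's actual classes): text is simply the concatenation of the text of each message.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Stand-in for a response message; only the text field matters here."""

    text: str


# Stand-in for AgentResponse.messages
messages = [FakeMessage("Paris is the capital "), FakeMessage("of France.")]

# AgentResponse.text is the concatenated text from all messages
text = "".join(m.text for m in messages)
print(text)  # Paris is the capital of France.
```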

Streaming Mode

# Stream tokens as they are generated
async for chunk in agent.run("Tell me a story", stream=True):
    if chunk.text:
        print(chunk.text, end="", flush=True)

# Or get final response after streaming
stream = agent.run("Tell me a story", stream=True)
async for chunk in stream:
    print(chunk.text, end="")
final_response = await stream.get_final_response()
print(f"\n\nTotal tokens: {final_response.usage_details}")
Streaming returns a ResponseStream that yields AgentResponseUpdate objects and provides get_final_response() to retrieve the complete response.

Agent Configuration

Instructions

Instructions define the agent’s behavior, personality, and capabilities:
agent = client.as_agent(
    name="CustomerSupportAgent",
    instructions="""You are a customer support agent.
    - Be empathetic and professional
    - Always offer to escalate complex issues
    - Keep responses under 3 sentences
    """,
)

Model Parameters

from agent_framework import Agent

agent = Agent(
    client=client,
    name="CreativeWriter",
    instructions="You are a creative writing assistant.",
    temperature=0.9,  # Higher creativity
    max_tokens=2000,
    top_p=0.95,
)

# Or override at runtime
response = await agent.run(
    "Write a short story",
    options={"temperature": 1.0, "max_tokens": 500}
)
Available options:
  • temperature (0.0-2.0) - Controls randomness
  • max_tokens - Maximum response length
  • top_p - Nucleus sampling parameter
  • frequency_penalty - Reduces repetition
  • presence_penalty - Encourages topic diversity
  • seed - Reproducible outputs
  • stop - Stop sequences

Agent Protocol

The SupportsAgentRun protocol defines the agent interface:
from agent_framework import SupportsAgentRun

class CustomAgent:
    # Give the attributes defaults so they exist on instances;
    # bare annotations alone would not satisfy an isinstance() check.
    id: str = "custom-agent"
    name: str | None = None
    description: str | None = None

    async def run(self, messages=None, *, stream=False, session=None, **kwargs):
        # Your implementation
        ...

    def create_session(self, **kwargs):
        # Your implementation
        ...

    def get_session(self, *, service_session_id, **kwargs):
        # Your implementation
        ...

# Verify compliance
agent = CustomAgent()
assert isinstance(agent, SupportsAgentRun)
This protocol uses structural subtyping (duck typing), so classes don’t need to explicitly inherit from it.
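Structural subtyping can be demonstrated with the standard library alone. The sketch below uses a simplified, hypothetical protocol (not the framework's `SupportsAgentRun` definition) to show how a class conforms without inheriting:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsGreet(Protocol):
    """A toy protocol: any class with a matching greet() method conforms."""

    def greet(self, name: str) -> str: ...


class Greeter:
    # No inheritance from SupportsGreet -- conformance is purely structural
    def greet(self, name: str) -> str:
        return f"Hello, {name}!"


greeter = Greeter()
# @runtime_checkable enables isinstance() checks against the protocol
assert isinstance(greeter, SupportsGreet)
print(greeter.greet("world"))  # Hello, world!
```

This is the same mechanism the framework relies on: any object with the right attributes and methods passes the `isinstance` check, whether or not it names the protocol.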

Agent Properties

agent = Agent(
    id="custom-agent-123",  # Optional, auto-generated if not provided
    name="ResearchAgent",
    description="An agent specialized in research tasks",
    client=client,
    additional_properties={"version": "1.0", "team": "research"},
)

# Access properties
print(agent.id)  # "custom-agent-123"
print(agent.name)  # "ResearchAgent"
print(agent.description)  # "An agent specialized in research tasks"
print(agent.additional_properties["version"])  # "1.0"

Converting Agents to Tools

Agents can be converted to tools for use by other agents:
# Create a specialized agent
research_agent = client.as_agent(
    name="ResearchAgent",
    description="Performs web research and fact-checking",
    tools=web_search,  # a function tool defined elsewhere
)

# Convert to a tool
research_tool = research_agent.as_tool(
    arg_name="task",
    arg_description="Research task to perform",
)

# Use in another agent
coordinator = client.as_agent(
    name="Coordinator",
    instructions="Delegate research tasks to the research agent.",
    tools=research_tool,
)

# Session propagation
research_tool = research_agent.as_tool(
    propagate_session=True  # Share session with parent agent
)

Best Practices

Agent Design Tips
  1. Clear Instructions: Be specific about the agent’s role, tone, and constraints
  2. Tool Selection: Only provide tools the agent actually needs
  3. Session Management: Reuse sessions for multi-turn conversations
  4. Error Handling: Always handle potential exceptions in production
  5. Observability: Enable telemetry to monitor agent behavior
  6. Cost Control: Set max_tokens to prevent runaway costs
Production Considerations
  • Always use approval_mode="always_require" for tools in production
  • Implement proper authentication and authorization
  • Monitor token usage and costs
  • Add rate limiting and timeout handling
  • Test agent behavior with edge cases
  • Use middleware for security checks
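The timeout and retry advice above can be sketched with plain asyncio. Everything here is illustrative: `fake_agent_run` is a stand-in coroutine, not the framework's `agent.run`, and the wrapper is one possible shape for production guardrails, not a framework API.

```python
import asyncio


async def fake_agent_run(prompt: str) -> str:
    # Stand-in for an agent call; a real call would hit the model service
    await asyncio.sleep(0.01)
    return f"response to: {prompt}"


async def run_with_guardrails(prompt: str, timeout: float = 30.0, retries: int = 2) -> str:
    """Wrap an agent call with a timeout and a simple retry loop."""
    last_error: Exception | None = None
    for attempt in range(retries + 1):
        try:
            return await asyncio.wait_for(fake_agent_run(prompt), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc  # log, then back off exponentially before retrying
            await asyncio.sleep(2**attempt * 0.1)
    raise RuntimeError(f"agent call failed after {retries + 1} attempts") from last_error


print(asyncio.run(run_with_guardrails("Hello")))
```

In production you would also cap total retry time, surface partial failures to callers, and record each attempt through your telemetry middleware.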

Next Steps

  • Tools - Learn how to give agents function-calling capabilities
  • Sessions - Manage multi-turn conversations with sessions
  • Middleware - Intercept and modify agent behavior with middleware
  • Observability - Monitor agent performance with OpenTelemetry
