Threads, Runs, and Messages

Foundry Agent Service uses persistent threads, runs, and messages to manage conversation state and agent execution. Understanding how these components fit together is essential for building effective agents.

Core Components

Agent

A configurable orchestration component that:

Uses AI models with instructions and tools
Processes messages in threads
Maintains conversation context
Enforces safety and governance controls

Thread

A conversation session between an agent and a user:

Stores messages (up to 100,000 per thread)
Automatically handles context truncation
Persists until explicitly deleted
Maintains conversation history

Message

Individual communication within a thread:

Created by agents or users
Can include text, images, and files
Stored in ordered list format
Supports attachments

Run

An invocation of an agent on a thread:

Processes all messages in the thread
May append new messages (agent responses)
Calls models and tools as needed
Tracks execution status
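
The relationships among these components can be pictured with a small in-memory model. This is only an illustrative sketch: the class and field names below are hypothetical and are not the SDK's types, and the real service stores threads and messages server-side and calls a model where this sketch calls a plain function.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Message:
    role: str      # "user" or "assistant"
    content: str


@dataclass
class Thread:
    messages: List[Message] = field(default_factory=list)

    def add(self, role: str, content: str) -> Message:
        msg = Message(role, content)
        self.messages.append(msg)  # messages keep insertion order
        return msg


@dataclass
class Run:
    thread: Thread
    status: str = "queued"

    def execute(self, reply_fn: Callable[[List[Message]], str]) -> None:
        """Read the whole thread, then append the agent's reply to it."""
        self.status = "in_progress"
        reply = reply_fn(self.thread.messages)
        self.thread.add("assistant", reply)
        self.status = "completed"


thread = Thread()
thread.add("user", "Hello! Can you help me?")

run = Run(thread)
# Stand-in for the model call: report how many messages the run saw.
run.execute(lambda msgs: f"I can see {len(msgs)} message(s).")

print(run.status, [m.role for m in thread.messages])
```

The point of the sketch is the ownership: a thread owns its messages, and a run reads from and writes back to exactly one thread.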

Agent Workflow

Create Agent

Define agent with model, instructions, and tools

Create Thread

Create conversation session (reuse for ongoing conversations)

Send Messages

Add user messages to the thread

Run Agent

Execute agent to process messages

Monitor Status

Poll run status until completion

Get Response

Retrieve agent’s messages from thread

Run Status Values

Status	Description
`queued`	Run is waiting to be processed
`in_progress`	Agent is actively processing
`requires_action`	Agent needs function call results
`completed`	Run finished successfully
`failed`	Run encountered an error
`cancelled`	Run was cancelled by user
`expired`	Run exceeded time limits (10 min)
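
When writing polling code, it can help to group these statuses by what the caller should do next. The helper below is a minimal sketch built only from the status names in the table; the function name and the three action labels are hypothetical.

```python
ACTIVE_STATUSES = frozenset({"queued", "in_progress"})
NEEDS_INPUT_STATUSES = frozenset({"requires_action"})
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled", "expired"})


def next_action(status: str) -> str:
    """Map a run status to what the calling code should do next."""
    if status in ACTIVE_STATUSES:
        return "keep_polling"
    if status in NEEDS_INPUT_STATUSES:
        return "submit_function_results"
    if status in TERMINAL_STATUSES:
        return "stop"
    raise ValueError(f"Unknown run status: {status!r}")


for status in ("queued", "requires_action", "expired"):
    print(status, "->", next_action(status))
```

Note that a loop that polls only while the status is `queued` or `in_progress` also exits on `requires_action`, so the caller should check for that status explicitly rather than treat it as a finished run.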

Code Examples

Basic Agent Execution

from azure.ai.projects import AIProjectClient
from azure.identity import DefaultAzureCredential
import time

project = AIProjectClient(
    endpoint="https://<resource>.services.ai.azure.com/api/projects/<project>",
    credential=DefaultAzureCredential()
)

# Create agent
agent = project.agents.create_agent(
    model="gpt-4o",
    name="my-agent",
    instructions="You are a helpful assistant"
)

# Create thread
thread = project.agents.threads.create()

# Add message
message = project.agents.messages.create(
    thread_id=thread.id,
    role="user",
    content="Hello! Can you help me?"
)

# Create and monitor run
run = project.agents.runs.create(thread_id=thread.id, agent_id=agent.id)

while run.status in ["queued", "in_progress"]:
    time.sleep(1)
    run = project.agents.runs.get(thread_id=thread.id, run_id=run.id)

print(f"Run status: {run.status}")

# Get messages
if run.status == "completed":
    messages = project.agents.messages.list(thread_id=thread.id)
    for msg in messages:
        print(f"{msg['role']}: {msg['content']}")

# Cleanup
project.agents.delete_agent(agent.id)
project.agents.threads.delete(thread.id)

Using create_and_poll

# Simpler alternative that handles polling
run = project.agents.runs.create_and_poll(
    thread_id=thread.id,
    agent_id=agent.id
)

if run.status == "completed":
    messages = project.agents.messages.list(thread_id=thread.id)
    print(messages)

Thread Management

When to Create New Threads

Create a new thread when:

Starting a fresh topic or conversation
The user explicitly wants to “start over”
Serving a different user (each user should have their own thread)
The thread has grown large enough to affect performance

Reuse an existing thread when:

Continuing an ongoing conversation
Maintaining conversation context
Building on previous interactions
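
One way to apply these rules is a small registry that hands each user their own thread and replaces it on request. This is a sketch under assumptions: `ThreadRegistry` and its methods are hypothetical names, and the thread factory is a stand-in so the example runs without a service connection.

```python
from itertools import count
from typing import Callable, Dict


class ThreadRegistry:
    """Maps each user to one thread ID, creating threads on demand."""

    def __init__(self, create_thread: Callable[[], str]) -> None:
        self._create_thread = create_thread
        self._by_user: Dict[str, str] = {}

    def get(self, user_id: str) -> str:
        # Reuse the user's thread if one exists; otherwise create it.
        if user_id not in self._by_user:
            self._by_user[user_id] = self._create_thread()
        return self._by_user[user_id]

    def start_over(self, user_id: str) -> str:
        # Replace the user's thread with a fresh one.
        self._by_user[user_id] = self._create_thread()
        return self._by_user[user_id]


# Fake factory for the demo: yields "thread_1", "thread_2", ...
_ids = count(1)
registry = ThreadRegistry(lambda: f"thread_{next(_ids)}")

alice_first = registry.get("alice")
alice_again = registry.get("alice")       # same conversation continues
bob_thread = registry.get("bob")          # different user, different thread
alice_new = registry.start_over("alice")  # user asked to start over

print(alice_first, alice_again, bob_thread, alice_new)
```

With the client from the code examples, the factory could be `lambda: project.agents.threads.create().id`. A production version would also need to persist the mapping and decide when to delete the replaced thread.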

Thread Lifecycle

Threads persist until explicitly deleted:

# Delete thread when no longer needed
project.agents.threads.delete(thread_id=thread.id)

Storage considerations:

Threads with many messages consume storage
Plan retention strategy based on:
- Storage costs
- Compliance requirements
- Business needs
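
A retention strategy often reduces to "delete threads that have been idle longer than some window". The sketch below assumes the application records a last-activity timestamp for each thread ID itself; the function name, the record layout, and the 14-day window are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def threads_to_delete(
    last_active: Dict[str, datetime],
    max_age: timedelta,
    now: datetime,
) -> List[str]:
    """Return the IDs of threads idle for longer than max_age, sorted."""
    cutoff = now - max_age
    return sorted(tid for tid, ts in last_active.items() if ts < cutoff)


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
last_active = {
    "thread_a": datetime(2025, 1, 1, tzinfo=timezone.utc),
    "thread_b": datetime(2025, 1, 25, tzinfo=timezone.utc),
    "thread_c": datetime(2024, 12, 15, tzinfo=timezone.utc),
}

stale = threads_to_delete(last_active, timedelta(days=14), now)
print(stale)
```

Each returned ID could then be passed to `project.agents.threads.delete(...)` as shown above. Compliance requirements may call for a longer window or for archiving messages before deletion.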

Thread Limits

Maximum 100,000 messages per thread
Automatic context truncation when needed
Performance may degrade with thousands of messages
Consider creating new threads for long conversations
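
These limits can be turned into a simple check that runs before each new message is added. In this sketch the hard limit of 100,000 comes from the limit stated above, while the soft limit of 1,000 and the names `thread_health`, `"ok"`, `"rollover"`, and `"full"` are hypothetical choices to tune per application.

```python
HARD_LIMIT = 100_000  # maximum messages per thread
SOFT_LIMIT = 1_000    # assumed point at which a new thread is preferable


def thread_health(message_count: int,
                  soft_limit: int = SOFT_LIMIT,
                  hard_limit: int = HARD_LIMIT) -> str:
    """Classify a thread by how close it is to the message limits."""
    if message_count >= hard_limit:
        return "full"       # no further messages can be stored
    if message_count >= soft_limit:
        return "rollover"   # consider continuing in a new thread
    return "ok"


for n in (12, 1_000, 100_000):
    print(n, thread_health(n))
```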

Best Practices

Clean Up Resources

Delete threads and agents when no longer needed:

# Delete agent
project.agents.delete_agent(agent.id)

# Delete thread
project.agents.threads.delete(thread.id)

Handle Errors Gracefully

Always check run status and implement retry logic:

import time

max_retries = 3
for attempt in range(max_retries):
    run = project.agents.runs.create(thread_id=thread.id, agent_id=agent.id)
    
    while run.status in ["queued", "in_progress"]:
        time.sleep(1)
        run = project.agents.runs.get(thread_id=thread.id, run_id=run.id)
    
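    if run.status == "completed":
        break
    if run.status != "failed":
        # cancelled, expired, or requires_action: starting another run will not help
        print(f"Run ended with status: {run.status}")
        break
    if attempt < max_retries - 1:
        print(f"Run failed, retrying... (attempt {attempt + 1})")
        time.sleep(2 ** attempt)  # Exponential backoff: 1s, then 2s
    else:
        print(f"Run failed after {max_retries} attempts")
        # Handle failure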

Use Appropriate Polling Intervals

Start with short intervals and lengthen them for longer-running operations:

import time

delay = 0.5  # Start with 500ms
max_delay = 5  # Cap at 5 seconds

while run.status in ["queued", "in_progress"]:
    time.sleep(delay)
    run = project.agents.runs.get(thread_id=thread.id, run_id=run.id)
    
    # Increase delay for next poll
    delay = min(delay * 1.5, max_delay)
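
The same backoff schedule can be packaged as a reusable helper with an overall time budget. This is a sketch under assumptions: `backoff_delays` and `poll_until_done` are hypothetical helpers, the status source is any zero-argument callable, and the 600-second default mirrors the 10-minute expiry in the status table. The `sleep` parameter is injectable so the demo runs instantly.

```python
import time
from typing import Callable, Iterator, List


def backoff_delays(start: float = 0.5, factor: float = 1.5,
                   cap: float = 5.0) -> Iterator[float]:
    """Yield start, start*factor, ... with each delay capped at `cap`."""
    delay = start
    while True:
        yield delay
        delay = min(delay * factor, cap)


def poll_until_done(fetch_status: Callable[[], str],
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the status is no longer queued or in_progress."""
    waited = 0.0  # counts time spent sleeping, not time spent fetching
    for delay in backoff_delays():
        status = fetch_status()
        if status not in ("queued", "in_progress"):
            return status
        if waited + delay > timeout:
            raise TimeoutError(f"Run still {status} after {waited:.1f}s")
        sleep(delay)
        waited += delay
    raise AssertionError("unreachable: backoff_delays never ends")


# Demo with a scripted status sequence and a sleep that only records.
statuses = iter(["queued", "in_progress", "in_progress", "completed"])
slept: List[float] = []
result = poll_until_done(lambda: next(statuses), sleep=slept.append)

print(result, slept)
```

With the client from the earlier examples, `fetch_status` could be `lambda: project.agents.runs.get(thread_id=thread.id, run_id=run.id).status`.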

Limit Message Size

Keep conversations concise for optimal performance:

Avoid extremely long messages
Summarize when threads get large
Create new threads for new topics
Monitor thread message count

Next Steps

Agent Overview

Learn about Foundry Agent Service

Environment Setup

Deploy agent infrastructure

Agent Tools

Extend agent capabilities

Quickstart

Create your first agent
