Overview

LiteLLM provides a comprehensive wrapper around the Agent-to-Agent (A2A) protocol, enabling communication with A2A-compliant agents. The A2A protocol standardizes how agents exchange messages, allowing different agent frameworks to interoperate.

Key Features

  • Automatic agent card resolution
  • Built-in retry logic for localhost URLs
  • Integrated logging and cost tracking
  • Support for both streaming and non-streaming requests
  • Compatible with multiple agent providers (LangGraph, Pydantic AI, etc.)

Installation

Install the A2A SDK alongside LiteLLM:
pip install litellm a2a-sdk

Quick Start

Step 1: Create an A2A Client

Initialize a client for your agent endpoint:
from litellm.a2a_protocol import create_a2a_client

# Create client for your agent
client = await create_a2a_client(
    base_url="http://localhost:10001",
    timeout=60.0
)
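Note that `create_a2a_client` is a coroutine, so the call above must run inside an event loop. A minimal script shape, sketched here with a placeholder coroutine standing in for the real client call:

```python
import asyncio

async def get_client():
    # Placeholder for: await create_a2a_client(base_url="http://localhost:10001", timeout=60.0)
    return "client"

async def main():
    client = await get_client()
    return client

print(asyncio.run(main()))
```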
Step 2: Send a Message

Send messages to the agent:
from litellm.a2a_protocol import asend_message
from a2a.types import SendMessageRequest, MessageSendParams
from uuid import uuid4

# Prepare request
request = SendMessageRequest(
    id=str(uuid4()),
    params=MessageSendParams(
        message={
            "role": "user",
            "parts": [{"kind": "text", "text": "Hello, agent!"}],
            "messageId": uuid4().hex,
        }
    )
)

# Send message
response = await asend_message(
    a2a_client=client,
    request=request
)

print(response.model_dump(mode='json', exclude_none=True))
Step 3: Handle Streaming Responses

Process streaming responses from agents:
from litellm.a2a_protocol import asend_message_streaming
from a2a.types import SendStreamingMessageRequest

request = SendStreamingMessageRequest(
    id=str(uuid4()),
    params=MessageSendParams(
        message={
            "role": "user",
            "parts": [{"kind": "text", "text": "Stream me results"}],
            "messageId": uuid4().hex,
        }
    )
)

async for chunk in asend_message_streaming(
    a2a_client=client,
    request=request
):
    print(chunk)

Class-Based Interface

For more structured code, use the A2AClient class:
from litellm.a2a_protocol import A2AClient
from a2a.types import SendMessageRequest, MessageSendParams
from uuid import uuid4

# Initialize client
client = A2AClient(
    base_url="http://localhost:10001",
    timeout=60.0,
    extra_headers={"X-Custom-Header": "value"}
)

# Get agent information
agent_card = await client.get_agent_card()
print(f"Agent: {agent_card.name}")

# Send message
request = SendMessageRequest(
    id=str(uuid4()),
    params=MessageSendParams(
        message={
            "role": "user",
            "parts": [{"kind": "text", "text": "Hello!"}],
            "messageId": uuid4().hex,
        }
    )
)

response = await client.send_message(request)
print(response)

Completion Bridge

LiteLLM includes a “completion bridge” that exposes non-A2A providers through the A2A interface:
from litellm.a2a_protocol import asend_message
from a2a.types import SendMessageRequest, MessageSendParams
from uuid import uuid4

request = SendMessageRequest(
    id=str(uuid4()),
    params=MessageSendParams(
        message={
            "role": "user",
            "parts": [{"kind": "text", "text": "Hello!"}],
            "messageId": uuid4().hex,
        }
    )
)

# Route through LangGraph using completion bridge
response = await asend_message(
    request=request,
    api_base="http://localhost:2024",
    litellm_params={
        "custom_llm_provider": "langgraph",
        "model": "agent"
    }
)
# Route through AWS Bedrock AgentCore
response = await asend_message(
    request=request,
    litellm_params={
        "custom_llm_provider": "bedrock",
        "model": "bedrock/my-agent-alias"
    }
)

Advanced Features

Agent Card Resolution

LiteLLM automatically resolves agent cards from multiple well-known paths:
from litellm.a2a_protocol import aget_agent_card

# Fetch agent card
agent_card = await aget_agent_card(
    base_url="http://localhost:10001",
    timeout=60.0
)

print(f"Name: {agent_card.name}")
print(f"Description: {agent_card.description}")
print(f"URL: {agent_card.url}")
The resolver checks these paths in order:
  1. /agent.json (standard A2A path)
  2. /.well-known/agent.json (alternative path)
  3. / (root path with JSON content-type)
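The fallback order above can be sketched as a small helper that builds the candidate URLs; this is an illustrative sketch, and `candidate_card_urls` is a hypothetical name, not LiteLLM's internal resolver:

```python
from urllib.parse import urljoin

def candidate_card_urls(base_url: str) -> list[str]:
    """Return agent-card URLs in the order the resolver tries them."""
    base = base_url.rstrip("/") + "/"
    return [
        urljoin(base, "agent.json"),              # standard A2A path
        urljoin(base, ".well-known/agent.json"),  # alternative path
        base,                                     # root, expecting JSON content-type
    ]

print(candidate_card_urls("http://localhost:10001"))
```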

Localhost URL Retry Logic

Many agents deploy with localhost URLs in their agent cards. LiteLLM automatically detects and corrects this:
# Agent card contains: http://localhost:8001/endpoint
# LiteLLM detects connection failure and retries with:
# http://your-public-domain.com/endpoint

response = await asend_message(
    a2a_client=client,
    request=request
)
# Automatically retries with corrected URL
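The detect-and-rewrite step can be illustrated with a small helper; this is a sketch, and `rewrite_localhost_url` is a hypothetical name rather than LiteLLM's internal function:

```python
from urllib.parse import urlsplit, urlunsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0"}

def rewrite_localhost_url(card_url: str, base_url: str) -> str:
    """If card_url points at localhost, swap in base_url's scheme and host."""
    card, base = urlsplit(card_url), urlsplit(base_url)
    if card.hostname in LOCAL_HOSTS:
        # Keep path/query from the agent card, take scheme+netloc from base_url.
        return urlunsplit((base.scheme, base.netloc, card.path, card.query, card.fragment))
    return card_url

print(rewrite_localhost_url(
    "http://localhost:8001/endpoint",
    "http://your-public-domain.com",
))
```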

Cost Tracking

LiteLLM tracks token usage and costs for A2A agents:
# Usage is automatically calculated from request/response
response = await asend_message(
    a2a_client=client,
    request=request,
    agent_id="my-agent-123"  # For spend tracking
)

# Access hidden params including usage
print(response._hidden_params["usage"])

Custom Headers

Pass custom headers for authentication or tracing:
client = await create_a2a_client(
    base_url="http://localhost:10001",
    extra_headers={
        "X-LiteLLM-Trace-Id": "trace-123",
        "X-LiteLLM-Agent-Id": "agent-456",
        "Authorization": "Bearer your-token"
    }
)

Error Handling

from litellm.a2a_protocol import (
    asend_message,
    A2AConnectionError,
    A2ALocalhostURLError
)

try:
    response = await asend_message(
        a2a_client=client,
        request=request
    )
except A2ALocalhostURLError as e:
    print(f"Localhost URL detected: {e.localhost_url}")
    print(f"Using base URL: {e.base_url}")
    # Automatically retried
except A2AConnectionError as e:
    print(f"Connection failed: {e.message}")
    print(f"URL: {e.url}")

Proxy Integration

Use A2A agents through the LiteLLM Proxy:
config.yaml
model_list:
  - model_name: my-a2a-agent
    litellm_params:
      model: a2a_agent/agent-name
      api_base: http://localhost:10001
      custom_llm_provider: a2a_agent
Then call via the proxy:
import openai

client = openai.OpenAI(
    api_key="your-proxy-key",
    base_url="http://localhost:4000"
)

response = client.chat.completions.create(
    model="my-a2a-agent",
    messages=[{"role": "user", "content": "Hello!"}]
)

Best Practices

Create clients once and reuse them for multiple requests:
# Good: Reuse client
client = await create_a2a_client(base_url=url)
response1 = await asend_message(a2a_client=client, request=req1)
response2 = await asend_message(a2a_client=client, request=req2)

# Bad: Create new client each time
client1 = await create_a2a_client(base_url=url)
response1 = await asend_message(a2a_client=client1, request=req1)
client2 = await create_a2a_client(base_url=url)
response2 = await asend_message(a2a_client=client2, request=req2)
Always consume streaming responses fully:
# Good: Consume all chunks
async for chunk in asend_message_streaming(client, request):
    process(chunk)

# Bad: Breaking early can leave connections open
async for chunk in asend_message_streaming(client, request):
    if condition:
        break  # May leave connection hanging
Configure timeouts based on agent complexity:
# Short timeout for simple agents
client = A2AClient(base_url=url, timeout=30.0)

# Longer timeout for complex reasoning agents
client = A2AClient(base_url=url, timeout=300.0)

Reference

Source Code

  • A2A Protocol implementation: litellm/a2a_protocol/
  • Main functions: litellm/a2a_protocol/main.py:134
  • Client class: litellm/a2a_protocol/client.py:21
  • Exceptions: litellm/a2a_protocol/exceptions.py:12
