The Model class is the abstract base for all language model integrations in Agno. It defines the interface for invoking models, streaming responses, and handling tool calls.

Core Attributes

  • id (str, required): Model identifier (e.g., "gpt-4o", "claude-3-5-sonnet").
  • name (str, default: None): Optional display name for the model.
  • provider (str, default: None): Model provider name (e.g., "OpenAI", "Anthropic").

Configuration

  • tool_choice (str | Dict[str, Any], default: None): Controls which tool is called: "none", "auto", or a specific tool.
  • system_prompt (str, default: None): System prompt added to the agent.
  • instructions (List[str], default: None): Additional instructions added to the agent.
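
For example, tool selection and prompting can be set directly on the model. A minimal sketch, assuming these options are passed as constructor kwargs in the same way as in Example Usage below:
from agno.models.openai import OpenAIChat

model = OpenAIChat(
    id="gpt-4o",
    tool_choice="auto",  # let the model decide which tool to call
    system_prompt="You are a concise assistant.",
    instructions=["Answer in one short paragraph."],
)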

Caching

  • cache_response (bool, default: False): Cache model responses to avoid redundant API calls during development.
  • cache_ttl (int, default: None): Time-to-live for cached responses, in seconds.
  • cache_dir (str, default: None): Directory in which cache files are stored.
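
A sketch of a development-time setup that caches responses on disk for one hour (same constructor-kwarg assumption; the cache directory path is illustrative):
model = OpenAIChat(
    id="gpt-4o",
    cache_response=True,          # reuse responses for identical requests
    cache_ttl=3600,               # expire cached entries after one hour
    cache_dir="/tmp/agno_cache",  # illustrative path
)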

Retries

  • retries (int, default: 0): Number of retries to attempt when a ModelProviderError occurs.
  • delay_between_retries (int, default: 1): Delay in seconds between retries.
  • exponential_backoff (bool, default: False): If True, the delay doubles after each retry.
  • retry_with_guidance (bool, default: True): Retry with a guidance message for known, avoidable errors.
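
With exponential backoff enabled, the configuration below waits roughly 1s, 2s, then 4s between attempts. A minimal sketch, again assuming constructor kwargs:
model = OpenAIChat(
    id="gpt-4o",
    retries=3,                 # up to 3 retries on ModelProviderError
    delay_between_retries=1,   # initial delay in seconds
    exponential_backoff=True,  # 1s, 2s, 4s between attempts
)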

Methods

invoke()

Synchronous model invocation.
response = model.invoke(
    messages=messages,
    tools=tools,
    response_format=output_schema
)
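
The snippets in this section assume messages and tools already exist. A more complete sketch, assuming Agno's Message class is importable from agno.models.message (verify against your installed version):
from agno.models.message import Message
from agno.models.openai import OpenAIChat

model = OpenAIChat(id="gpt-4o")
messages = [
    Message(role="system", content="You are a helpful assistant."),
    Message(role="user", content="Summarize the Model class in one sentence."),
]
response = model.invoke(messages=messages)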

ainvoke()

Asynchronous model invocation.
response = await model.ainvoke(
    messages=messages,
    tools=tools
)

invoke_stream()

Stream model responses synchronously.
for chunk in model.invoke_stream(messages=messages):
    print(chunk.content, end="")

ainvoke_stream()

Stream model responses asynchronously.
async for chunk in model.ainvoke_stream(messages=messages):
    print(chunk.content, end="")
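
Both async variants must run inside an event loop; a minimal driver:
import asyncio

async def main():
    # Stream and print chunks as they arrive.
    async for chunk in model.ainvoke_stream(messages=messages):
        print(chunk.content, end="")

asyncio.run(main())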

count_tokens()

Count tokens in messages.
tokens = model.count_tokens(
    messages=messages,
    tools=tools,
    output_schema=schema
)
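
This is useful for checking a prompt against a context budget before paying for a call. A sketch with an illustrative (hypothetical) limit:
MAX_PROMPT_TOKENS = 8000  # illustrative budget, not a real model limit

tokens = model.count_tokens(messages=messages)
if tokens > MAX_PROMPT_TOKENS:
    # e.g., trim or summarize older messages before invoking
    raise ValueError(f"Prompt too large: {tokens} tokens")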

Supported Models

Agno provides integrations for major language model providers:
  • OpenAI: OpenAIChat, OpenAIReasoning
  • Anthropic: Claude
  • Google: Gemini
  • Groq: Groq
  • Azure: AzureOpenAIChat
  • AWS: AWSBedrock
  • Ollama: Ollama
  • xAI: xAI
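
Because every integration implements the same Model interface, providers can be swapped with a one-line change. A sketch assuming Claude follows the same agno.models.<provider> import pattern as OpenAIChat:
from agno.models.anthropic import Claude

model = Claude(id="claude-3-5-sonnet")
response = model.invoke(messages=messages)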

Example Usage

from agno.models.openai import OpenAIChat

model = OpenAIChat(
    id="gpt-4o",
    retries=3,
    cache_response=True
)
