Overview

Graphiti’s LLM client architecture provides a unified interface for interacting with multiple language model providers. All clients extend the base LLMClient class and support structured output generation using Pydantic models.

Key Features

  • Unified Interface: Single API across OpenAI, Anthropic, Gemini, and Azure OpenAI
  • Structured Output: Automatic JSON schema generation from Pydantic models
  • Automatic Retries: Built-in retry logic for transient failures
  • Token Tracking: Monitor input/output token usage across providers
  • Response Caching: Optional caching of LLM responses
  • Tracing Support: OpenTelemetry-compatible tracing
  • Multilingual Support: Automatic language preservation in extractions
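
As a quick illustration of the structured-output feature, any Pydantic model can serve as a response schema; clients derive a JSON schema from it and validate the provider's reply against that schema. A minimal sketch (the Entity model below is hypothetical, not part of Graphiti):

```python
from pydantic import BaseModel, Field

# Hypothetical response model -- any Pydantic model works the same way.
class Entity(BaseModel):
    name: str = Field(description="Canonical entity name")
    summary: str = Field(description="One-sentence description")

# Clients derive a JSON schema like this from the response_model
# and use it to validate the provider's structured output.
schema = Entity.model_json_schema()
print(sorted(schema["properties"]))  # ['name', 'summary']
```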

Base Client Architecture

All LLM clients inherit from LLMClient (defined in graphiti_core/llm_client/client.py), which provides:

Core Methods

generate_response()

Generate a structured response from the language model.

Parameters:
  • messages (list[Message], required): List of message objects with role and content fields
  • response_model (type[BaseModel] | None): Optional Pydantic model for structured output validation
  • max_tokens (int | None): Maximum tokens to generate (uses config default if not specified)
  • model_size (ModelSize, default ModelSize.medium): Size of model to use (small or medium)
  • group_id (str | None): Optional partition identifier for the graph
  • prompt_name (str | None): Optional name for tracing and token tracking

Returns: dict[str, Any] - Parsed response matching the response_model schema
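
A hedged usage sketch of the method above (the ExtractedEntities model is hypothetical, and the Message import path is an assumption for illustration; adjust to your installed version):

```python
from pydantic import BaseModel, Field

class ExtractedEntities(BaseModel):
    # Hypothetical response model for an extraction prompt.
    names: list[str] = Field(description="Entity names found in the text")

async def extract_entities(client, text: str) -> dict:
    """Call generate_response() on any LLMClient subclass."""
    # Import path is an assumption; check graphiti_core in your install.
    from graphiti_core.prompts.models import Message

    return await client.generate_response(
        messages=[Message(role="user", content=f"Extract entities from: {text}")],
        response_model=ExtractedEntities,
        prompt_name="extract_entities",  # surfaces in traces and token tracking
    )
```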

Configuration

Constructor parameters:
  • config (LLMConfig): Configuration object with the following fields:
      • api_key (str | None): Authentication key for the LLM API
      • model (str | None): Primary model name
      • small_model (str | None): Model for simpler prompts
      • base_url (str | None): Custom API endpoint
      • temperature (float): Sampling temperature (default: 1.0)
      • max_tokens (int): Maximum output tokens (default: 16384)
  • cache (bool, default false): Enable response caching (stored in ./llm_cache)
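
Putting the fields together (a sketch; the key value is a placeholder):

```python
from graphiti_core.llm_client import OpenAIClient
from graphiti_core.llm_client.config import LLMConfig

config = LLMConfig(
    api_key="sk-...",             # placeholder key
    model="gpt-4.1-mini",         # primary (medium) model
    small_model="gpt-4.1-nano",   # model for simpler prompts
    temperature=1.0,
    max_tokens=16384,
)

# cache=True stores responses in ./llm_cache
client = OpenAIClient(config=config, cache=True)
```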

Available Clients

OpenAI

GPT-4.1, GPT-5, and compatible models

Anthropic

Claude 3.7, 4.5, and Haiku models

Gemini

Gemini 2.5, 3.0 Flash, and Pro models

Azure OpenAI

OpenAI models via Azure endpoint

Error Handling

All clients implement consistent error handling:
  • RateLimitError: Rate limit exceeded (no retry)
  • RefusalError: Model refused to respond (no retry)
  • Transient Errors: Automatic retry with exponential backoff (max 4 attempts)
  • Validation Errors: Retry with error context for models to self-correct
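
The retry behavior above can be sketched generically (an illustrative pattern, not Graphiti's internal implementation; the two exception classes here are stand-ins for the real ones):

```python
import time

class RateLimitError(Exception): ...  # stand-in for the real class
class RefusalError(Exception): ...    # stand-in for the real class

def call_with_retries(fn, max_attempts=4, base_delay=1.0):
    """Retry transient failures with exponential backoff; never retry
    rate-limit or refusal errors."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except (RateLimitError, RefusalError):
            raise  # non-retryable: fail immediately
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retries exhausted
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```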

Token Usage Tracking

All clients track token usage through the token_tracker attribute:
from graphiti_core.llm_client import OpenAIClient
from graphiti_core.llm_client.config import LLMConfig

client = OpenAIClient(config=LLMConfig(api_key="sk-..."))

# After making requests
usage = client.token_tracker.get_usage()
print(f"Total tokens: {usage['total_tokens']}")
print(f"Input tokens: {usage['input_tokens']}")
print(f"Output tokens: {usage['output_tokens']}")

Tracing

Set a custom tracer for observability:
from graphiti_core.tracer import OpenTelemetryTracer

client = OpenAIClient()
client.set_tracer(OpenTelemetryTracer())
All generate_response() calls will create spans with attributes:
  • llm.provider: Provider name
  • model.size: Model size used
  • max_tokens: Token limit
  • cache.hit: Whether response was cached
  • prompt.name: Custom prompt identifier

Model Size Selection

Graphiti uses two-tier model sizing:
  • ModelSize.medium: Primary model for complex reasoning (default)
  • ModelSize.small: Faster, cheaper model for simple tasks
Configure both in LLMConfig:
config = LLMConfig(
    model="gpt-4.1-mini",        # Medium model
    small_model="gpt-4.1-nano"   # Small model
)
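
Individual calls can then opt into the cheaper tier via the model_size parameter (a sketch; the import paths are assumptions for illustration and may differ in your version):

```python
async def classify(client, text: str) -> dict:
    """Run a simple task on the small model tier."""
    # Import paths are assumptions; check graphiti_core in your install.
    from graphiti_core.llm_client.config import ModelSize
    from graphiti_core.prompts.models import Message

    return await client.generate_response(
        messages=[Message(role="user", content=f"Classify: {text}")],
        model_size=ModelSize.small,  # uses small_model from LLMConfig
    )
```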

Input Sanitization

All clients automatically clean input text:
  • Removes invalid Unicode characters
  • Strips zero-width characters (\u200b, \u200c, \u200d, \ufeff, \u2060)
  • Filters control characters (except newlines, tabs, carriage returns)
This ensures reliable processing across all providers.
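
The rules above amount to a small character filter, roughly (an illustrative re-implementation, not the library's exact code):

```python
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff", "\u2060"}

def sanitize(text: str) -> str:
    """Drop zero-width and control characters, keeping \\n, \\t, \\r."""
    out = []
    for ch in text:
        if ch in ZERO_WIDTH:
            continue  # strip zero-width characters
        if ch in "\n\t\r":
            out.append(ch)  # whitelisted whitespace
            continue
        if ord(ch) < 32 or ord(ch) == 127:
            continue  # strip C0 control chars and DEL
        out.append(ch)
    return "".join(out)

print(sanitize("a\u200bb\x00c\td"))  # -> "abc\td"
```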