The rLLM SDK is a lightweight toolkit for automatic LLM trace collection using session contexts and trajectory decorators. It lets you track, manage, and analyze LLM calls in anything from simple functions to complex multi-agent workflows.

Core Concepts

Sessions

Sessions track all LLM calls made within a context for debugging and analysis. They automatically capture traces and metadata, and give you access to the collected data.
from rllm.sdk import session, get_chat_client

llm = get_chat_client(api_key="sk-...")

# Create a session to track all LLM calls
with session(experiment="v1") as sess:
    response = llm.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}]
    )
    # Access all traces from this session
    print(f"Collected {len(sess.llm_calls)} traces")

Trajectories

Trajectories represent multi-step workflows where each LLM call becomes a step with assignable rewards. Use the @trajectory decorator to automatically convert function execution into structured trajectories.
from rllm.sdk import trajectory, get_chat_client_async

llm = get_chat_client_async(api_key="sk-...")

@trajectory(name="solver")
async def solve_math_problem(problem: str):
    # Each LLM call automatically becomes a step
    response1 = await llm.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Solve: {problem}"}]
    )
    response2 = await llm.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Is this correct?"}]
    )
    return response2.choices[0].message.content

# Returns TrajectoryView instead of string
traj = await solve_math_problem("What is 2+2?")
print(f"Steps: {len(traj.steps)}")  # 2
traj.steps[0].reward = 1.0  # Set rewards on each step
traj.reward = sum(s.reward for s in traj.steps)

Installation

The SDK is included in the rllm package:
pip install rllm
For OpenTelemetry support (distributed tracing):
pip install rllm[otel]

Quick Start

Basic Usage

from rllm.sdk import session, get_chat_client

# Initialize chat client
llm = get_chat_client(api_key="sk-...")

# Track LLM calls in a session
with session(experiment="v1", task="greeting"):
    response = llm.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Say hello"}]
    )
    print(response.choices[0].message.content)

Nested Sessions with Metadata Inheritance

Sessions can be nested, and metadata is automatically merged:
with session(experiment="v1"):
    with session(task="math"):
        # All traces get: {experiment: "v1", task: "math"}
        llm.chat.completions.create(...)

Architecture

rllm/sdk/
├── __init__.py              # Public exports
├── protocol.py              # Data models (Trace, StepView, TrajectoryView)
├── decorators.py            # @trajectory decorator
├── shortcuts.py             # session(), get_chat_client()
├── session/
│   ├── contextvar.py        # ContextVarSession (default backend)
│   ├── opentelemetry.py     # OpenTelemetrySession (W3C baggage-based)
│   ├── session_buffer.py    # SessionBuffer (ephemeral trace storage)
│   └── base.py              # SessionProtocol, wrap_with_session_context()
├── chat/
│   └── openai.py            # Tracked OpenAI chat clients
├── proxy/
│   ├── litellm_callbacks.py # TracingCallback, SamplingParametersCallback
│   ├── metadata_slug.py     # URL metadata encoding/decoding
│   └── middleware.py        # MetadataRoutingMiddleware (ASGI)
└── tracers/
    ├── memory.py            # InMemorySessionTracer
    └── sqlite.py            # SqliteTracer

Data Models

The SDK uses three primary data models:

Trace

Low-level trace from a single LLM call.
class Trace(BaseModel):
    trace_id: str
    session_name: str
    name: str
    input: LLMInput
    output: LLMOutput
    model: str
    latency_ms: float
    tokens: dict[str, int]
    metadata: dict = Field(default_factory=dict)
    timestamp: float
    parent_trace_id: str | None = None
    cost: float | None = None
    environment: str | None = None
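As a sketch of working with collected traces, the snippet below aggregates token counts and latency over a list of trace-like records. The dataclass stand-in and the token key names ("prompt", "completion") are assumptions for illustration, not the SDK's exact schema:

```python
from dataclasses import dataclass, field

# Minimal stand-in for the Trace model above (the SDK uses Pydantic).
@dataclass
class Trace:
    trace_id: str
    model: str
    latency_ms: float
    tokens: dict = field(default_factory=dict)

def summarize(traces):
    """Total token count and mean latency across a session's traces."""
    total_tokens = sum(sum(t.tokens.values()) for t in traces)
    mean_latency = sum(t.latency_ms for t in traces) / len(traces)
    return total_tokens, mean_latency

traces = [
    Trace("t1", "gpt-4", 820.0, {"prompt": 12, "completion": 40}),
    Trace("t2", "gpt-4", 980.0, {"prompt": 55, "completion": 21}),
]
print(summarize(traces))  # (128, 900.0)
```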

StepView

Trace wrapper with a reward field for RL training.
class StepView(BaseModel):
    id: str                      # Trace ID
    input: Any | None = None     # LLM input
    output: Any | None = None    # LLM output
    action: Any | None = None    # Parsed action
    reward: float = 0.0          # Step reward
    metadata: dict | None = None

TrajectoryView

Collection of steps forming a complete workflow.
class TrajectoryView(BaseModel):
    name: str = "agent"
    steps: list[StepView] = Field(default_factory=list)
    reward: float = 0.0
    input: dict | None = None    # Function arguments
    output: Any = None           # Function return value
    metadata: dict | None = None
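The reward fields compose as shown in the trajectory example earlier: rewards are set per step, and the trajectory reward is derived from them. A minimal runnable sketch, using dataclass stand-ins for the Pydantic models:

```python
from dataclasses import dataclass, field

# Dataclass stand-ins for StepView / TrajectoryView (Pydantic in the SDK).
@dataclass
class StepView:
    id: str
    reward: float = 0.0

@dataclass
class TrajectoryView:
    name: str = "agent"
    steps: list = field(default_factory=list)
    reward: float = 0.0

traj = TrajectoryView(name="solver", steps=[StepView("s1"), StepView("s2")])
traj.steps[0].reward = 1.0                         # per-step reward
traj.reward = sum(s.reward for s in traj.steps)    # aggregate for RL training
print(traj.reward)  # 1.0
```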

Core Functions

Session Management

from rllm.sdk import (
    session,
    get_current_session,
    get_current_session_name,
    get_current_metadata,
    get_active_session_uids,
)

# Create session with auto-generated name
session(**metadata) -> SessionContext

# Get current session (ContextVar backend only)
get_current_session() -> ContextVarSession | None

# Get session name (works with all backends)
get_current_session_name() -> str | None

# Get current metadata
get_current_metadata() -> dict

# Get active session UID chain
get_active_session_uids() -> list[str]
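get_active_session_uids() returns the chain of nested session UIDs, outermost first. A hypothetical sketch of how such a chain can be maintained with contextvars (not the SDK's implementation):

```python
import uuid
from contextlib import contextmanager
from contextvars import ContextVar

# Hypothetical stand-in for the SDK's active-session UID chain.
_uids: ContextVar[tuple] = ContextVar("uids", default=())

@contextmanager
def session():
    uid = uuid.uuid4().hex
    # Append this session's UID to the chain for the duration of the block.
    token = _uids.set(_uids.get() + (uid,))
    try:
        yield uid
    finally:
        _uids.reset(token)

def get_active_session_uids() -> list:
    return list(_uids.get())  # outermost session first, innermost last

with session():
    with session():
        print(len(get_active_session_uids()))  # 2
```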

Chat Clients

from rllm.sdk import get_chat_client, get_chat_client_async

# Synchronous client
get_chat_client(
    provider="openai",
    use_proxy=True,
    api_key="sk-...",
    base_url="https://api.openai.com/v1",
) -> ProxyTrackedChatClient

# Async client
get_chat_client_async(
    provider="openai",
    use_proxy=True,
    **kwargs
) -> ProxyTrackedAsyncChatClient

Trajectory Decorator

from rllm.sdk import trajectory

# Signature
trajectory(name: str = "agent", **metadata)

# Usage: the wrapped function returns a TrajectoryView
@trajectory(name="solver", task="math")
def workflow_function(...):
    # Function body with LLM calls
    ...
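One way a @trajectory-style decorator can gather steps recorded during a call is shown below. This is a simplified, synchronous sketch; record_step is a hypothetical hook standing in for the SDK's tracked chat client, and the dict return stands in for TrajectoryView:

```python
import functools
from contextvars import ContextVar

# Holds the step list for the trajectory currently being recorded, if any.
_steps = ContextVar("steps", default=None)

def record_step(payload):
    # Hypothetical hook; in the SDK, the tracked chat client records steps.
    steps = _steps.get()
    if steps is not None:
        steps.append(payload)

def trajectory(name="agent"):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            token = _steps.set([])  # fresh step list for this call
            try:
                output = fn(*args, **kwargs)
                return {"name": name, "steps": _steps.get(), "output": output}
            finally:
                _steps.reset(token)
        return inner
    return wrap

@trajectory(name="solver")
def solve(problem):
    record_step({"input": problem})  # stands in for a tracked LLM call
    return "4"

print(solve("2+2"))  # {'name': 'solver', 'steps': [{'input': '2+2'}], 'output': '4'}
```

Keeping the step list in a ContextVar (rather than a global) is what lets concurrent calls to the same decorated function record their steps independently.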

Design Principles

  1. Minimal API surface: Simple, focused functions
  2. Context-based: Uses Python’s contextvars for automatic propagation
  3. Distributed-ready: OpenTelemetry backend for cross-process tracing
  4. Pluggable storage: Supports in-memory, SQLite, or custom backends
  5. Type-safe: Full type annotations with Pydantic models
  6. Async-native: First-class async/await support
  7. Proxy-integrated: Built-in support for LiteLLM proxy routing

Next Steps

Sessions

Learn about session contexts and management

Trajectories

Understand trajectory tracking and rewards

Integrations

Integrate with LangGraph, SmolAgent, and more

API Reference

Explore the complete API documentation
