OpenInference Overview

OpenInference is a semantic convention specification for AI application observability, built on OpenTelemetry. It standardizes how LLM calls, agent reasoning steps, tool invocations, retrieval operations, and other AI-specific workloads are represented as distributed traces.

Why OpenInference?

OpenTelemetry defines a universal wire format and SDK model for distributed tracing, but its attribute model is intentionally generic. AI applications present a distinct set of observability requirements that general-purpose conventions do not address:

Structured inputs and outputs

LLM calls carry multi-turn message arrays, system prompts, tool definitions, and multimodal content. A single string input.value is insufficient for capturing the complexity of AI application data flows.

Token economics

Prompt and completion token counts, along with cached and reasoning token breakdowns, are first-class operational metrics for AI applications, not afterthoughts.

Agentic control flow

Modern AI systems route through reasoning loops, delegate to sub-agents, invoke tools, and query retrieval systems. Each hop needs a consistent identity and span-kind taxonomy for the trace to be interpretable.

Privacy sensitivity

Prompts and completions frequently contain personal information and must be maskable before export, with per-field granularity.

Nondeterminism

LLM outputs are stochastic. Traces must carry enough context to reproduce—or at least explain—a particular execution.

How It Works

OpenInference solves these problems by defining a concrete attribute schema and span-kind taxonomy on top of OpenTelemetry spans. Every OpenInference trace is a valid OTLP trace—the conventions give attribute names their AI-specific meaning.

OpenInference extends OpenTelemetry without modifying it. Any system that can ingest OTLP traces can receive OpenInference traces.

Core Concepts

OpenInference is built on three fundamental concepts:

Traces

A trace records the full execution path of a request—from the user’s initial input through every LLM call, tool invocation, and retrieval step to the final response. Traces are trees of spans connected by parent–child relationships. The root span typically represents an agent turn or pipeline invocation; child spans represent individual operations within it.

Learn about traces

Understand trace structure, hierarchy, and context propagation

Spans

A span is the atomic unit of work: one LLM call, one tool execution, one retrieval query, one embedding generation. Every span carries structured metadata including timestamps, status, and typed attributes.

Learn about spans

Explore span anatomy, timestamps, status codes, and events

Attributes

Attributes are typed key-value pairs attached to spans following a structured naming convention. They are the primary payload of OpenInference: they carry the prompt, the response, the model name, the retrieved documents, the tool arguments, and everything else needed to understand and reproduce a given execution.

Learn about attributes

Understand attribute naming conventions and semantic conventions

Span Kinds

The openinference.span.kind attribute classifies what an operation does, enabling observability platforms to render traces with AI-aware visualizations and aggregations.

LLM

A call to a language model API (OpenAI, Anthropic, etc.)

AGENT

A reasoning step in an autonomous agent

CHAIN

A sequence of operations or orchestration logic

TOOL

Execution of a function or external API

RETRIEVER

A query to a vector store or search engine

RERANKER

Reordering documents by relevance

EMBEDDING

Generation of vector embeddings

GUARDRAIL

Input or output moderation check

EVALUATOR

Automated evaluation of model responses

PROMPT

Prompt template rendering

Explore span kinds

Detailed documentation for all span kinds

Compliance

The keywords “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in the OpenInference specification are to be interpreted as described in BCP 14 [RFC2119] [RFC8174].

An implementation is compliant if it satisfies all “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, and “SHALL NOT” requirements defined in the specification.

Next Steps

Traces

Learn how traces represent request execution paths

Spans

Understand the atomic units of work

Span Kinds

Explore all available span kinds

Attributes

Master attribute naming conventions

Get Started

Concepts

Python

JavaScript

Java

Configuration

Why OpenInference?

How It Works

Core Concepts

Traces

Learn about traces

Spans

Learn about spans

Attributes

Learn about attributes

Span Kinds

LLM

AGENT

CHAIN

TOOL

RETRIEVER

RERANKER

EMBEDDING

GUARDRAIL

EVALUATOR

PROMPT

Explore span kinds

Compliance

Next Steps

Traces

Spans

Span Kinds

Attributes

Build docs developers (and LLMs) love

Get Started

Concepts

Python

JavaScript

Java

Configuration

​Why OpenInference?

​How It Works

​Core Concepts

​Traces

Learn about traces

​Spans

Learn about spans

​Attributes

Learn about attributes

​Span Kinds

LLM

AGENT

CHAIN

TOOL

RETRIEVER

RERANKER

EMBEDDING

GUARDRAIL

EVALUATOR

PROMPT

Explore span kinds

​Compliance

​Next Steps

Traces

Spans

Span Kinds

Attributes

Build docs developers (and LLMs) love

Why OpenInference?

How It Works

Core Concepts

Traces

Spans

Attributes

Span Kinds

Compliance

Next Steps