This guide will help you instrument your first AI application with OpenInference and visualize traces in Phoenix.

Prerequisites

  • Python 3.9 or higher
  • An OpenAI API key (or another LLM provider)

Installation

1. Install OpenInference instrumentation

Install the OpenInference instrumentation library for your framework:
pip install openinference-instrumentation-openai "openai>=1.26" arize-phoenix opentelemetry-sdk opentelemetry-exporter-otlp
Replace openinference-instrumentation-openai with the instrumentation for your framework:
  • LangChain: openinference-instrumentation-langchain
  • LlamaIndex: openinference-instrumentation-llama-index
  • Anthropic: openinference-instrumentation-anthropic
  • See all Python instrumentations
2. Start the Phoenix server

Phoenix is an open-source AI observability platform that runs entirely on your machine. Start the server to collect traces:
python -m phoenix.server.main serve
Phoenix will start on http://localhost:6006. Open this URL in your browser.
The Phoenix server does not send data over the internet — all traces stay on your machine.
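If you want your script to verify that the server is listening before it sends traces, a quick stdlib probe is enough. The `phoenix_is_up` helper below is illustrative, not part of Phoenix or OpenInference:

```python
from urllib.request import urlopen
from urllib.error import URLError


def phoenix_is_up(url: str = "http://localhost:6006", timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP 200 at the given URL."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or DNS failure: server is not reachable
        return False
```

Calling `phoenix_is_up()` before instrumenting lets you print a friendly hint ("did you start the Phoenix server?") instead of silently dropping spans.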
3. Set your API key

Set your OpenAI API key as an environment variable:
export OPENAI_API_KEY="your-api-key-here"
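It can help to fail fast when the key is missing rather than letting the first API call error out. A minimal stdlib guard you could drop at the top of your script; the `require_api_key` helper is illustrative, not part of OpenInference:

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> bool:
    """Return True if the named environment variable is set and non-empty."""
    return bool(os.environ.get(name))


# In your own script, exit early with a clear message:
# if not require_api_key():
#     raise SystemExit("OPENAI_API_KEY is not set; export it and re-run.")
```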
4. Instrument your application

Create app.py with the following code to instrument your first LLM call:
import openai
from openinference.instrumentation.openai import OpenAIInstrumentor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk import trace as trace_sdk
from opentelemetry.sdk.trace.export import SimpleSpanProcessor

# Configure OpenTelemetry to send traces to Phoenix
endpoint = "http://127.0.0.1:6006/v1/traces"
tracer_provider = trace_sdk.TracerProvider()
tracer_provider.add_span_processor(SimpleSpanProcessor(OTLPSpanExporter(endpoint)))

# Instrument OpenAI
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)

if __name__ == "__main__":
    # Make an OpenAI call - it will be automatically traced
    client = openai.OpenAI()
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Write a haiku about observability."}],
        max_tokens=50,
    )
    print(response.choices[0].message.content)
Run the application:
python app.py
5. View traces in Phoenix

Open Phoenix in your browser at http://localhost:6006. You’ll see:
  • Traces view: All LLM calls with timing, token counts, and costs
  • Span details: Input messages, output messages, model parameters
  • Timeline: Visual representation of the execution flow

What’s captured?

OpenInference automatically captures:

  • Messages: full conversation history, including system prompts, user messages, and assistant responses
  • Token counts: prompt tokens, completion tokens, cached tokens, and reasoning tokens
  • Model parameters: temperature, max tokens, top-p, and other invocation parameters
  • Costs: estimated costs for prompt and completion tokens in USD
  • Timing: start time, end time, and duration with nanosecond precision
  • Errors: exception messages and stack traces when calls fail
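These values land on the span as a flat attribute map. As a rough sketch of what that map looks like: the keys below follow the OpenInference semantic conventions, but the exact set of keys and values shown here is illustrative and can vary by instrumentation version:

```python
# Illustrative attributes for a single LLM span (not captured from a real trace)
span_attributes = {
    "openinference.span.kind": "LLM",
    "llm.model_name": "gpt-3.5-turbo",
    "llm.invocation_parameters": '{"max_tokens": 50}',
    "llm.token_count.prompt": 12,
    "llm.token_count.completion": 17,
}

# Totals can be derived from the prompt and completion counts
total_tokens = (
    span_attributes["llm.token_count.prompt"]
    + span_attributes["llm.token_count.completion"]
)
print(total_tokens)  # 29
```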

Advanced example with context

Add session tracking, user IDs, and custom metadata to your traces:
from openinference.instrumentation import using_attributes

with using_attributes(
    session_id="user-session-123",
    user_id="user-456",
    metadata={
        "environment": "production",
        "version": "1.0.0",
    },
    tags=["chat", "customer-support"],
):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "How do I reset my password?"}],
    )

Streaming example

OpenInference supports streaming LLM responses:
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a story about AI."}],
    stream=True,
    stream_options={"include_usage": True},  # Required for token counts
)

for chunk in response:
    if chunk.choices and (content := chunk.choices[0].delta.content):
        print(content, end="")
Set stream_options={"include_usage": True} to capture token counts when streaming (requires openai>=1.26).
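If you need the complete reply after the stream ends (for logging or assertions), collect the deltas as they arrive instead of only printing them. A minimal stdlib sketch: real chunks are objects with chunk.choices[0].delta.content, so plain dicts stand in for them here:

```python
# Simulated chat-completion chunks; each carries a small piece of the reply.
chunks = [
    {"delta": "Circuits "},
    {"delta": "hum softly"},
    {"delta": None},  # e.g. a final usage-only chunk carries no content
]

# Accumulate non-empty deltas, mirroring the streaming loop above
parts = []
for chunk in chunks:
    content = chunk["delta"]
    if content:
        parts.append(content)

full_reply = "".join(parts)
print(full_reply)  # Circuits hum softly
```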

Next steps

  • Python instrumentations: explore all 30+ Python instrumentation libraries
  • JavaScript instrumentations: explore JavaScript/TypeScript instrumentations
  • Privacy controls: configure data masking and PII protection
  • Concepts: learn about traces, spans, and attributes
