Helicone provides a complete observability platform for LLM applications, giving you deep insights into how your AI systems perform in production. From individual request debugging to complex multi-step agent workflows, Helicone captures every detail you need to build reliable AI applications.

Why Observability Matters

LLM applications are inherently complex and non-deterministic. Without proper observability, you’re flying blind:
  • Debug production issues - Understand why specific requests failed or produced unexpected results
  • Track agent workflows - Visualize multi-step AI agent processes from start to finish
  • Monitor performance - Identify bottlenecks, latency issues, and cost spikes
  • Analyze user behavior - See how users interact with your AI features
  • Optimize costs - Track spending by user, feature, or workflow to control expenses

Core Observability Features

Requests

View and query every LLM request with full request/response bodies, metadata, and performance metrics

Sessions

Group related requests into sessions to trace complete AI agent workflows and multi-turn conversations

Traces

Log custom traces for non-LLM operations like database queries, API calls, and tool executions

Metrics

Analyze aggregate metrics across requests, sessions, and users to understand system-wide performance

Custom Properties

Tag requests with metadata for filtering, segmentation, and cost analysis by any dimension

User Metrics

Track per-user costs, usage patterns, and engagement metrics

How It Works

Helicone’s observability system operates at multiple levels:

Request Level

Every LLM request flows through Helicone, capturing:
  • Complete request and response bodies
  • Token counts and cost calculations
  • Latency and time-to-first-token metrics
  • Provider, model, and status information
  • Custom properties and metadata
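Taken together, the captured fields can be pictured as a record like the one below. This is an illustrative shape written for this page, not Helicone's actual storage schema; the field names are assumptions chosen to mirror the bullet list above:

```typescript
// Illustrative shape of a logged request record. Field names here are
// assumptions for explanation, not Helicone's exact schema.
interface LoggedRequest {
  requestBody: unknown;                 // complete request payload
  responseBody: unknown;                // complete response payload
  promptTokens: number;                 // token counts
  completionTokens: number;
  costUsd: number;                      // calculated cost
  latencyMs: number;                    // total latency
  timeToFirstTokenMs: number;
  provider: string;                     // e.g. "openai"
  model: string;                        // e.g. "gpt-4o-mini"
  status: number;                       // HTTP status of the request
  properties: Record<string, string>;   // custom properties and metadata
}

const example: LoggedRequest = {
  requestBody: { messages: [{ role: "user", content: "Hello!" }] },
  responseBody: { choices: [{ message: { content: "Hi!" } }] },
  promptTokens: 5,
  completionTokens: 3,
  costUsd: 0.00001,
  latencyMs: 420,
  timeToFirstTokenMs: 180,
  provider: "openai",
  model: "gpt-4o-mini",
  status: 200,
  properties: { Environment: "production" },
};
```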

Session Level

Related requests are grouped using session headers:
  • Track multi-step agent workflows
  • Visualize parent-child request relationships
  • Analyze session-level metrics (total cost, duration, request count)
  • Debug complex conversation flows

Custom Traces

Log non-LLM operations to get complete visibility:
  • Database queries and vector searches
  • API calls and external tool executions
  • Custom business logic and data processing
  • Any operation you want to track within your workflow
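As a sketch of this pattern, any non-LLM operation can be wrapped in a timed trace. `Trace`, `logTrace`, and `traced` below are hypothetical helpers written for illustration; they stand in for however you actually ship trace data to Helicone (for example, its manual logging SDK):

```typescript
// Hypothetical trace shape and sender, for illustration only --
// not part of Helicone's SDK.
interface Trace {
  name: string;
  startTime: number;
  durationMs: number;
  metadata: Record<string, string>;
}

function logTrace(trace: Trace): void {
  // A real implementation would send this to Helicone; here we just print it.
  console.log("trace:", JSON.stringify(trace));
}

// Wrap an operation (DB query, vector search, tool call) in a timed trace.
async function traced<T>(
  name: string,
  metadata: Record<string, string>,
  op: () => Promise<T>
): Promise<T> {
  const startTime = Date.now();
  try {
    return await op();
  } finally {
    logTrace({ name, startTime, durationMs: Date.now() - startTime, metadata });
  }
}

// Usage:
// const hits = await traced("vector-search", { index: "docs" }, () => search(query));
```

The `finally` block ensures the trace is logged even when the wrapped operation throws, so failed steps still show up in your workflow view.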

Quick Start

Step 1: Integrate Helicone

Route your LLM requests through Helicone’s gateway or use the SDK:
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://ai-gateway.helicone.ai",
  apiKey: process.env.HELICONE_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }]
});

Step 2: Add Custom Properties

Tag requests with metadata for filtering and analysis:
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }]
  },
  {
    headers: {
      "Helicone-Property-Environment": "production",
      "Helicone-Property-Feature": "chat",
      "Helicone-User-Id": "user-123"
    }
  }
);

Step 3: Track Sessions (Optional)

For multi-step workflows, add session headers:
import { randomUUID } from "crypto";

const sessionId = randomUUID();

// First request in workflow
await client.chat.completions.create(
  { /* ... */ },
  {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Path": "/analyze",
      "Helicone-Session-Name": "Document Analyzer"
    }
  }
);

// Follow-up request in same session
await client.chat.completions.create(
  { /* ... */ },
  {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Path": "/analyze/summary",
      "Helicone-Session-Name": "Document Analyzer"
    }
  }
);

Step 4: View in Dashboard

Visit helicone.ai/requests to see your data:
  • Filter by properties, model, date range, or status
  • Click into individual requests to see full details
  • View sessions to trace multi-step workflows
  • Analyze metrics and trends over time

Common Observability Patterns

Debugging Production Issues

// Add rich metadata to aid debugging
const response = await client.chat.completions.create(
  { /* request */ },
  {
    headers: {
      "Helicone-Request-Id": "custom-request-id-123",
      "Helicone-Property-Environment": "production",
      "Helicone-Property-Version": "v2.1.0",
      "Helicone-Property-RequestType": "user_query",
      "Helicone-User-Id": "user-abc-123"
    }
  }
);

// Later, query by request ID or properties to find the issue

Tracking Agent Workflows

const sessionId = randomUUID();

// Research phase
await client.chat.completions.create(
  { messages: [/* research query */] },
  {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Path": "/research",
      "Helicone-Session-Name": "Research Agent"
    }
  }
);

// Analysis phase
await client.chat.completions.create(
  { messages: [/* analysis prompt */] },
  {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Path": "/research/analysis",
      "Helicone-Session-Name": "Research Agent"
    }
  }
);

// View complete workflow in Sessions page

Cost Monitoring by Feature

// Tag all requests by feature
const headers = {
  "Helicone-Property-Feature": featureName,
  "Helicone-Property-UserTier": userTier,
  "Helicone-User-Id": userId
};

// Filter by properties in dashboard to see:
// - Cost per feature
// - Cost per user tier
// - Cost per individual user
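A small helper can keep this tagging consistent across call sites. `heliconeHeaders` below is our own illustrative wrapper, not part of Helicone's SDK; the header names themselves match those used above:

```typescript
// Illustrative helper (not part of any SDK): builds consistent
// Helicone tagging headers for a feature, user tier, and user.
function heliconeHeaders(opts: {
  feature: string;
  userTier: string;
  userId: string;
}): Record<string, string> {
  return {
    "Helicone-Property-Feature": opts.feature,
    "Helicone-Property-UserTier": opts.userTier,
    "Helicone-User-Id": opts.userId,
  };
}

const taggedHeaders = heliconeHeaders({
  feature: "chat",
  userTier: "pro",
  userId: "user-123",
});
```

Pass the result as `{ headers }` in the request options, as in the earlier examples.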

Data Retention

Helicone stores your observability data based on your plan:
  • Free Plan: 30 days of request data
  • Pro Plan: 90 days of request data
  • Enterprise Plan: Custom retention (up to unlimited)
For sensitive data, request and response bodies can optionally be excluded:
{
  headers: {
    "Helicone-Omit-Request": "true",  // Exclude request body
    "Helicone-Omit-Response": "true"  // Exclude response body
  }
}

Querying Your Data

Access your observability data programmatically via the REST API:

Query Requests

Filter and export request data for analysis

Query Sessions

Retrieve session data with all related requests

User Metrics

Analyze per-user usage and costs

Custom Exports

Export large datasets using our CLI tool
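As a rough sketch, a request query over HTTP could look like the following. The endpoint path and filter shape here are assumptions modeled on Helicone's request query API; check the API reference for the authoritative schema:

```typescript
// Sketch only: the endpoint and filter shape are assumptions --
// consult Helicone's API reference for the exact schema.
const query = {
  filter: {
    request: { model: { equals: "gpt-4o-mini" } }, // hypothetical filter shape
  },
  offset: 0,
  limit: 100,
};

async function queryRequests(): Promise<unknown> {
  const res = await fetch("https://api.helicone.ai/v1/request/query", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.HELICONE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(query),
  });
  return res.json();
}
```

Pagination via `offset` and `limit` lets you walk large result sets; for bulk exports, the CLI tool above is the better fit.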

Advanced Features

Real-time Monitoring

  • Live request feed in the dashboard
  • Webhook notifications for specific events
  • Alerts for errors, rate limits, or cost thresholds

Performance Analysis

  • Latency percentiles (p50, p95, p99)
  • Time-to-first-token tracking
  • Request rate and throughput metrics
  • Error rate monitoring by provider and model

Cost Optimization

  • Cost breakdowns by model, user, and feature
  • Token usage analysis and optimization suggestions
  • Budget alerts and spending limits
  • Cache hit rate tracking for cost savings

Privacy & Security

Helicone takes data privacy seriously:
  • Encryption: All data encrypted in transit (TLS) and at rest
  • Isolation: Each organization’s data is isolated
  • Access Control: Role-based access control for teams
  • Compliance: SOC 2 Type II compliant, GDPR ready
  • Data Residency: EU region available for GDPR compliance
You can exclude sensitive data from logs using the same Helicone-Omit-Request and Helicone-Omit-Response headers shown under Data Retention.

Next Steps

View Requests

Explore the Requests page and learn how to query your data

Track Sessions

Group related requests into sessions for workflow tracking

Add Custom Properties

Tag requests with metadata for filtering and analysis

Analyze Metrics

Understand your system’s performance with aggregate metrics

Questions?

Need help or have questions? Reach out through Helicone's support channels.
