
Overview

Helicone automatically logs every LLM request that flows through the platform, capturing detailed information about inputs, outputs, costs, latency, and metadata. View, filter, and analyze all requests in your dashboard.

What Data is Captured

For each request, Helicone captures:

Request Data

  • Prompt/Input: Complete request body including messages, parameters, and system prompts
  • Timestamp: When the request was initiated
  • Model: Which LLM model was used (e.g., gpt-4, claude-3-opus)
  • Provider: The LLM provider (OpenAI, Anthropic, etc.)
  • User ID: Optional user identifier for user-level analytics
  • Properties: Custom metadata added via headers
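
The user ID and custom properties above are attached through request headers. A minimal sketch of the header convention, using Helicone's `Helicone-User-Id` and `Helicone-Property-*` header names (the property names `Environment` and `Feature` are invented examples, not required keys):

```python
# Headers that attach a user ID and custom properties to a request.
# "Environment" and "Feature" are example property names, not required ones.
headers = {
    "Helicone-Auth": "Bearer YOUR_HELICONE_KEY",
    "Helicone-User-Id": "user-123",                 # user-level analytics
    "Helicone-Property-Environment": "production",  # custom property
    "Helicone-Property-Feature": "chat",            # custom property
}

# Everything after the Helicone-Property- prefix becomes the property name.
properties = {
    k.removeprefix("Helicone-Property-"): v
    for k, v in headers.items()
    if k.startswith("Helicone-Property-")
}
print(properties)  # {'Environment': 'production', 'Feature': 'chat'}
```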

Response Data

  • Completion/Output: Full response from the LLM
  • Status: Success (200), error (4xx/5xx), or rate limited
  • Tokens: Prompt tokens, completion tokens, and total tokens
  • Cost: Calculated cost in USD based on model pricing
  • Latency: Total request duration and time to first token
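
Cost is derived from the token counts and the model's per-token pricing. An illustrative calculation of that arithmetic (the prices below are placeholders, not real model rates):

```python
def estimate_cost_usd(prompt_tokens: int, completion_tokens: int,
                      prompt_price_per_1k: float,
                      completion_price_per_1k: float) -> float:
    """Token counts times per-1K-token price, summed over both directions."""
    return (prompt_tokens / 1000) * prompt_price_per_1k \
         + (completion_tokens / 1000) * completion_price_per_1k

# 500 prompt tokens + 200 completion tokens at placeholder prices
print(round(estimate_cost_usd(500, 200, 0.03, 0.06), 4))  # 0.027
```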

Metadata

  • Request ID: Unique identifier for the request
  • Session ID: Optional session grouping identifier
  • Trace ID: For distributed tracing across multiple requests
  • Cache Status: Whether the response was cached
  • Model Version: Specific model version used
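
Session grouping is also header-driven. A sketch using Helicone's `Helicone-Session-*` headers (the ID, path, and name values here are invented examples):

```python
# Headers that group this request into a session; all values are examples.
session_headers = {
    "Helicone-Session-Id": "session-abc-123",       # groups related requests
    "Helicone-Session-Path": "/support/refund",     # position within the session
    "Helicone-Session-Name": "Refund conversation", # human-readable label
}
print(sorted(session_headers))
```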

Viewing Requests

Access your requests in the Helicone dashboard:
1. Navigate to Requests: go to helicone.ai/requests in your dashboard
2. View Request List: see all requests in a table with key metrics (timestamp, model, cost, latency, status)
3. Click a Request: open any request to see full details including input/output, metadata, and metrics

Request Details Page

Click any request to see comprehensive details:
  • Request and response bodies (formatted JSON)
  • Model and provider information
  • Cost breakdown
  • Latency metrics
  • Status and error messages

Filtering Requests

Use powerful filters to find specific requests:

Filter Options

// Example filter query via API
const response = await fetch('https://api.helicone.ai/v1/request/query', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    filter: {
      left: {
        request_response_rmt: {
          model: {
            equals: "gpt-4"
          }
        }
      },
      operator: "and",
      right: {
        request_response_rmt: {
          status: {
            equals: 200
          }
        }
      }
    },
    limit: 100,
    offset: 0,
    sort: {
      created_at: "desc"
    }
  })
});

Common Filters

  • Model: filter by specific models like gpt-4, claude-3-opus, or gpt-3.5-turbo
  • Status: filter by success (200), client errors (4xx), or server errors (5xx)
  • User: filter by user ID to see a specific user’s activity
  • Cost: find expensive requests above a certain cost threshold
  • Latency: find slow requests exceeding a latency threshold
  • Custom properties: filter by any property you’ve added to requests
  • Time range: filter requests within a specific time range
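
As a sketch, these filters translate into the same nested structure used by the query endpoint above. The `user_id` and `latency` field names and the `gte` operator are assumptions here, so verify them against the API reference:

```python
import json

# Compound filter: one user's requests slower than 5 seconds.
payload = {
    "filter": {
        "left": {"request_response_rmt": {"user_id": {"equals": "user-123"}}},
        "operator": "and",
        "right": {"request_response_rmt": {"latency": {"gte": 5000}}},  # ms
    },
    "limit": 50,
    "offset": 0,
    "sort": {"created_at": "desc"},
}
body = json.dumps(payload)  # ready to POST to /v1/request/query
print(payload["filter"]["operator"])  # and
```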

Logging Requests

Requests are logged when you use one of these integration options:

Option 1: Use the Proxy

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_OPENAI_KEY",
    base_url="https://oai.helicone.ai/v1",  # Use Helicone proxy
    default_headers={
        "Helicone-Auth": "Bearer YOUR_HELICONE_KEY"
    }
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}]
)
# Request is automatically logged

Option 2: Add Headers to Direct API Calls

import requests

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_OPENAI_KEY",
        "Helicone-Auth": "Bearer YOUR_HELICONE_KEY",
        "Content-Type": "application/json"
    },
    json={
        "model": "gpt-4",
        "messages": [{"role": "user", "content": "Hello!"}]
    }
)
# Request is logged by adding Helicone-Auth header

Option 3: Async Logging

// For requests that don't go through Helicone proxy
await fetch('https://api.helicone.ai/v1/trace/custom/log', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_HELICONE_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    providerRequest: {
      url: "https://api.openai.com/v1/chat/completions",
      json: {
        model: "gpt-4",
        messages: [{ role: "user", content: "Hello!" }]
      },
      meta: { "Helicone-Auth": "Bearer YOUR_HELICONE_KEY" }
    },
    providerResponse: {
      json: response,
      status: 200,
      headers: {}
    },
    timing: {
      startTime: { seconds: startTime, nanos: 0 },
      endTime: { seconds: endTime, nanos: 0 }
    }
  })
});
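
The timing object splits wall-clock time into seconds and nanoseconds. A sketch of building that shape in Python, mirroring the JSON above (the 1.25 s duration is a stand-in for a real provider call):

```python
import time

def to_timestamp(t: float) -> dict:
    """Split a time.time() float into the {seconds, nanos} shape above."""
    return {"seconds": int(t), "nanos": int((t % 1) * 1_000_000_000)}

start = time.time()
end = start + 1.25  # pretend the provider call took 1.25 seconds

timing = {"startTime": to_timestamp(start), "endTime": to_timestamp(end)}
```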

Request API

  • Query Requests: fetch requests with filters and pagination
  • Get Request by ID: retrieve a specific request’s details
  • Add Feedback: rate requests with thumbs up/down
  • Add Properties: add custom properties to existing requests
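
For Query Requests, results are paged with limit and offset. A sketch of building page-by-page bodies for the /v1/request/query endpoint shown earlier (the "all" filter value is an assumption; substitute a real filter as needed):

```python
def page_payload(page: int, page_size: int = 100) -> dict:
    """Query body for one page of results, using offset pagination."""
    return {
        "filter": "all",                 # assumed: match every request
        "limit": page_size,
        "offset": page * page_size,      # skip the pages already fetched
        "sort": {"created_at": "desc"},
    }

print(page_payload(2)["offset"])  # 200
```

Stop paging when a response returns fewer than `page_size` rows.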

Advanced Features

Request Body Storage

By default, Helicone stores complete request and response bodies. You can:
  • Omit request bodies: Add Helicone-Omit-Request: true header
  • Omit response bodies: Add Helicone-Omit-Response: true header
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Sensitive data"}],
    extra_headers={
        "Helicone-Omit-Request": "true"  # Don't store request body
    }
)

Feedback & Ratings

Add thumbs up/down ratings to requests:
await fetch(`https://api.helicone.ai/v1/request/${requestId}/feedback`, {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    rating: true  // true = thumbs up, false = thumbs down
  })
});

Next Steps

  • Group into Sessions: track multi-turn conversations
  • Add Custom Properties: enrich requests with metadata
  • Trace Workflows: visualize complex request flows
  • Track Users: monitor per-user analytics
