Export your LLM request and response data from Helicone for analysis, backup, compliance, or migration to other systems.

Why Export Data

Common use cases:
  • Fine-tuning preparation: Export production data as training examples
  • Custom analytics: Analyze in your own BI tools (Tableau, PowerBI)
  • Compliance: Meet data retention and audit requirements
  • Backup: Keep local copies of critical data
  • Migration: Move data between systems or regions

Export Methods

Helicone provides three ways to export data:

NPM Tool

Command-line tool with resume support

REST API

Programmatic access for automation

Dashboard

Manual export via UI
Method 1: NPM Tool

The easiest and most reliable way to export large datasets.

Quick Start

# No installation required - use npx
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --limit 10000 \
  --include-body

Features

Auto-Recovery

Resumes from last checkpoint if interrupted

Retry Logic

Exponential backoff for transient failures

Progress Tracking

Real-time progress with ETA

Multiple Formats

JSON, JSONL, or CSV output

Common Usage Examples

Export all requests from a date range:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --format jsonl \
  --output ./data/helicone-export.jsonl \
  --include-body
Output:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Helicone Data Export Tool            ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

Fetching total count...
Total records: 45,231
Exporting to: ./data/helicone-export.jsonl

Progress: [====================] 100% | 45,231/45,231 | ETA: 0s

✅ Export complete!
├── Records exported: 45,231
├── Output file: ./data/helicone-export.jsonl
├── File size: 1.2 GB
└── Duration: 3m 42s

Configuration Options

| Option | Description | Default | Example |
|---|---|---|---|
| --start-date | Start date (ISO 8601) | 30 days ago | 2024-01-01 |
| --end-date | End date (ISO 8601) | Now | 2024-12-31 |
| --limit | Max records to export | Unlimited | 10000 |
| --format | Output format | jsonl | json, jsonl, csv |
| --output | Output file path | helicone-export.* | ./data/export.jsonl |
| --include-body | Include request/response bodies | false | (flag) |
| --property | Filter by property | None | Environment=prod |
| --region | API region | us | us, eu |
| --batch-size | Records per API call | 1000 | 500 |
| --resume | Resume from checkpoint | false | (flag) |
| --clean-state | Clear checkpoint and restart | false | (flag) |
| --log-level | Logging verbosity | normal | quiet, verbose |

Method 2: REST API

For programmatic export and automation.

Basic Query

import fs from 'fs';

const HELICONE_API_KEY = process.env.HELICONE_API_KEY;

async function exportData(
  startDate: string,
  endDate: string,
  limit: number = 1000
) {
  const response = await fetch(
    "https://api.helicone.ai/v1/request/query-clickhouse",
    {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${HELICONE_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        filter: {
          request_response_rmt: {
            request_created_at: {
              gte: startDate,
              lte: endDate,
            },
          },
        },
        limit,
      }),
    }
  );
  
  const data = await response.json();
  return data.data;
}

// Export and save
const requests = await exportData(
  "2024-01-01T00:00:00Z",
  "2024-12-31T23:59:59Z",
  10000
);

fs.writeFileSync(
  "export.jsonl",
  requests.map(r => JSON.stringify(r)).join("\n")
);

console.log(`Exported ${requests.length} requests`);

Advanced Filtering

{
  "filter": {
    "request_response_rmt": {
      "properties": {
        "Environment": { "equals": "production" },
        "Feature": { "equals": "chat" }
      },
      "request_created_at": {
        "gte": "2024-01-01T00:00:00Z"
      }
    }
  },
  "limit": 1000
}
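If you assemble this body in code, a small helper keeps the property filters and date range together. The helper below is illustrative, not part of any Helicone SDK:

```typescript
// Build a query body combining property filters with a date range.
// Illustrative helper; the body shape follows the JSON example above.
function buildQueryBody(
  properties: Record<string, string>,
  since: string,
  limit = 1000
) {
  const props: Record<string, { equals: string }> = {};
  for (const [key, value] of Object.entries(properties)) {
    props[key] = { equals: value };
  }
  return {
    filter: {
      request_response_rmt: {
        properties: props,
        request_created_at: { gte: since },
      },
    },
    limit,
  };
}

const body = buildQueryBody(
  { Environment: "production", Feature: "chat" },
  "2024-01-01T00:00:00Z"
);
console.log(JSON.stringify(body, null, 2)); // same shape as the JSON above
```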

Pagination for Large Exports

async function exportAllData(
  startDate: string,
  endDate: string
) {
  const allRequests = [];
  let offset = 0;
  const batchSize = 1000;
  
  while (true) {
    console.log(`Fetching batch at offset ${offset}...`);
    
    const response = await fetch(
      "https://api.helicone.ai/v1/request/query-clickhouse",
      {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${HELICONE_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          filter: {
            request_response_rmt: {
              request_created_at: {
                gte: startDate,
                lte: endDate,
              },
            },
          },
          limit: batchSize,
          offset,
        }),
      }
    );
    
    const data = await response.json();
    const batch = data.data;
    
    if (batch.length === 0) {
      break; // No more data
    }
    
    allRequests.push(...batch);
    offset += batch.length;
    
    console.log(`Total fetched: ${allRequests.length}`);
    
    // Respect rate limits
    await new Promise(resolve => setTimeout(resolve, 100));
  }
  
  return allRequests;
}

// Usage
const allData = await exportAllData(
  "2024-01-01T00:00:00Z",
  "2024-12-31T23:59:59Z"
);

console.log(`Exported ${allData.length} total requests`);
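The loop above holds every record in memory before writing, which is fine for moderate exports. For very large ones you may prefer to append each batch to a JSONL file as it arrives; a sketch of that variant, using the same endpoint and request body:

```typescript
import fs from "fs";

// Serialize one batch of records as JSONL (one JSON object per line).
function batchToJsonl(batch: object[]): string {
  return batch.map((r) => JSON.stringify(r)).join("\n") + "\n";
}

// Stream batches straight to disk instead of accumulating them in memory.
async function exportToFile(startDate: string, endDate: string, path: string) {
  const out = fs.createWriteStream(path);
  let offset = 0;
  while (true) {
    const response = await fetch(
      "https://api.helicone.ai/v1/request/query-clickhouse",
      {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${process.env.HELICONE_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          filter: {
            request_response_rmt: {
              request_created_at: { gte: startDate, lte: endDate },
            },
          },
          limit: 1000,
          offset,
        }),
      }
    );
    const batch = (await response.json()).data;
    if (batch.length === 0) break; // no more data
    out.write(batchToJsonl(batch)); // append without buffering everything
    offset += batch.length;
    await new Promise((resolve) => setTimeout(resolve, 100)); // respect rate limits
  }
  out.end();
}
```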

Method 3: Dashboard Export

Manual export for small datasets.
1. Navigate to Requests

2. Apply Filters

Filter data to export:
  • Date range
  • Properties (Environment, Feature, etc.)
  • User ID
  • Model
  • Status

3. Export

Click “Export” button and choose format:
  • JSON
  • CSV

Dashboard export is limited to 10,000 records. For larger datasets, use the NPM tool or API.

Data Format

JSONL Format

One JSON object per line:
{"request_id":"req_abc123","created_at":"2024-01-15T10:30:00Z","model":"gpt-4o","prompt_tokens":50,"completion_tokens":100,"cost_usd":0.015}
{"request_id":"req_def456","created_at":"2024-01-15T10:31:00Z","model":"gpt-4o-mini","prompt_tokens":30,"completion_tokens":80,"cost_usd":0.003}
Benefits:
  • Streamable (process line by line)
  • Efficient for large files
  • Easy to split/merge
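Because each line is an independent JSON object, an export can be processed without loading the whole file at once. A small sketch that tallies cost line by line (field names follow the samples above):

```typescript
// Parse exported JSONL text one line at a time and sum the cost column.
function sumCostFromJsonl(jsonl: string): number {
  let total = 0;
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    const record = JSON.parse(line);
    total += record.cost_usd ?? 0;
  }
  return Math.round(total * 1000) / 1000; // trim float noise
}

const sample =
  '{"request_id":"req_abc123","model":"gpt-4o","cost_usd":0.015}\n' +
  '{"request_id":"req_def456","model":"gpt-4o-mini","cost_usd":0.003}';

console.log(sumCostFromJsonl(sample)); // 0.018
```

For multi-gigabyte files, the same logic works against a readline stream instead of an in-memory string.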

JSON Format

Array of objects:
[
  {
    "request_id": "req_abc123",
    "created_at": "2024-01-15T10:30:00Z",
    "model": "gpt-4o",
    "prompt_tokens": 50,
    "completion_tokens": 100,
    "cost_usd": 0.015
  },
  {
    "request_id": "req_def456",
    "created_at": "2024-01-15T10:31:00Z",
    "model": "gpt-4o-mini",
    "prompt_tokens": 30,
    "completion_tokens": 80,
    "cost_usd": 0.003
  }
]

CSV Format

Comma-separated values:
request_id,created_at,model,prompt_tokens,completion_tokens,cost_usd
req_abc123,2024-01-15T10:30:00Z,gpt-4o,50,100,0.015
req_def456,2024-01-15T10:31:00Z,gpt-4o-mini,30,80,0.003
Best for:
  • Excel/Google Sheets
  • BI tools (Tableau, PowerBI)
  • Simple analysis

Included Fields

| Field | Description | Type |
|---|---|---|
| request_id | Unique request identifier | string |
| created_at | Timestamp (ISO 8601) | string |
| user_id | User identifier | string |
| model | Model name | string |
| prompt_tokens | Input tokens | number |
| completion_tokens | Output tokens | number |
| total_tokens | Total tokens | number |
| cost_usd | Cost in USD | number |
| latency | Response time (ms) | number |
| status | HTTP status code | number |
| properties | Custom properties | object |
| request_body | Request payload (if --include-body) | object |
| response_body | Response payload (if --include-body) | object |
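For TypeScript consumers, the field list can be captured as an interface. This is a convenience sketch derived from the table above, not an official type shipped with the export tool:

```typescript
// Shape of one exported record, following the field table above.
// request_body/response_body appear only when --include-body is set.
interface HeliconeExportRecord {
  request_id: string;
  created_at: string; // ISO 8601
  user_id: string;
  model: string;
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  cost_usd: number;
  latency: number; // milliseconds
  status: number; // HTTP status code
  properties: Record<string, string>;
  request_body?: object;
  response_body?: object;
}

// Example record matching the JSONL samples above:
const example: HeliconeExportRecord = {
  request_id: "req_abc123",
  created_at: "2024-01-15T10:30:00Z",
  user_id: "user-123",
  model: "gpt-4o",
  prompt_tokens: 50,
  completion_tokens: 100,
  total_tokens: 150,
  cost_usd: 0.015,
  latency: 1200,
  status: 200,
  properties: { Environment: "production" },
};
console.log(example.total_tokens); // 150
```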

Use Case Examples

Fine-Tuning Dataset

Export successful requests for training:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --property Task=sentiment-analysis \
  --property Environment=production \
  --start-date 2024-01-01 \
  --format jsonl \
  --include-body \
  --output training-data.jsonl

# Post-process to OpenAI format
node convert-to-openai-format.js training-data.jsonl
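The convert-to-openai-format.js step above is left to the reader. A hypothetical sketch of what it might do, assuming request_body carries a chat messages array and response_body follows the chat-completions shape (verify both against your actual payloads):

```typescript
// Hypothetical converter: exported record -> OpenAI fine-tuning example.
// Assumes request_body.messages is the chat history and response_body
// follows the chat-completions shape; adjust to your real payloads.
type ChatMessage = { role: string; content: string };

function toTrainingExample(record: {
  request_body?: { messages?: ChatMessage[] };
  response_body?: { choices?: { message: ChatMessage }[] };
}): { messages: ChatMessage[] } | null {
  const prompt = record.request_body?.messages;
  const reply = record.response_body?.choices?.[0]?.message;
  if (!prompt || !reply) return null; // skip records without bodies
  return { messages: [...prompt, reply] };
}

const record = {
  request_body: {
    messages: [{ role: "user", content: "Classify: great product!" }],
  },
  response_body: {
    choices: [{ message: { role: "assistant", content: "positive" } }],
  },
};
console.log(JSON.stringify(toTrainingExample(record)));
```

Run this over each JSONL line and write the non-null results to a new file to get one training example per request.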

Cost Analysis

Export for custom analytics:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --format csv \
  --output costs-2024.csv

# Import into Excel/Tableau for analysis
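Beyond spreadsheet tools, the exported CSV is easy to aggregate in a script. A small sketch that totals cost per model, using the column names from the Included Fields table (this export has no quoted commas, so a plain split is enough):

```typescript
// Aggregate exported CSV rows into per-model cost totals.
function costByModel(csv: string): Record<string, number> {
  const [header, ...rows] = csv.trim().split("\n");
  const cols = header.split(",");
  const modelIdx = cols.indexOf("model");
  const costIdx = cols.indexOf("cost_usd");
  const totals: Record<string, number> = {};
  for (const row of rows) {
    const cells = row.split(","); // safe here: no quoted commas in this export
    const model = cells[modelIdx];
    totals[model] = (totals[model] ?? 0) + Number(cells[costIdx]);
  }
  return totals;
}

const csv = `request_id,created_at,model,prompt_tokens,completion_tokens,cost_usd
req_abc123,2024-01-15T10:30:00Z,gpt-4o,50,100,0.015
req_def456,2024-01-15T10:31:00Z,gpt-4o-mini,30,80,0.003`;

console.log(costByModel(csv)); // { "gpt-4o": 0.015, "gpt-4o-mini": 0.003 }
```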

Compliance Backup

Monthly backup for audit trail:
#!/bin/bash
# backup-monthly.sh

MONTH=$(date -d "last month" +%Y-%m)
START_DATE="${MONTH}-01T00:00:00Z"
END_DATE=$(date -d "${START_DATE} +1 month" +%Y-%m-%dT00:00:00Z)

HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date "$START_DATE" \
  --end-date "$END_DATE" \
  --format jsonl \
  --include-body \
  --output "backups/helicone-${MONTH}.jsonl.gz"

echo "Backup complete for $MONTH"

User Data Export (GDPR)

Export all data for a specific user:
const response = await fetch(
  "https://api.helicone.ai/v1/request/query-clickhouse",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${HELICONE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      filter: {
        request_response_rmt: {
          user_id: { equals: "user-123" },
        },
      },
      limit: 100000,
    }),
  }
);

const userData = await response.json();

// Save for GDPR request
fs.writeFileSync(
  "user-123-data-export.json",
  JSON.stringify(userData.data, null, 2)
);

Best Practices

  • Use JSONL for large exports: more efficient than JSON arrays
  • Export incrementally: daily or weekly exports are easier to manage than one large export
  • Compress backups: JSONL compresses well with gzip (80-90% reduction)
  • Filter early: apply filters at export time to reduce data size
  • Limit bodies: request/response bodies can be large; only use --include-body when needed
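As a sanity check on the compression point, gzip on repetitive JSONL can be exercised directly with Node's built-in zlib (a standalone sketch, independent of the export tool):

```typescript
import { gzipSync, gunzipSync } from "zlib";

// Gzip a JSONL payload before archiving; repeated keys compress well.
function compress(jsonl: string): Buffer {
  return gzipSync(Buffer.from(jsonl, "utf8"));
}

const line = '{"request_id":"req_abc123","model":"gpt-4o","cost_usd":0.015}\n';
const jsonl = line.repeat(1000); // repetitive records, like a real export
const gz = compress(jsonl);

console.log(`raw: ${jsonl.length} bytes, gzipped: ${gz.length} bytes`);
console.log(gunzipSync(gz).toString("utf8") === jsonl); // lossless round trip
```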

Troubleshooting

Export is slow

Tips to speed up:
  • Try a smaller --batch-size (e.g., 500) so individual requests complete faster
  • Apply filters to reduce data volume
  • Export during off-peak hours
  • Check your network connection

Export was interrupted

Use --resume to continue:
HELICONE_API_KEY="sk-xxx" npx @helicone/export --resume
Or clean state and restart:
HELICONE_API_KEY="sk-xxx" npx @helicone/export --clean-state ...

Hitting rate limits

Reduce batch size:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --batch-size 250 \
  ...
Or add delays in custom scripts:
await new Promise(resolve => setTimeout(resolve, 500));

Property filter returns no results

Ensure the property name matches exactly (names are case sensitive):
# Correct
--property Environment=production

# Wrong (case sensitive)
--property environment=production
Check that the property exists in your data:
  1. Go to Helicone dashboard
  2. View a request
  3. Check exact property names

Automated Exports

Schedule regular exports:

Cron Job (Linux/Mac)

# Add to crontab (crontab -e)
# Run daily at 2 AM
0 2 * * * cd /path/to/project && HELICONE_API_KEY=sk-xxx npx @helicone/export --start-date $(date -d "yesterday" +\%Y-\%m-\%d) --output backups/daily-$(date +\%Y-\%m-\%d).jsonl

GitHub Actions

name: Daily Helicone Backup

on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM UTC

jobs:
  export:
    runs-on: ubuntu-latest
    steps:
      - name: Export Helicone data
        env:
          HELICONE_API_KEY: ${{ secrets.HELICONE_API_KEY }}
        run: |
          npx @helicone/export \
            --start-date $(date -d "yesterday" +%Y-%m-%d) \
            --format jsonl \
            --output backup-$(date +%Y-%m-%d).jsonl
      
      - name: Upload to S3
        uses: aws-actions/aws-cli@v2
        with:
          args: s3 cp backup-$(date +%Y-%m-%d).jsonl s3://my-backups/helicone/

Next Steps

Query API Docs

Full API documentation for queries

Fine-Tuning Prep

Use exported data for fine-tuning

Custom Properties

Add metadata for better filtering

Sessions

Export complete workflows
