Overview

ADK-TS includes comprehensive OpenTelemetry integration for observing agent behavior, tracking performance, and debugging issues in production. The telemetry system captures traces, metrics, and events across the entire agent lifecycle.

Quick Start

import { telemetryService } from '@iqai/adk';

// Initialize with OTLP endpoint
await telemetryService.initialize({
  appName: 'my-agent-app',
  appVersion: '1.0.0',
  otlpEndpoint: 'http://localhost:4318/v1/traces',
  enableTracing: true,
  enableMetrics: true,
});

// Your agent code here
const agent = new AgentBuilder()
  .withModel('gpt-4')
  .buildLlm();

await agent.ask('Hello!');

// Shutdown gracefully
await telemetryService.shutdown();

Telemetry Configuration

interface TelemetryConfig {
  // Application identity
  appName: string;              // Service name
  appVersion?: string;          // Service version
  environment?: string;         // 'development' | 'production' | 'staging'
  
  // OTLP exporter
  otlpEndpoint?: string;        // OTLP endpoint URL
  otlpHeaders?: Record<string, string>; // Custom headers
  
  // Feature flags
  enableTracing?: boolean;      // Enable distributed tracing
  enableMetrics?: boolean;      // Enable metrics collection
  enableAutoInstrumentation?: boolean; // Auto-instrument HTTP, etc.
  
  // Sampling and performance
  samplingRatio?: number;       // 0.0 to 1.0 (default: 1.0)
  metricExportIntervalMs?: number; // Metric export interval
  
  // Resource attributes
  resourceAttributes?: Record<string, any>;
  
  // Debug mode
  debug?: boolean;              // Enable in-memory span collection
}
Source: packages/adk/src/telemetry/types.ts

Example Configurations

await telemetryService.initialize({
  appName: 'my-agent',
  appVersion: '0.1.0',
  environment: 'development',
  enableTracing: true,
  enableMetrics: true,
  debug: true, // Enable in-memory debugging
});
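A production-leaning counterpart might look like the sketch below. The endpoint variable, region attribute, and interval values are illustrative assumptions, not ADK defaults:

```typescript
// Hypothetical production configuration (values illustrative)
await telemetryService.initialize({
  appName: 'my-agent',
  appVersion: '1.0.0',
  environment: 'production',
  otlpEndpoint: process.env.OTLP_ENDPOINT,
  enableTracing: true,
  enableMetrics: true,
  samplingRatio: 0.1,              // keep 10% of traces to control cost
  metricExportIntervalMs: 60_000,  // export metrics once a minute
  resourceAttributes: {
    'deployment.region': process.env.REGION ?? 'unknown',
  },
});
```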

Tracing

The tracing system automatically captures agent operations:

Automatic Traces

ADK-TS automatically traces:

✅ LLM Calls: Model requests/responses with token counts
✅ Tool Executions: Function calls with arguments and results
✅ Agent Invocations: Complete agent runs
✅ Agent Transfers: Multi-agent handoffs
✅ Memory Operations: Search and insert operations
✅ Plugin Hooks: Plugin lifecycle events

Source: packages/adk/src/telemetry/tracing.ts:27

Trace Attributes

All traces include standard OpenTelemetry GenAI semantic conventions:
// LLM traces include:
{
  'gen_ai.provider.name': 'openai',
  'gen_ai.operation.name': 'chat',
  'gen_ai.request.model': 'gpt-4',
  'gen_ai.response.model': 'gpt-4',
  'gen_ai.usage.input_tokens': 150,
  'gen_ai.usage.output_tokens': 75,
  'gen_ai.response.finish_reasons': ['stop'],
  
  // ADK-specific attributes
  'adk.session_id': 'session_abc123',
  'adk.user_id': 'user_456',
  'adk.agent_name': 'MyAgent',
  'adk.environment': 'production',
}
Source: packages/adk/src/telemetry/tracing.ts:220

Custom Spans

import { telemetryService } from '@iqai/adk';

// Wrap async operations
await telemetryService.withSpan(
  'custom_operation',
  async (span) => {
    // Your code here
    span.setAttribute('custom.attribute', 'value');
    
    const result = await performOperation();
    
    span.addEvent('operation_completed', {
      result_count: result.length,
    });
    
    return result;
  },
  {
    // Initial attributes
    'operation.type': 'data_processing',
  }
);

Async Generator Tracing

// Trace streaming operations
async function* processStream() {
  for await (const item of dataStream) {
    yield item;
  }
}

const tracedStream = telemetryService.traceAsyncGenerator(
  'process_stream',
  processStream(),
  { stream_type: 'data' }
);

for await (const item of tracedStream) {
  console.log(item);
}
Source: packages/adk/src/telemetry/tracing.ts:373

Metrics

The metrics system tracks quantitative data:

Automatic Metrics

✅ LLM Token Usage: Input/output tokens by model
✅ LLM Call Count: Total LLM invocations
✅ LLM Duration: Request latency
✅ Tool Call Count: Function execution frequency
✅ Error Count: Failures by category

Source: packages/adk/src/telemetry/metrics.ts

Recording Custom Metrics

import { telemetryService } from '@iqai/adk';

// Record LLM tokens
telemetryService.recordLlmTokens(
  100, // promptTokens
  50,  // completionTokens
  {
    model: 'gpt-4',
    agentName: 'MyAgent',
    environment: 'production',
    status: 'success',
  }
);

// Record LLM call
telemetryService.recordLlmCall({
  model: 'gpt-4',
  agentName: 'MyAgent',
  environment: 'production',
  status: 'success',
});

// Record LLM duration
telemetryService.recordLlmDuration(1500, { // milliseconds
  model: 'gpt-4',
  agentName: 'MyAgent',
  status: 'success',
});

// Record tool call
telemetryService.recordToolCall(
  'web_search', // toolName
  {
    agentName: 'SearchAgent',
    status: 'success',
  }
);

// Record errors
telemetryService.recordError(
  'llm',                 // category
  'rate_limit_exceeded', // errorType
);

Metric Dimensions

All metrics support dimensions for filtering and aggregation:
interface MetricAttributes {
  model?: string;           // LLM model name
  agentName?: string;       // Agent name
  environment?: string;     // Environment
  status?: 'success' | 'error';
  toolName?: string;        // Tool name
  errorType?: string;       // Error type
}
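These dimensions are what your metrics backend filters and aggregates on. The helper below is a hypothetical illustration (not part of the ADK) of how a backend groups recorded points by one attribute and sums their values:

```typescript
// Illustrative: group metric points by a dimension and sum their values,
// the way a metrics backend aggregates (e.g. token usage per model).
type Point = { value: number; attrs: Record<string, string> };

function sumBy(points: Point[], key: string): Record<string, number> {
  const out: Record<string, number> = {};
  for (const p of points) {
    const k = p.attrs[key] ?? 'unknown';
    out[k] = (out[k] ?? 0) + p.value;
  }
  return out;
}
```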

Content Capture

Control whether to capture full request/response content:
// Enable content capture (default: disabled)
process.env.ADK_TELEMETRY_CAPTURE_CONTENT = 'true';

await telemetryService.initialize({ ... });

// When enabled, traces include:
// - Full LLM prompts
// - Complete LLM responses
// - Tool arguments
// - Tool results
Privacy Warning: Content capture may log sensitive data. Only enable in non-production environments or ensure data is properly sanitized.
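One way to honor this warning is to derive the flag from the runtime environment instead of hard-coding it. The helper name below is illustrative, not part of the ADK:

```typescript
// Sketch: never capture prompt/response content in production.
// shouldCaptureContent is a hypothetical helper, not an ADK export.
function shouldCaptureContent(env: string | undefined): boolean {
  return env !== 'production';
}

process.env.ADK_TELEMETRY_CAPTURE_CONTENT = String(
  shouldCaptureContent(process.env.NODE_ENV)
);
```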

OpenTelemetry Backends

Jaeger (Development)

# Start Jaeger
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest

await telemetryService.initialize({
  appName: 'my-agent',
  otlpEndpoint: 'http://localhost:4318/v1/traces',
  enableTracing: true,
});
View traces at: http://localhost:16686

Honeycomb

await telemetryService.initialize({
  appName: 'my-agent',
  otlpEndpoint: 'https://api.honeycomb.io/v1/traces',
  otlpHeaders: {
    'x-honeycomb-team': process.env.HONEYCOMB_API_KEY!,
    'x-honeycomb-dataset': 'my-agent-dataset',
  },
  enableTracing: true,
  enableMetrics: true,
});

Datadog

// Requires Datadog Agent with OTLP enabled
await telemetryService.initialize({
  appName: 'my-agent',
  otlpEndpoint: 'http://localhost:4318/v1/traces',
  resourceAttributes: {
    'service.namespace': 'ai-agents',
  },
  enableTracing: true,
});

New Relic

await telemetryService.initialize({
  appName: 'my-agent',
  otlpEndpoint: 'https://otlp.nr-data.net:4318/v1/traces',
  otlpHeaders: {
    'api-key': process.env.NEW_RELIC_LICENSE_KEY!,
  },
  enableTracing: true,
});

Grafana Cloud

await telemetryService.initialize({
  appName: 'my-agent',
  otlpEndpoint: 'https://otlp-gateway-prod-us-central-0.grafana.net/otlp/v1/traces',
  otlpHeaders: {
    'Authorization': `Basic ${Buffer.from(
      `${process.env.GRAFANA_INSTANCE_ID}:${process.env.GRAFANA_API_KEY}`
    ).toString('base64')}`,
  },
  enableTracing: true,
  enableMetrics: true,
});

Debugging with In-Memory Exporter

import { telemetryService } from '@iqai/adk';

await telemetryService.initialize({
  appName: 'my-agent',
  debug: true, // Enables in-memory span collection
});

// Run your agent
await agent.ask('Hello');

// Inspect captured spans
const exporter = telemetryService.getInMemoryExporter();
const spans = exporter.getFinishedSpans();

for (const span of spans) {
  console.log('Span:', span.name);
  console.log('Attributes:', span.attributes);
  console.log('Duration:', span.duration);
}

// Clear spans
exporter.reset();
Source: packages/adk/src/telemetry/in-memory-exporter.ts

Advanced Tracing Patterns

Error Tracing

import { telemetryService } from '@iqai/adk';

try {
  await riskyOperation();
} catch (error) {
  telemetryService.traceError(
    error as Error,
    'tool_error',
    true,  // recoverable
    true,  // retry recommended
  );
  
  throw error;
}
Source: packages/adk/src/telemetry/tracing.ts:598

Memory Operations

telemetryService.traceMemoryOperation(
  'search',           // operation
  'session_123',      // sessionId
  'user preferences', // query
  5,                  // resultsCount
  invocationContext,
);
Source: packages/adk/src/telemetry/tracing.ts:633

Agent Transfers

telemetryService.traceAgentTransfer(
  'MainAgent',                      // sourceAgent
  'SpecialistAgent',                // targetAgent
  ['RootAgent', 'MainAgent'],       // transferChain
  2,                                // transferDepth
  'Requires specialized knowledge', // reason
  invocationContext,
);
Source: packages/adk/src/telemetry/tracing.ts:507

Sampling Strategies

Control trace volume with sampling:
// Sample 10% of traces
await telemetryService.initialize({
  appName: 'my-agent',
  samplingRatio: 0.1,
});

// Sample based on conditions (illustrative pattern — this class is not
// part of the SDK and must be wired into your own instrumentation)
class SmartSampler {
  shouldSample(context: InvocationContext): boolean {
    // Always sample errors
    if (context.hasError) {
      return true;
    }
    
    // Always sample slow requests
    if (context.durationMs > 5000) {
      return true;
    }
    
    // Sample 1% of normal requests
    return Math.random() < 0.01;
  }
}
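For context, OpenTelemetry's standard ratio sampler decides deterministically from the trace id rather than from `Math.random()`, so all spans of one trace get the same verdict. A rough self-contained sketch of that decision (the helper name and hashing detail are simplifying assumptions, not the SDK's exact algorithm):

```typescript
// Sketch of a trace-id ratio sampling decision: map the trace id to a
// value in [0, 1) and keep the trace when it falls under the ratio.
function shouldSampleTrace(traceId: string, ratio: number): boolean {
  // Use the low 8 hex chars of the 32-char trace id as a fraction.
  const value = parseInt(traceId.slice(-8), 16) / 0x100000000;
  return value < ratio;
}
```

Because the input is the trace id, a child span in another service reaches the same decision as its parent, which keeps traces complete under sampling.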

Best Practices

  1. Initialize Early: Set up telemetry before creating agents
  2. Use Structured Attributes: Add meaningful dimensions to traces/metrics
  3. Sample Appropriately: Balance observability with cost
  4. Handle Shutdown: Always call telemetryService.shutdown() on exit
  5. Secure Credentials: Never log API keys or tokens
  6. Monitor Performance: Watch for telemetry overhead
  7. Test Locally: Use Jaeger for development
Telemetry adds minimal overhead (less than 1% in production). The debug mode with in-memory export has higher overhead and should only be used in development.

Graceful Shutdown

import { telemetryService } from '@iqai/adk';

process.on('SIGTERM', async () => {
  console.log('Shutting down...');
  
  // Flush pending telemetry
  await telemetryService.flush();
  
  // Shutdown telemetry
  await telemetryService.shutdown();
  
  process.exit(0);
});
Source: packages/adk/src/telemetry/setup.ts:340

Next Steps

Flows & Processors

Build custom request/response processors

Examples

See telemetry in action
