Observability

Overview

Gorkie's observability stack has three parts:

OpenTelemetry - Distributed tracing
Langfuse - AI-specific observability
Pino - Structured logging

OpenTelemetry Setup

OpenTelemetry is initialized at application startup:

server/index.ts

import { LangfuseSpanProcessor } from '@langfuse/otel';
import { NodeSDK } from '@opentelemetry/sdk-node';

const sdk = new NodeSDK({
  spanProcessors: [new LangfuseSpanProcessor()],
});

sdk.start();

Key Features:

Automatic trace propagation across async operations
Spans are exported to Langfuse for analysis
Graceful shutdown on process exit
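
The graceful-shutdown point can be sketched roughly as follows. This is a minimal illustration, not Gorkie's actual code: createShutdownHandler is a hypothetical helper, and the Shutdownable interface stands in for NodeSDK, whose shutdown() method returns a promise.

```typescript
// Minimal shape this sketch depends on; NodeSDK is expected to satisfy it.
interface Shutdownable {
  shutdown(): Promise<void>;
}

// Builds an idempotent handler: repeated signals reuse the first shutdown
// promise instead of calling shutdown() again.
function createShutdownHandler(
  sdk: Shutdownable,
  exit: (code: number) => void
): () => Promise<void> {
  let pending: Promise<void> | undefined;
  return () => {
    if (!pending) {
      pending = sdk
        .shutdown()
        .then(() => exit(0))
        .catch(() => exit(1));
    }
    return pending;
  };
}

// Hypothetical wiring in server/index.ts:
// const onSignal = createShutdownHandler(sdk, (code) => process.exit(code));
// process.once('SIGTERM', onSignal);
// process.once('SIGINT', onSignal);
```

Injecting the exit function keeps the helper testable and makes the exit code explicit at the call site.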

Error Handling

Unhandled promise rejections are logged. Uncaught exceptions are logged, then telemetry is flushed before the process exits with code 1:

server/index.ts

process.on('unhandledRejection', (reason) => {
  logger.error({ error: reason }, 'Unhandled promise rejection');
});

process.on('uncaughtException', (error) => {
  logger.error({ error }, 'Uncaught exception');
  sdk
    .shutdown()
    .catch((shutdownError: unknown) => {
      logger.error(
        { error: shutdownError },
        'Failed to shutdown telemetry after uncaught exception'
      );
    })
    .finally(() => {
      process.exit(1);
    });
});

Always shut the telemetry SDK down gracefully so that buffered spans are flushed; otherwise traces can be lost.

Langfuse Integration

Langfuse provides AI-specific observability:

Automatic Tracing

AI SDK operations are traced once telemetry is enabled on the agent:

server/lib/ai/agents/orchestrator.ts

export const orchestratorAgent = ({ context, requestHints, files, stream }) =>
  new ToolLoopAgent({
    model: provider.languageModel('chat-model'),
    // ...
    experimental_telemetry: {
      isEnabled: true,
      functionId: 'orchestrator',
    },
  });

What’s Captured:

Model calls (prompt, completion, tokens)
Tool executions (input, output, duration)
Agent reasoning steps
Error traces

Environment Variables

Configure Langfuse with these environment variables:

LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com
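
A startup check can catch missing keys before traces silently fail to export. The sketch below is a hypothetical helper, not part of Gorkie or the Langfuse SDK, and it assumes LANGFUSE_HOST is optional.

```typescript
// Variables the section above asks for. Treating LANGFUSE_HOST as optional
// is an assumption of this sketch.
const REQUIRED_LANGFUSE_VARS = [
  'LANGFUSE_PUBLIC_KEY',
  'LANGFUSE_SECRET_KEY',
] as const;

// Returns the names of required variables that are unset or blank.
function missingLangfuseVars(
  env: Record<string, string | undefined>
): string[] {
  return REQUIRED_LANGFUSE_VARS.filter((name) => !env[name]?.trim());
}

// Hypothetical usage at startup:
// const missing = missingLangfuseVars(process.env);
// if (missing.length > 0) {
//   logger.warn({ missing }, 'Langfuse is not fully configured');
// }
```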

Viewing Traces

1. Navigate to the Langfuse dashboard
2. Select your project
3. View traces grouped by:
   - Session - Full conversation thread
   - User - Specific Slack user
   - Trace - Individual message handling

Langfuse automatically groups traces by sessionId (derived from the Slack thread), which makes it easy to debug a full conversation.
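
One way a thread-based session ID could be attached is sketched below. Both helpers are hypothetical. The ID format mirrors the threadId shape shown in the log examples on this page, and passing sessionId and userId through telemetry metadata is an assumption that should be checked against the Langfuse version in use.

```typescript
// Hypothetical helper: joins the Slack channel ID and thread timestamp into
// one stable key, for example "C12345-1234567890.123456".
function toSessionId(channelId: string, threadTs: string): string {
  return `${channelId}-${threadTs}`;
}

// Builds telemetry options for an agent. Carrying sessionId and userId in
// `metadata` is an assumption of this sketch, not a confirmed Gorkie detail.
function buildTelemetry(
  functionId: string,
  channelId: string,
  threadTs: string,
  userId: string
) {
  return {
    isEnabled: true,
    functionId,
    metadata: { sessionId: toSessionId(channelId, threadTs), userId },
  };
}

// Hypothetical usage:
// experimental_telemetry: buildTelemetry('orchestrator', channel, threadTs, user),
```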

Structured Logging with Pino

Gorkie uses Pino for high-performance structured logging:

server/lib/logger.ts

import pino from 'pino';

const logger = pino(
  {
    level: logLevel,
    timestamp: pino.stdTimeFunctions.isoTime,
    serializers: { err: pino.stdSerializers.err },
  },
  transport
);

export default logger;

Log Outputs

Logs are written to multiple destinations:

Production:

logs/app.log - File output
stdout - Console output (for container logs)

Development:

logs/app.log - File output
pino-pretty - Pretty-printed console output

server/lib/logger.ts

const targets: TransportTargetOptions[] = [];

targets.push({
  target: 'pino/file',
  options: { destination: path.join(logDir, 'app.log') },
  level: logLevel,
});

if (isProd) {
  targets.push({
    target: 'pino/file',
    options: { destination: 1 }, // stdout
    level: logLevel,
  });
} else {
  targets.push({
    target: 'pino-pretty',
    options: {
      colorize: true,
      translateTime: 'yyyy-mm-dd HH:MM:ss.l o',
      ignore: 'pid,hostname,ctxId',
      messageFormat: '{if ctxId}[{ctxId}] {end}{msg}',
    },
    level: logLevel,
  });
}

Log Levels

Configure log level via environment variable:

LOG_LEVEL=info  # debug | info | warn | error
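
The value of LOG_LEVEL can be validated before it reaches the logger. The sketch below is a hypothetical helper; the numeric values are Pino's standard level numbers, and falling back to info for unknown input is an assumption.

```typescript
// Pino's numeric values for the four levels listed above.
const LEVEL_VALUES = { debug: 20, info: 30, warn: 40, error: 50 } as const;
type LogLevel = keyof typeof LEVEL_VALUES;

// Normalizes a raw LOG_LEVEL value. Unknown or missing values fall back to
// 'info' (the fallback is an assumption of this sketch).
function resolveLogLevel(raw: string | undefined): LogLevel {
  const candidate = (raw ?? '').trim().toLowerCase();
  const known = Object.prototype.hasOwnProperty.call(LEVEL_VALUES, candidate);
  return known ? (candidate as LogLevel) : 'info';
}

// Hypothetical usage:
// const logLevel = resolveLogLevel(process.env.LOG_LEVEL);
```

The numeric values matter when filtering JSON logs, because each record stores the level as a number rather than a name.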

Structured Context

Always include relevant context in logs:

logger.info(
  { threadId, sandboxId, sessionId, template },
  'Created sandbox'
);

logger.error(
  { err: error, channel: channelId }, // `err` matches the configured serializer
  'Failed to send message'
);

logger.debug(
  { ctxId, message: `${authorName}: ${content}` },
  `Triggered by ${trigger.type}`
);

Best Practices:

Use structured fields (objects) instead of string interpolation
Include ctxId or threadId for correlation
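Log exceptions under the err key, or spread toLogError(error) into the fields; with the serializer configuration shown above, only err is serialized automatically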
Keep messages concise and action-oriented

Example Log Output

Development (pino-pretty):

2026-03-01 10:30:45.123 +00:00 INFO [C12345-1234567890.123456]: Created sandbox
    threadId: "C12345-1234567890.123456"
    sandboxId: "sb_abc123"
    sessionId: "ses_xyz789"
    template: "gorkie-sandbox:1.1.0"

Production (JSON):

{
  "level": 30,
  "time": "2026-03-01T10:30:45.123Z",
  "msg": "Created sandbox",
  "threadId": "C12345-1234567890.123456",
  "sandboxId": "sb_abc123",
  "sessionId": "ses_xyz789",
  "template": "gorkie-sandbox:1.1.0"
}

Error Handling Patterns

Use the same error-handling pattern throughout the codebase:

import { toLogError } from '~/utils/error';

try {
  await riskyOperation();
} catch (error) {
  logger.error(
    { ...toLogError(error), ctxId, additionalContext },
    'Operation failed'
  );
  throw error; // Re-throw if caller should handle
}

The toLogError utility safely extracts error information:

export function toLogError(error: unknown): {
  error?: string;
  stack?: string;
} {
  if (error instanceof Error) {
    return {
      error: error.message,
      stack: error.stack,
    };
  }
  return {
    error: String(error),
  };
}
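
The short example below shows what the utility returns for the two kinds of thrown value. The function body is copied from the definition above so the example is self-contained; the sample messages are made up.

```typescript
// Copied from the definition above.
function toLogError(error: unknown): { error?: string; stack?: string } {
  if (error instanceof Error) {
    return { error: error.message, stack: error.stack };
  }
  return { error: String(error) };
}

// Thrown values are not always Error instances. Both cases yield plain,
// JSON-safe fields that can be spread into a log call.
const fromError = toLogError(new Error('connection reset'));
const fromString = toLogError('quota exceeded');

console.log(fromError.error); // connection reset
console.log(typeof fromError.stack); // string
console.log(fromString); // { error: 'quota exceeded' }
```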

Monitoring Best Practices

1. Context Propagation

Always pass ctxId through the call stack:

const ctxId = getContextId(context);

logger.info({ ctxId }, 'Starting operation');
await performTask(ctxId);
logger.info({ ctxId }, 'Operation complete');

This enables:

Filtering logs by conversation thread
Correlating events across async operations
Debugging specific user issues

2. Trace Important Operations

Log key lifecycle events:

logger.info({ ctxId }, 'Message received');
logger.debug({ ctxId, trigger: trigger.type }, 'Triggered by mention');
logger.info({ ctxId, toolCalls: toolCalls.length }, 'AI processing complete');
logger.info({ ctxId, duration: Date.now() - start }, 'Response sent');
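
The duration field in the last call needs a start time captured earlier in the handler. A small helper, hypothetical and not part of Gorkie, keeps that bookkeeping in one place:

```typescript
// Captures a start time and returns a function that reports elapsed
// milliseconds. The clock is injectable so the helper can be tested
// deterministically.
function startTimer(now: () => number = Date.now): () => number {
  const start = now();
  return () => now() - start;
}

// Hypothetical usage inside a message handler:
// const elapsed = startTimer();
// ... handle the message ...
// logger.info({ ctxId, duration: elapsed() }, 'Response sent');
```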

3. Monitor Resource Usage

Log sandbox lifecycle for cost monitoring:

logger.info(
  { threadId, sandboxId, action: 'created' },
  'Sandbox lifecycle event'
);
logger.info(
  { threadId, sandboxId, action: 'paused' },
  'Sandbox lifecycle event'
);
logger.info(
  { threadId, sandboxId, action: 'deleted', reason: 'expired' },
  'Sandbox lifecycle event'
);
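
Because these events share one message and differ only in their fields, a typed wrapper can keep them consistent. This is a sketch: logSandboxEvent and the InfoLogger interface are hypothetical, and the action list contains only the three values shown above.

```typescript
type SandboxAction = 'created' | 'paused' | 'deleted';

interface SandboxLifecycleFields {
  threadId: string;
  sandboxId: string;
  action: SandboxAction;
  reason?: string; // for example 'expired'
}

// Only the logger method this sketch needs.
interface InfoLogger {
  info(fields: object, msg: string): void;
}

// Emits every lifecycle event with the same message so that log queries can
// match on msg and group by action.
function logSandboxEvent(
  logger: InfoLogger,
  fields: SandboxLifecycleFields
): void {
  logger.info(fields, 'Sandbox lifecycle event');
}

// Hypothetical usage:
// logSandboxEvent(logger, { threadId, sandboxId, action: 'paused' });
```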

4. Alert on Critical Errors

Set up alerts for:

Unhandled exceptions
Sandbox creation failures
Database connection errors
Rate limit exhaustion
E2B API errors

Debugging Tips

Find All Logs for a Thread

grep -F 'C12345-1234567890.123456' logs/app.log | jq .

Filter by Log Level

cat logs/app.log | jq 'select(.level >= 50)' # error (50) and fatal (60)

Track Sandbox Lifecycle

cat logs/app.log | jq 'select(.sandboxId == "sb_abc123")'

Monitor Tool Execution

cat logs/app.log | jq 'select(.msg | contains("tool"))'
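
The same filters can be applied in code when jq is not available, for example in a script that post-processes logs. The sketch below assumes the JSON-lines format shown under Example Log Output; all names in it are hypothetical.

```typescript
// Shape of the fields this sketch reads; real records carry more keys.
interface LogRecord {
  level: number;
  msg?: string;
  sandboxId?: string;
  [key: string]: unknown;
}

// Parses newline-delimited JSON, skipping blank or malformed lines.
function parseLogLines(text: string): LogRecord[] {
  const records: LogRecord[] = [];
  for (const line of text.split('\n')) {
    if (!line.trim()) continue;
    try {
      const parsed: unknown = JSON.parse(line);
      if (
        typeof parsed === 'object' &&
        parsed !== null &&
        typeof (parsed as LogRecord).level === 'number'
      ) {
        records.push(parsed as LogRecord);
      }
    } catch {
      // Not valid JSON (for example a truncated line); ignore it.
    }
  }
  return records;
}

// Equivalent of: jq 'select(.level >= 50)'
const errorsOnly = (records: LogRecord[]) =>
  records.filter((r) => r.level >= 50);

// Equivalent of: jq 'select(.sandboxId == "sb_abc123")'
const forSandbox = (records: LogRecord[], sandboxId: string) =>
  records.filter((r) => r.sandboxId === sandboxId);
```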

Performance Metrics

Key metrics to track in production:

Metric	Description	Target
Response Time	Time from message to reply	< 5s
Sandbox Creation	Time to create new sandbox	< 30s
Sandbox Resume	Time to resume paused sandbox	< 5s
Tool Execution	Time per tool call	< 3s
Memory Usage	Application memory footprint	< 512MB
Active Sandboxes	Number of running sandboxes	< 50
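
The targets in the table can be encoded as thresholds and checked against observed values. This is a minimal sketch with hypothetical names; how the observed values are collected is outside its scope.

```typescript
// Upper bounds taken from the table above (seconds converted to milliseconds).
const TARGETS = {
  responseTimeMs: 5_000,
  sandboxCreationMs: 30_000,
  sandboxResumeMs: 5_000,
  toolExecutionMs: 3_000,
  memoryMb: 512,
  activeSandboxes: 50,
} as const;

type MetricName = keyof typeof TARGETS;

// Returns the metrics whose observed value meets or exceeds its target.
// Metrics that were not observed are skipped.
function breachedTargets(
  observed: Partial<Record<MetricName, number>>
): MetricName[] {
  return (Object.keys(TARGETS) as MetricName[]).filter((name) => {
    const value = observed[name];
    return value !== undefined && value >= TARGETS[name];
  });
}

// Hypothetical usage:
// const breached = breachedTargets({ responseTimeMs: 6200, memoryMb: 300 });
// if (breached.length > 0) logger.warn({ breached }, 'Performance target missed');
```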

Use Langfuse’s analytics dashboard to track AI-specific metrics like token usage, cost per conversation, and tool success rates.

Troubleshooting

Missing Traces

If traces aren’t appearing in Langfuse:

Verify that LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are set
Check that the SDK is initialized before the first AI call
Ensure the SDK is shut down on exit so that buffered spans are flushed
Look for SDK errors in the logs

Log File Size

If log files grow too large:

Implement log rotation (use pino-roll or external tool)
Raise LOG_LEVEL to warn or error to reduce log volume
Filter verbose libraries (Slack SDK, Drizzle)
Set up external log aggregation (CloudWatch, Datadog)
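
For the pino-roll option, the file target could be swapped for a rotating one along these lines. This is a sketch under assumptions: rotatingFileTarget is a hypothetical helper, pino-roll must be installed separately, and the size and retention values are illustrative.

```typescript
// Builds a transport target entry for pino-roll, intended to replace the
// plain 'pino/file' target that writes app.log.
function rotatingFileTarget(logDir: string, level: string) {
  return {
    target: 'pino-roll',
    level,
    options: {
      file: `${logDir}/app`, // base path; pino-roll adds a rotation suffix
      frequency: 'daily', // start a new file each day
      size: '10m', // also rotate when the current file reaches 10 MB
      limit: { count: 7 }, // keep 7 rotated files besides the active one
      mkdir: true, // create the directory if it does not exist
    },
  };
}

// Hypothetical usage in server/lib/logger.ts:
// targets.push(rotatingFileTarget(logDir, logLevel));
```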

Context Missing

If logs are missing ctxId:

Ensure getContextId(context) is called early
Pass ctxId to all child functions
Add ctxId to child logger: logger.child({ ctxId })
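
The child-logger approach can be sketched as follows. The ContextLogger interface is a hypothetical structural stand-in for the parts of Pino used here; a real Pino logger is expected to satisfy it, since logger.child(bindings) returns another logger.

```typescript
// Minimal structural type for the logger methods this sketch uses.
interface ContextLogger {
  info(fields: object, msg: string): void;
  child(bindings: Record<string, unknown>): ContextLogger;
}

// Binds ctxId once so every later call on the returned logger includes it.
function withContext(logger: ContextLogger, ctxId: string): ContextLogger {
  return logger.child({ ctxId });
}

// Hypothetical usage:
// const log = withContext(logger, getContextId(context));
// log.info({ step: 'start' }, 'Starting operation'); // record includes ctxId
```

Passing the bound logger to child functions, instead of the bare ctxId, removes the need to repeat the field in each call.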
