Overview
Langfuse is a comprehensive observability platform designed specifically for LLM applications. It captures and analyzes:
- Agent Interactions: Complete traces of AI agent conversations and decision-making
- Token Analytics: Detailed token usage and cost tracking across all models
- Performance Metrics: Response times, latency, and throughput analysis
- Model Comparison: Side-by-side comparison of different LLM providers
- Error Tracking: Comprehensive error logging and debugging information
Architecture
The Langfuse stack consists of several components:
- Langfuse Web: Frontend UI for visualization and analysis (port 4000)
- Langfuse Worker: Background processing for analytics and ingestion
- PostgreSQL: Primary database for metadata and traces
- ClickHouse: High-performance analytics database for metrics
- Redis: Caching and rate limiting
- MinIO: S3-compatible storage for event logs and media
Setup
Configuration
Environment Variables
Key configuration options for Langfuse:

| Variable | Description | Default |
|---|---|---|
| LANGFUSE_LISTEN_PORT | Web UI port | 4000 |
| LANGFUSE_NEXTAUTH_URL | Public URL for authentication | http://localhost:4000 |
| LANGFUSE_SALT | Salt for hashing | myglobalsalt |
| LANGFUSE_ENCRYPTION_KEY | Encryption key (32 bytes, hex-encoded) | Required |
| LANGFUSE_TELEMETRY_ENABLED | Enable usage telemetry | false |
| LANGFUSE_READ_FROM_CLICKHOUSE_ONLY | Serve reads from ClickHouse | true |
| LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES | Enable beta features | true |
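Taken together, a minimal `.env` fragment might look like the following (values are illustrative; generate your own salt and encryption key rather than reusing these):

```env
LANGFUSE_LISTEN_PORT=4000
LANGFUSE_NEXTAUTH_URL=http://localhost:4000
LANGFUSE_SALT=replace-with-a-random-salt
# Generate with: openssl rand -hex 32
LANGFUSE_ENCRYPTION_KEY=replace-with-32-bytes-of-hex
LANGFUSE_TELEMETRY_ENABLED=false
LANGFUSE_READ_FROM_CLICKHOUSE_ONLY=true
LANGFUSE_ENABLE_EXPERIMENTAL_FEATURES=true
```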
OpenTelemetry Integration
To integrate Langfuse with the observability stack, enable OTLP export in `.env`:
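A sketch of the relevant `.env` entries, using the standard OpenTelemetry SDK environment variables (the collector hostname and port are assumptions; point them at your own OTLP endpoint):

```env
# Standard OpenTelemetry exporter settings (hostname is an assumption)
OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4318
OTEL_SERVICE_NAME=langfuse
```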
Storage Configuration
Langfuse uses MinIO for S3-compatible storage, configured in `.env`:
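A sketch of the storage settings, based on Langfuse's S3 event-upload variables (credentials and the `minio` hostname are placeholders; verify the exact variable names against your Langfuse version):

```env
# S3-compatible event storage backed by MinIO (values are placeholders)
LANGFUSE_S3_EVENT_UPLOAD_BUCKET=langfuse
LANGFUSE_S3_EVENT_UPLOAD_ENDPOINT=http://minio:9000
LANGFUSE_S3_EVENT_UPLOAD_ACCESS_KEY_ID=minio
LANGFUSE_S3_EVENT_UPLOAD_SECRET_ACCESS_KEY=change-me
LANGFUSE_S3_EVENT_UPLOAD_REGION=auto
LANGFUSE_S3_EVENT_UPLOAD_FORCE_PATH_STYLE=true
```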
Usage
Viewing Traces
- Navigate to the Traces tab in Langfuse UI
- Browse recent AI agent interactions
- Click on a trace to see detailed conversation flow
- Examine tool calls, model responses, and token usage
Analyzing Performance
- Go to the Analytics dashboard
- Review token consumption by model and agent type
- Identify slow requests and optimize accordingly
- Track costs across different LLM providers
Debugging Issues
- Use Filters to isolate problematic traces
- Search by error messages or status codes
- Review full request/response payloads
- Export traces for offline analysis
Features
LLM Tracing
Automatic capture of:
- User prompts and system messages
- Model completions and reasoning
- Function/tool calls and results
- Token counts and costs
- Latency and performance metrics
Analytics Dashboard
Real-time insights including:
- Token usage trends over time
- Cost analysis by model and agent
- Request volume and throughput
- Error rates and failure patterns
- Model performance comparison
Prompt Management
Version control for prompts:
- Store and track prompt templates
- Compare different prompt versions
- A/B test prompt variations
- Roll back to previous versions
Dataset Evaluation
Test and validate LLM outputs:
- Create evaluation datasets
- Run batch evaluations
- Track model accuracy over time
- Compare results across models
Services
Langfuse Web
The main web interface, running on port 4000 and defined in docker-compose-langfuse.yml:
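A rough sketch of the service definition (image tag, service names, and port mapping are assumptions; consult the actual docker-compose-langfuse.yml in your deployment):

```yaml
services:
  langfuse-web:
    image: langfuse/langfuse:3      # assumed image tag
    env_file: .env
    ports:
      - "4000:4000"                 # matches LANGFUSE_LISTEN_PORT
    depends_on:
      - postgres
      - clickhouse
      - redis
      - minio
```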
Langfuse Worker
Background processing service, defined in docker-compose-langfuse.yml:
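A sketch of the worker definition (image tag and service names are assumptions; the worker shares the web container's configuration but exposes no port):

```yaml
services:
  langfuse-worker:
    image: langfuse/langfuse-worker:3   # assumed image tag
    env_file: .env
    depends_on:
      - postgres
      - clickhouse
      - redis
```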
PostgreSQL Database
Stores traces and metadata; defined in docker-compose-langfuse.yml:
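A sketch of the database service (image version, credentials, and volume name are placeholders; change the password before deploying):

```yaml
services:
  postgres:
    image: postgres:16              # assumed version
    environment:
      POSTGRES_USER: langfuse
      POSTGRES_PASSWORD: change-me  # placeholder, change in production
      POSTGRES_DB: langfuse
    volumes:
      - langfuse-postgres:/var/lib/postgresql/data

volumes:
  langfuse-postgres:
```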
ClickHouse Database
High-performance analytics storage; defined in docker-compose-langfuse.yml:
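A sketch of the ClickHouse service (image tag, credentials, and volume name are placeholders):

```yaml
services:
  clickhouse:
    image: clickhouse/clickhouse-server:24   # assumed version
    environment:
      CLICKHOUSE_USER: clickhouse
      CLICKHOUSE_PASSWORD: change-me         # placeholder
    volumes:
      - langfuse-clickhouse:/var/lib/clickhouse

volumes:
  langfuse-clickhouse:
```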
Troubleshooting
Connection Issues
If PentAGI cannot connect to Langfuse, verify that the Langfuse web container is running, that it is reachable from the PentAGI network, and that the configured host and port match LANGFUSE_LISTEN_PORT.
Performance Issues
If Langfuse UI is slow:
- Check that ClickHouse is running properly
- Verify database migrations completed
- Increase resource limits in docker-compose-langfuse.yml
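The first two checks above can be run from the host, for example (service names `clickhouse` and `langfuse-web` are assumptions; match them to your compose file):

```shell
# Confirm ClickHouse responds to queries
docker compose -f docker-compose-langfuse.yml exec clickhouse \
  clickhouse-client --query "SELECT 1"

# Look for completed database migrations in the web container logs
docker compose -f docker-compose-langfuse.yml logs langfuse-web | grep -i migration
```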
Data Not Appearing
If traces are not showing up:
- Verify PentAGI configuration
- Check that the worker is processing events
- Ensure API keys match between PentAGI and Langfuse
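These checks can be approached as follows (the `langfuse-worker` service name and the health endpoint path are assumptions; adjust to your deployment and Langfuse version):

```shell
# Watch the worker to confirm it is consuming and processing events
docker compose -f docker-compose-langfuse.yml logs -f langfuse-worker

# Confirm the Langfuse API is up before comparing API keys
curl http://localhost:4000/api/public/health
```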
Best Practices
Security
- Change all default passwords immediately
- Use strong encryption keys (32 bytes minimum)
- Enable TLS/SSL for production deployments
- Restrict database access to internal networks
- Regularly rotate API keys and secrets
Performance
- Configure Redis for optimal caching
- Tune ClickHouse batch write settings
- Use connection pooling for high throughput
- Monitor disk usage for PostgreSQL and ClickHouse
- Archive old traces to object storage
Monitoring
- Set up alerts for high error rates
- Track token usage against budget limits
- Monitor database performance metrics
- Review cost trends regularly
- Audit access logs periodically
Related Documentation
- Grafana - System metrics and dashboards
- OpenTelemetry - Distributed tracing
- Observability Guide - Complete monitoring setup