Overview
The system tracks real-time call metrics using LiveMetricsStore (src/apps/calls/metrics/live_store.py). Metrics are maintained per tenant and globally.
Available Metrics
All metrics are defined in the MetricStore model (src/models/metrics/store.py:22).
Metric Definitions
| Metric | Type | Description |
|---|---|---|
| active_calls | Counter | Current number of active call sessions |
| accepted_calls | Counter | Total calls accepted (cumulative) |
| rejected_calls_capacity | Counter | Calls rejected due to capacity limits |
| rejected_calls_tenant_not_configured | Counter | Calls rejected due to missing tenant config |
| rejected_calls_instructions_missing | Counter | Calls rejected due to missing instructions |
| instructions_db_errors | Counter | Database errors when fetching instructions |
| fallback_instructions_used | Counter | Times fallback instructions were used |
| started_calls | Counter | Calls that successfully started sessions |
| ended_calls | Counter | Calls that completed or terminated |
| failed_calls | Counter | Calls that ended with errors |
| referred_calls | Counter | Calls transferred to another destination |
| minutes_processed | Float | Total call minutes processed |
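As a rough illustration of the table, the model can be pictured as a dataclass with one field per metric. This is only a sketch: the real MetricStore in src/models/metrics/store.py may be built differently (for example as a Pydantic model), and only the field names and types are taken from this document.

```python
from dataclasses import asdict, dataclass


@dataclass
class MetricStore:
    """Hypothetical shape of the metrics model; field names follow the table."""

    active_calls: int = 0
    accepted_calls: int = 0
    rejected_calls_capacity: int = 0
    rejected_calls_tenant_not_configured: int = 0
    rejected_calls_instructions_missing: int = 0
    instructions_db_errors: int = 0
    fallback_instructions_used: int = 0
    started_calls: int = 0
    ended_calls: int = 0
    failed_calls: int = 0
    referred_calls: int = 0
    minutes_processed: float = 0.0


# Every metric starts at zero and is updated in place.
store = MetricStore()
store.accepted_calls += 1
store.minutes_processed += 2.5
print(asdict(store))
```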
Accessing Metrics
Global Metrics
Global metrics are system-wide totals across all tenants.
Per-Tenant Metrics
Per-tenant metrics are scoped to a specific tenant.
Metrics Implementation
LiveMetricsStore Architecture
The store maintains two data structures (live_store.py:14).
Call Gates
Call gates prevent double-counting metrics for the same call (live_store.py:8).
Recording Metrics
Metrics are recorded at key points in the call lifecycle:
1. Call Accepted
Location: openai_webhook.py:553
- Increments accepted_calls (tenant and global)
- Sets call_gates.accepted = True
2. Call Started
Location: Call manager when session begins
- Increments started_calls (tenant and global)
- Increments active_calls (tenant and global)
- Sets call_gates.started = True
3. Call Rejected - Capacity
Location: openai_webhook.py:309
- Increments rejected_calls_capacity (tenant and global)
- Sets call_gates.ended = True
4. Call Rejected - Tenant Not Configured
Location: openai_webhook.py:227, 348, 465
- Increments rejected_calls_tenant_not_configured (tenant and global)
5. Call Rejected - Instructions Missing
Location: openai_webhook.py:370
- Increments rejected_calls_instructions_missing (tenant and global)
6. Instructions DB Error
Location: openai_webhook.py:392, 411, 438
- Increments instructions_db_errors (tenant and global)
- Does not set the ended gate (call may proceed with fallback)
7. Fallback Instructions Used
Location: openai_webhook.py:456
- Increments fallback_instructions_used (tenant and global)
8. Call Ended
Location: openai_webhook.py:649, 668
- Increments ended_calls (tenant and global)
- Decrements active_calls if call was started
- Increments failed_calls if end_reason == ERROR
- Increments referred_calls if end_reason == REFERRED
- Sets call_gates.ended = True
9. Minutes Processed
Location: Call manager after call completes
- Adds call minutes to minutes_processed (tenant and global)
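The accept, start, and end steps can be sketched as follows to show how call gates keep each event from being counted twice. This is a simplified illustration, not the actual LiveMetricsStore: the class and method names are hypothetical, locking is left out for brevity, and EndReason.COMPLETED is an assumed member (only ERROR and REFERRED are named in this document).

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class EndReason(Enum):
    COMPLETED = "completed"  # assumed member, not named in this document
    ERROR = "error"
    REFERRED = "referred"


@dataclass
class CallGates:
    accepted: bool = False
    started: bool = False
    ended: bool = False


class LifecycleSketch:
    """Illustrative only: counts each lifecycle event at most once per call."""

    def __init__(self):
        self.global_metrics = defaultdict(int)
        self.tenant_metrics = defaultdict(lambda: defaultdict(int))
        self.gates = defaultdict(CallGates)

    def _add(self, tenant_id, name, amount=1):
        # Every update is applied to both the tenant view and the global view.
        self.tenant_metrics[tenant_id][name] += amount
        self.global_metrics[name] += amount

    def call_accepted(self, tenant_id, call_id):
        gate = self.gates[call_id]
        if gate.accepted:
            return  # already counted for this call
        gate.accepted = True
        self._add(tenant_id, "accepted_calls")

    def call_started(self, tenant_id, call_id):
        gate = self.gates[call_id]
        if gate.started or gate.ended:
            return
        gate.started = True
        self._add(tenant_id, "started_calls")
        self._add(tenant_id, "active_calls")

    def call_ended(self, tenant_id, call_id, reason):
        gate = self.gates[call_id]
        if gate.ended:
            return
        gate.ended = True
        self._add(tenant_id, "ended_calls")
        if gate.started:
            # Only calls that reached the started state were counted as active.
            self._add(tenant_id, "active_calls", -1)
        if reason is EndReason.ERROR:
            self._add(tenant_id, "failed_calls")
        elif reason is EndReason.REFERRED:
            self._add(tenant_id, "referred_calls")


metrics = LifecycleSketch()
metrics.call_accepted("tenant-a", "call-1")
metrics.call_accepted("tenant-a", "call-1")  # duplicate event, ignored by the gate
metrics.call_started("tenant-a", "call-1")
metrics.call_ended("tenant-a", "call-1", EndReason.ERROR)
print(dict(metrics.global_metrics))
```

Keying the gates by call ID means a repeated event for the same call leaves the counters unchanged.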
End Reasons
End reasons are defined in src/models/metrics/store.py:15. The values referenced in this document are ERROR and REFERRED.
Snapshot API
Get a thread-safe metric snapshot (live_store.py:210).
Call Gates Pruning
Old call gates are pruned to prevent memory leaks (live_store.py:220).
Monitoring Best Practices
1. Track Active Calls
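Monitor active_calls for capacity planning.
2. Monitor Rejection Rates
High rejection rates (rejected_calls_capacity, rejected_calls_tenant_not_configured, rejected_calls_instructions_missing) indicate capacity or configuration issues.
3. Track Failure Rate
Monitor call quality by comparing failed_calls to ended_calls.
4. Monitor DB Health
Track database errors via instructions_db_errors.
5. Track Average Call Duration
One way to estimate average call duration is minutes_processed divided by ended_calls.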
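The ratios in this section can be derived from a metrics snapshot. The sketch below assumes the snapshot is a plain dict keyed by metric name, and that the rejection rate is measured against accepted plus rejected calls; both are assumptions, and the function names are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def derived_metrics(snapshot):
    rejected = (
        snapshot.get("rejected_calls_capacity", 0)
        + snapshot.get("rejected_calls_tenant_not_configured", 0)
        + snapshot.get("rejected_calls_instructions_missing", 0)
    )
    # Assumption: every incoming call is either accepted or rejected.
    offered = rejected + snapshot.get("accepted_calls", 0)
    ended = snapshot.get("ended_calls", 0)
    return {
        "rejection_rate": safe_ratio(rejected, offered),
        "failure_rate": safe_ratio(snapshot.get("failed_calls", 0), ended),
        "avg_call_minutes": safe_ratio(snapshot.get("minutes_processed", 0.0), ended),
    }


example = {
    "accepted_calls": 90,
    "rejected_calls_capacity": 6,
    "rejected_calls_tenant_not_configured": 3,
    "rejected_calls_instructions_missing": 1,
    "ended_calls": 80,
    "failed_calls": 2,
    "minutes_processed": 200.0,
}
rates = derived_metrics(example)
print(rates)
```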
Metric Export
For long-term storage and analysis, export metrics periodically.
Alerting Recommendations
| Metric | Threshold | Alert |
|---|---|---|
| active_calls / max_concurrent_calls | > 80% | High capacity utilization |
| rejected_calls_capacity | Increasing | Capacity limits too low |
| failed_calls / ended_calls | > 1% | High failure rate |
| instructions_db_errors | > 0 | Database connectivity issues |
| fallback_instructions_used | > 0 | Serving degraded experience |
| rejected_calls_tenant_not_configured | > 0 | Tenant provisioning issues |
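The thresholds above could be checked with a small function like the one below. It is a sketch under assumptions: snapshots are plain dicts, "Increasing" is interpreted as "greater than in the previous snapshot", and max_concurrent_calls is passed in by the caller. The function name is hypothetical.

```python
def evaluate_alerts(current, previous, max_concurrent_calls):
    """Return the alert messages from the table that apply to the current snapshot."""
    alerts = []
    active = current.get("active_calls", 0)
    if max_concurrent_calls and active / max_concurrent_calls > 0.80:
        alerts.append("High capacity utilization")
    # "Increasing" is read here as growth since the previous snapshot.
    if current.get("rejected_calls_capacity", 0) > previous.get("rejected_calls_capacity", 0):
        alerts.append("Capacity limits too low")
    ended = current.get("ended_calls", 0)
    if ended and current.get("failed_calls", 0) / ended > 0.01:
        alerts.append("High failure rate")
    if current.get("instructions_db_errors", 0) > 0:
        alerts.append("Database connectivity issues")
    if current.get("fallback_instructions_used", 0) > 0:
        alerts.append("Serving degraded experience")
    if current.get("rejected_calls_tenant_not_configured", 0) > 0:
        alerts.append("Tenant provisioning issues")
    return alerts


previous_snapshot = {"rejected_calls_capacity": 1}
current_snapshot = {
    "active_calls": 9,
    "rejected_calls_capacity": 2,
    "ended_calls": 100,
    "failed_calls": 1,
    "fallback_instructions_used": 1,
}
fired = evaluate_alerts(current_snapshot, previous_snapshot, max_concurrent_calls=10)
print(fired)
```

In this example the failure rate is exactly 1%, so the "> 1%" rule does not fire.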
Thread Safety
All metric operations are protected by an asyncio.Lock (live_store.py:19).
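A minimal sketch of this pattern, assuming a simplified counter store rather than the real LiveMetricsStore (the class and method names are hypothetical): every read and write happens inside `async with self._lock`, and the snapshot returns a copy so callers never hold a reference to the live dict.

```python
import asyncio
from collections import defaultdict


class LockedCounters:
    """Illustrative store: all access goes through a single asyncio.Lock."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._counts = defaultdict(int)

    async def increment(self, name, amount=1):
        async with self._lock:
            self._counts[name] += amount

    async def snapshot(self):
        async with self._lock:
            # Return a copy so later updates do not change the caller's view.
            return dict(self._counts)


async def main():
    counters = LockedCounters()
    await asyncio.gather(*(counters.increment("accepted_calls") for _ in range(100)))
    return await counters.snapshot()


result = asyncio.run(main())
print(result)
```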
Performance Considerations
In-Memory Storage
Metrics are stored in-memory for fast access. For production:
- Periodically export to persistent storage
- Prune call gates to prevent memory growth
- Monitor memory usage if tracking many tenants
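The export and pruning steps might look like the sketch below. It makes several assumptions that are not confirmed by this document: snapshots are plain dicts, exports are written as JSON lines to any text stream, and call gates are tracked as a mapping from call ID to a last-updated timestamp. All function names are hypothetical.

```python
import io
import json
import time


def export_snapshot(snapshot, stream, now=None):
    """Append one JSON line containing the snapshot and an export timestamp."""
    record = {"exported_at": time.time() if now is None else now, "metrics": snapshot}
    stream.write(json.dumps(record) + "\n")


def prune_gates(gates, max_age_seconds, now=None):
    """Remove gates older than max_age_seconds; return how many were removed."""
    now = time.time() if now is None else now
    stale = [call_id for call_id, updated_at in gates.items() if now - updated_at > max_age_seconds]
    for call_id in stale:
        del gates[call_id]
    return len(stale)


buffer = io.StringIO()  # stands in for a file or other persistent sink
export_snapshot({"accepted_calls": 3}, buffer, now=1000.0)

gates = {"call-1": 100.0, "call-2": 990.0}
removed = prune_gates(gates, max_age_seconds=600, now=1000.0)
print(removed, gates)
```

In a running service both functions would typically be driven by a periodic background task.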
Lock Contention
All metric updates acquire a lock. For high-throughput scenarios:
- Keep metric operations fast (simple increments)
- Avoid blocking I/O inside lock context
- Consider batching metric updates
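Batching can be sketched as accumulating increments locally and applying them under the lock in one step. The class below is a hypothetical stand-in for the real store, and the batching method is an assumption rather than an existing API.

```python
import asyncio
from collections import Counter


class BatchedCounters:
    """Illustrative store that accepts many increments per lock acquisition."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._totals = Counter()

    async def apply_batch(self, pending):
        async with self._lock:
            # Counter.update adds the pending counts to the existing totals.
            self._totals.update(pending)

    async def snapshot(self):
        async with self._lock:
            return dict(self._totals)


async def main():
    store = BatchedCounters()
    pending = Counter()
    # Accumulate locally without touching the lock.
    for _ in range(3):
        pending["accepted_calls"] += 1
    pending["started_calls"] += 2
    # One lock acquisition covers all five increments.
    await store.apply_batch(pending)
    return await store.snapshot()


batched = asyncio.run(main())
print(batched)
```

The trade-off is that metrics lag slightly behind real time until the batch is applied.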