Overview
When tracing is enabled for a query, the cluster records:
- Which coordinator handled the query
- Query execution parameters
- Events that occurred during execution
- Which nodes were contacted
- Timing information
- Thread/shard information
This information is stored in the system_traces.sessions and system_traces.events tables.
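As a rough illustration of what the system_traces.sessions and system_traces.events tables hold, the sketch below models one row of each as a Python dataclass and builds the CQL needed to read a traced session back. The column names follow the commonly documented system_traces schema, and only a subset is shown. The names TraceSession, TraceEvent, and trace_queries are hypothetical and are not part of any driver API.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, Optional, Tuple
from uuid import UUID


@dataclass
class TraceSession:
    """One row of system_traces.sessions (subset of columns)."""
    session_id: UUID
    coordinator: str                    # node that handled the query
    request: str                        # e.g. "Execute CQL3 query"
    started_at: datetime
    duration_us: Optional[int] = None   # total duration, in microseconds
    parameters: Dict[str, str] = field(default_factory=dict)


@dataclass
class TraceEvent:
    """One row of system_traces.events (subset of columns)."""
    session_id: UUID
    activity: str                       # human-readable event description
    source: str                         # node that emitted the event
    source_elapsed_us: int              # microseconds since the node started work
    thread: str                         # thread/shard that emitted the event


def trace_queries(session_id: UUID) -> Tuple[str, str]:
    """Build the two SELECT statements that read one trace back."""
    sessions = f"SELECT * FROM system_traces.sessions WHERE session_id = {session_id}"
    events = f"SELECT * FROM system_traces.events WHERE session_id = {session_id}"
    return sessions, events
```

Application code would normally bind the session ID as a query parameter rather than format it into the string. The formatting here only keeps the sketch self-contained.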
Enabling Tracing
Enable tracing on a per-statement basis:
For Prepared Statements
For Batch Statements
Retrieving Tracing Information
Get the tracing ID from query results:
TracingInfo Structure
The driver provides structured tracing information:
Examining Tracing Data
Basic Information
Query Parameters
Events
Examine events that occurred during execution:
TracingEvent Structure
Example Event Activities
- “Execute CQL3 query”
- “Parsing a statement [shard 1]”
- “Sending a mutation to /127.0.0.1 [shard 1]”
- “Request complete”
- “Computing ranges to query”
- “Submitting range requests on N ranges”
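Several of the activities listed above carry a trailing "[shard N]" marker. A minimal sketch of separating that marker from the activity text follows. It assumes the marker always appears at the end of the string, as in these examples, and split_activity is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Matches a trailing "[shard N]" marker, including the whitespace before it.
_SHARD_SUFFIX = re.compile(r"\s*\[shard (\d+)\]\s*$")


def split_activity(activity: str) -> Tuple[str, Optional[int]]:
    """Return (activity text without the marker, shard number or None)."""
    match = _SHARD_SUFFIX.search(activity)
    if match is None:
        return activity, None
    return activity[:match.start()], int(match.group(1))
```

Events without a marker, such as "Request complete", come back unchanged with a shard of None.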
Analyzing Node Involvement
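One way to see which nodes took part in a query is to group the trace events by the node that emitted them. The sketch below assumes each event has already been reduced to a (source, activity) pair. The function name events_by_node is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def events_by_node(events: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (source, activity) pairs by source node, keeping event order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for source, activity in events:
        grouped[source].append(activity)
    return dict(grouped)
```

The keys of the result are the nodes involved, and the length of each list shows how much of the work each node reported.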
Complete Example
Timing Considerations
Tracing data is written asynchronously:
Retention
Tracing data has a default TTL:
- ScyllaDB: 24 hours
- Cassandra: 24 hours
Retention is governed by the system_traces keyspace settings.
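When tracing IDs are saved for later inspection, it helps to know whether the trace is still likely to exist. The sketch below assumes the 24-hour default and counts it from the session start time, which is an approximation of when the rows were written. All names here are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Default retention stated above; adjust if the cluster is configured differently.
DEFAULT_TRACE_TTL = timedelta(hours=24)


def trace_expires_at(started_at: datetime,
                     ttl: timedelta = DEFAULT_TRACE_TTL) -> datetime:
    """Approximate time after which the trace rows are gone."""
    return started_at + ttl


def is_trace_available(started_at: datetime,
                       now: Optional[datetime] = None,
                       ttl: timedelta = DEFAULT_TRACE_TTL) -> bool:
    """True if the trace should still be readable at `now` (default: current UTC time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now < trace_expires_at(started_at, ttl)
```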
Use Cases
Debugging Slow Queries
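A common way to read a trace for a slow query is to look for the largest gaps between consecutive events. The sketch below assumes the events come from a single node and are given as (activity, source_elapsed) pairs in microseconds, since elapsed times from different nodes are not directly comparable. The function name slowest_steps is hypothetical.

```python
from typing import List, Tuple


def slowest_steps(events: List[Tuple[str, int]], top: int = 3) -> List[Tuple[str, int]]:
    """Return the `top` activities with the largest time since the previous event.

    Each result is (activity, microseconds spent reaching that event).
    """
    ordered = sorted(events, key=lambda event: event[1])
    deltas = []
    previous = 0
    for activity, elapsed in ordered:
        deltas.append((activity, elapsed - previous))
        previous = elapsed
    return sorted(deltas, key=lambda delta: delta[1], reverse=True)[:top]
```

A large delta points at the step that finished late, so the time was spent between the preceding event and that one.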
Verifying Query Routing
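With token-aware routing, the coordinator recorded in the trace is expected to be one of the replicas for the queried partition. The check below is a minimal sketch of that comparison. The replica list has to come from elsewhere (cluster metadata, for example), and is_token_aware is a hypothetical name.

```python
from typing import Iterable


def is_token_aware(coordinator: str, replicas: Iterable[str]) -> bool:
    """True if the traced coordinator is itself a replica for the partition."""
    return coordinator in set(replicas)
```

A False result means the coordinator had to forward the request to another node, which usually shows up in the trace as extra inter-node events.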
Consistency Level Verification
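The query parameters recorded with a trace can be compared against the consistency level the application intended to use. The sketch below assumes the parameters are available as a string map and that the level is stored under the key "consistency_level". Both the key name and the function name are assumptions to verify against an actual trace.

```python
from typing import Mapping


def consistency_matches(parameters: Mapping[str, str], expected: str) -> bool:
    """Compare the traced consistency level with the expected one, ignoring case."""
    actual = parameters.get("consistency_level")  # assumed key name
    return actual is not None and actual.upper() == expected.upper()
```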
Performance Impact
Tracing has overhead:
- Cluster writes tracing data to system_traces tables
- Additional latency (typically 1-5ms)
- Storage overhead
- CPU overhead for generating traces
Best Practices
- Enable tracing only when debugging specific issues
- Wait 100-200ms before retrieving tracing data
- Disable tracing in production unless investigating issues
- Use tracing to verify:
  - Token-aware routing is working
  - Expected consistency levels
  - Node involvement in queries
  - Query performance bottlenecks
- Consider using query history for production monitoring
- Store tracing IDs for queries you want to investigate later
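Because tracing data is written asynchronously, a fixed wait before reading it can be replaced by a short retry loop. The sketch below polls a caller-supplied fetch function until it returns something other than None. The function name, the attempt count, and the 100 ms delay are illustrative assumptions, and fetch stands in for whatever call reads the trace in the application.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for_trace(fetch: Callable[[], Optional[T]],
                   attempts: int = 5,
                   delay_s: float = 0.1,
                   sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call `fetch` up to `attempts` times, sleeping `delay_s` between tries.

    Returns the first non-None result, or None if the trace never appeared.
    """
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < attempts - 1:  # no need to sleep after the final try
            sleep(delay_s)
    return None
```

The sleep parameter exists so the loop can be exercised in tests without real delays.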
Limitations
- Tracing adds latency and overhead
- Tracing data is eventually consistent
- May not capture all details for very fast queries
- Retention period is limited (default 24h)
- Cannot trace internal driver operations
Tracing vs. History
| Feature | Tracing | History |
|---|---|---|
| Granularity | Cluster events | Driver events |
| Location | Server-side | Client-side |
| Overhead | Moderate | Minimal |
| Coverage | Query execution | Retries, speculative execution |
| Storage | system_traces | In-memory |
| Retention | 24h default | Until cleared |
- Tracing: What happened in the cluster
- History: What the driver did
Next Steps
- Metrics - Aggregate performance statistics
- Speculative Execution - Reduce tail latencies
