
When to Use SQL Query Stress

Ideal Use Cases

SQL Query Stress is purpose-built for testing individual queries and stored procedures under load, not full application testing.
Use SQL Query Stress for:
  • Query optimization validation: Test if your index/query changes actually improve performance
  • Concurrency testing: Understand how queries behave with multiple simultaneous executions
  • Resource impact analysis: Measure CPU, IO, and memory effects of queries under load
  • Regression testing: Verify that database changes don’t degrade performance
  • Capacity planning: Determine maximum throughput for specific operations
  • Stored procedure stress testing: Test procedures with realistic parameter distributions
Don’t use SQL Query Stress for:
  • Full application load testing (use JMeter, LoadRunner, etc.)
  • Testing web services or APIs directly
  • UI/UX performance testing
  • Multi-query transaction workflows (each test runs one query repeatedly)

Complementary Tools

SQL Query Stress works best alongside:
  • SQL Server Profiler/Extended Events: Capture broader database activity
  • Query Store: Track query performance over time
  • Database Engine Tuning Advisor: Get index recommendations
  • sp_BlitzCache: Analyze execution plan patterns
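As one example of pairing these tools, Query Store can confirm whether an optimization validated in SQL Query Stress holds up over time. A minimal sketch, assuming Query Store is enabled on the target database; the `%YourQueryFragment%` filter is illustrative:

```sql
-- Recent runtime stats for a tracked query
-- (requires ALTER DATABASE ... SET QUERY_STORE = ON)
SELECT TOP 20
    qt.query_sql_text,
    rs.avg_duration / 1000.0    AS avg_duration_ms,   -- stored in microseconds
    rs.avg_cpu_time / 1000.0    AS avg_cpu_ms,
    rs.avg_logical_io_reads     AS avg_logical_reads,
    rs.count_executions,
    rs.last_execution_time
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON rs.plan_id = p.plan_id
WHERE qt.query_sql_text LIKE '%YourQueryFragment%'  -- illustrative filter
ORDER BY rs.last_execution_time DESC;
```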

Setting Realistic Thread and Iteration Counts

Understanding Thread Behavior

Each thread in SQL Query Stress (LoadEngine.cs:120-158):
  1. Opens a dedicated SQL connection from the pool
  2. Executes the query for N iterations sequentially
  3. Optionally delays between iterations
  4. Runs at BelowNormal thread priority (LoadEngine.cs:152)
Threads ≠ concurrent executions at all times: because each thread runs its iterations sequentially, actual concurrency fluctuates during a test. Faster queries produce more overlap between threads.

Thread Count Guidelines

Start with your expected concurrent users:
  • OLTP applications: 20-100 threads per application server
  • Reporting systems: 5-20 threads (reports typically run sequentially)
  • Web applications: 10-50 threads (depends on connection pooling)
  • Batch processes: 1-10 threads (often single-threaded)
Progressive load testing approach:
  1. Baseline: 1 thread, 100 iterations (establishes single-user performance)
  2. Light load: 10 threads, 100 iterations each (1,000 total executions)
  3. Normal load: Match expected production concurrency
  4. Stress test: 2-3x normal load (find breaking points)
  5. Maximum throughput: Increase threads until performance plateaus
The LoadEngine automatically sets MinPoolSize = MaxPoolSize = threads (LoadEngine.cs:47-48) to pre-allocate connections. This means thread count directly impacts initial connection overhead.

Iteration Count Guidelines

Total executions = threads × iterations.
Choose iterations based on test goals:
Quick validation (10-50 iterations):
  • Verify query executes without errors
  • Get rough performance estimates
  • Fast feedback during development
Performance baselines (100-1000 iterations):
  • Establish reliable average execution times
  • Smooth out variance and outliers
  • Recommended minimum for comparative testing
Statistical significance (1000+ iterations):
  • Detailed percentile analysis (p50, p95, p99)
  • Identify rare edge cases
  • Production-like performance characterization
Endurance testing (10,000+ iterations):
  • Detect memory leaks
  • Identify performance degradation over time
  • Test plan cache stability
More iterations mean more statistical confidence but a longer test run. Balance thoroughness against total runtime.

Test Environment Setup

Environment Considerations

Ideal test environment:
  • Matches production hardware specifications
  • Isolated from other workloads
  • Contains production-like data volumes
  • Same SQL Server version and edition
  • Identical indexes and statistics
Common environment issues:
Problem: Your test server has 4 CPUs and 16GB RAM; production has 32 CPUs and 256GB RAM.
Impact:
  • CPU-bound queries will appear much slower
  • Memory pressure may force excessive disk IO
  • Parallelism behavior differs dramatically
Solution:
  • Scale thread counts proportionally (reduce by CPU ratio)
  • Focus on relative improvements, not absolute numbers
  • Test on production-equivalent hardware for capacity planning
  • Use Resource Governor to limit CPU/memory if testing on production
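If you must test on a production server, Resource Governor can cap what the stress run consumes. A minimal sketch; the pool, group, and login names are illustrative, and the classifier function must be created in master:

```sql
-- Cap the stress-test workload at 25% CPU and 25% memory
CREATE RESOURCE POOL StressTestPool
    WITH (MAX_CPU_PERCENT = 25, MAX_MEMORY_PERCENT = 25);

CREATE WORKLOAD GROUP StressTestGroup
    USING StressTestPool;
GO

-- Classifier: route the stress-test login into the limited group
-- (must live in master and be schema-bound)
CREATE FUNCTION dbo.fn_StressClassifier() RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
    RETURN (CASE WHEN SUSER_SNAME() = N'StressTestLogin'  -- illustrative login
                 THEN N'StressTestGroup'
                 ELSE N'default' END);
END;
GO

ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fn_StressClassifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;
```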
Problem: Testing with 1,000 rows when production has 10 million rows.
Impact:
  • Entire table fits in memory (won’t in production)
  • Query plans differ dramatically (index scans vs seeks)
  • Locking behavior is unrealistic
  • Results are completely misleading
Solution:
  • Load representative data volumes
  • Include realistic data distributions (not just sequential IDs)
  • Update statistics: UPDATE STATISTICS TableName WITH FULLSCAN
  • Consider using production backup restored to test environment
Problem: You restored a production backup, but statistics are outdated.
Impact:
  • Query optimizer makes poor decisions
  • Execution plans don’t match production
  • Performance results are invalid
Solution:
-- Update all statistics with full scan
EXEC sp_MSforeachtable 'UPDATE STATISTICS ? WITH FULLSCAN'

-- Verify statistics are current
SELECT 
    OBJECT_NAME(object_id) AS TableName,
    name AS StatisticName,
    STATS_DATE(object_id, stats_id) AS LastUpdated
FROM sys.stats
WHERE OBJECTPROPERTY(object_id, 'IsUserTable') = 1
ORDER BY LastUpdated;

Network Considerations

Run SQL Query Stress from the same network segment as your application servers, not your development machine. Network latency significantly impacts results.
Network impact on timing:
  • LoadEngine measures client-perceived time (LoadEngine.cs:413-435)
  • Includes query execution + network transfer
  • ForceDataRetrieval = true amplifies network effects for large result sets
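To separate network overhead from server work, you can run the same query once in SSMS with SET STATISTICS TIME ON and compare the server-reported elapsed time against the client-perceived Time that SQL Query Stress records. A sketch; the query and table are illustrative:

```sql
SET STATISTICS TIME ON;

SELECT OrderID, OrderDate, TotalDue   -- illustrative query
FROM dbo.Orders
WHERE CustomerID = 42;

SET STATISTICS TIME OFF;
-- Messages tab reports: "SQL Server Execution Times: CPU time = X ms, elapsed time = Y ms."
-- Client-perceived time minus server elapsed time ≈ network + client overhead.
```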

Baseline Before Optimization

Why Baselines Matter

Without a baseline, you can’t prove optimization worked. Baseline workflow:
  1. Document current state:
    • Current query or procedure code
    • Existing indexes
    • Current statistics collection state
  2. Run baseline test:
    • Enable CollectIoStats = true (QueryStressSettings.cs:122)
    • Enable CollectTimeStats = true (QueryStressSettings.cs:123)
    • Record: avg/min/max execution time, logical reads, CPU time
    • Save the test configuration
  3. Make ONE change:
    • Add an index, OR
    • Rewrite the query, OR
    • Adjust a parameter
    • Never multiple changes simultaneously
  4. Re-run identical test:
    • Same thread count
    • Same iteration count
    • Same connection settings
    • Same test data
  5. Compare results:
    • Did execution time improve?
    • Did logical reads decrease?
    • Did CPU time decrease?
Change one variable at a time. If you add an index AND rewrite the query, you won’t know which change helped (or if one change hurt).
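The baseline numbers from step 2 can also be cross-checked server-side. A sketch reading aggregate stats from the plan-cache DMVs; the `%YourQueryFragment%` filter is illustrative, and the stats only cover executions since the plan entered the cache:

```sql
-- Aggregate stats for a cached query since it entered the plan cache
SELECT TOP 10
    SUBSTRING(st.text, 1, 80)                           AS query_start,
    qs.execution_count,
    qs.total_elapsed_time / qs.execution_count / 1000   AS avg_elapsed_ms,  -- stored in microseconds
    qs.total_worker_time  / qs.execution_count / 1000   AS avg_cpu_ms,
    qs.total_logical_reads / qs.execution_count         AS avg_logical_reads
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%YourQueryFragment%'   -- illustrative filter
ORDER BY qs.total_worker_time DESC;
```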

Key Metrics to Baseline

Always track these QueryOutput metrics (LoadEngine.cs:535-545):
  1. Time (TimeSpan): Client-perceived execution time
    • Watch: Average, P95, P99, maximum
  2. LogicalReads (int): Pages read from buffer cache
    • Goal: Minimize (fewer IOs = faster queries)
    • Extracted via regex from InfoMessage events (LoadEngine.cs:280)
  3. CpuTime (int): SQL Server CPU milliseconds
    • Distinguishes CPU-bound from IO-bound queries
    • Extracted via regex from InfoMessage events (LoadEngine.cs:284-286)
  4. ElapsedTime (int): Server-side elapsed time
    • Compare to client Time to identify network overhead
  5. Exception count: Should be zero in successful tests
    • Any exceptions indicate problems

Interpreting Results Correctly

Understanding the Output

SQL Query Stress aggregates results from all threads via a BlockingCollection<QueryOutput> (LoadEngine.cs:18) that:
  • Collects output from each iteration of each thread
  • Reports progress to the UI (LoadEngine.cs:203)
  • Tracks active threads (LoadEngine.cs:202)

Common Interpretation Mistakes

Mistake: Relying only on average execution time
Why this is wrong:
  • Averages hide outliers that affect user experience
  • A few very slow executions get hidden by many fast ones
  • P95 and P99 metrics matter more for user satisfaction
Better approach:
  • Look at maximum execution time (worst case)
  • Calculate percentiles if your UI supports it
  • Investigate why some iterations are much slower
Mistake: Ignoring logical reads
Why this matters:
  • Logical reads = pages read from buffer cache
  • Direct indicator of IO efficiency
  • Reducing logical reads almost always improves performance
Example:
  • Query A: 50ms average, 10,000 logical reads
  • Query B: 55ms average, 1,000 logical reads
  • Query B is better: Lower IO means better scalability under load
When IO matters most:
  • Data not in cache (cold cache scenario)
  • High concurrency (cache contention)
  • Large databases (cache can’t hold everything)
Mistake: Counting the first (warm-up) iteration
Issue: The first iteration may be slower due to:
  • Plan compilation (LoadEngine caches SqlCommand objects, but plans may recompile)
  • Cold buffer cache
  • Statistics loading
Solution:
  • Run a warm-up test first
  • Use high iteration counts (100+) so first execution is insignificant
  • Or manually exclude first iteration from analysis
Mistake: Testing only under pristine conditions
Real production has:
  • Stale statistics
  • Fragmented indexes
  • Other concurrent workload
  • Plan cache pressure
  • TempDB contention
Better testing:
  • Test with realistic concurrent load (use multiple test scenarios)
  • Occasionally clear plan cache between tests
  • Test with statistics out of date
  • Run alongside other database activity

CPU-Bound vs IO-Bound Analysis

Compare CpuTime vs ElapsedTime from statistics:
CPU-bound query:
  • CPU time ≈ elapsed time (the server spends most of the elapsed time computing)
  • Solution: Optimize query logic, reduce computations, improve indexes to reduce rows processed
IO-bound query:
  • CPU time << elapsed time (the server spends most of the elapsed time waiting on IO)
  • Solution: Reduce logical reads, add covering indexes, increase memory
Wait-bound query:
  • Both CPU and elapsed time are low, but client Time is high
  • Solution: Investigate network issues and locking/blocking; check for waits in Extended Events
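For the wait-bound case, the wait-stats DMV shows where time is going. A sketch; the exclusion list of benign background waits is abbreviated and illustrative:

```sql
-- Top waits accumulated since the last service restart
SELECT TOP 10
    wait_type,
    wait_time_ms,
    wait_time_ms - signal_wait_time_ms AS resource_wait_ms,
    waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
                        N'BROKER_TASK_STOP', N'XE_TIMER_EVENT')  -- abbreviated benign-wait list
ORDER BY wait_time_ms DESC;
```

Snapshot this before and after a stress run and diff the two results to isolate the waits the test itself generated.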

Comparing Test Runs

Test Configuration Consistency

For valid comparisons, keep identical:
  • Connection string (same server, database)
  • Thread count (affects concurrency and connection pool)
  • Iteration count (affects statistical confidence)
  • CommandTimeout (QueryStressSettings.cs:30)
  • ForceDataRetrieval (QueryStressSettings.cs:48)
  • CollectIoStats / CollectTimeStats (QueryStressSettings.cs:18, 24)
  • DelayBetweenQueries (QueryStressSettings.cs:84)
  • Parameter query and mappings (if using parameters)
Save your test configurations! SQL Query Stress can save/load settings (QueryStressSettings.cs is serializable) to ensure consistency.

Making Valid Comparisons

Good comparison:
Test A: Original query, 50 threads, 200 iterations
        Avg: 125ms, Logical reads: 5000
        
Test B: Optimized query, 50 threads, 200 iterations (same settings)
        Avg: 45ms, Logical reads: 150
        
Conclusion: 64% faster, 97% fewer IOs - optimization successful
Invalid comparison:
Test A: 10 threads, 100 iterations
        Avg: 125ms
        
Test B: 50 threads, 500 iterations (different settings!)
        Avg: 45ms
        
Conclusion: Can't determine if optimization worked or if lower average 
            is due to caching from more iterations

Statistical Significance

Use high iteration counts (500+) when comparing close results. A 5ms difference in average time is meaningless with only 10 iterations but significant with 1000.
When results are close:
  • Run multiple test cycles (3-5 complete runs of each configuration)
  • Calculate standard deviation
  • Look for consistent patterns across runs
  • Small differences (<5%) may be noise

Advanced Testing Strategies

Progressive Load Testing

1 thread  → Baseline single-user performance
10 threads → Light load, check for obvious issues
25 threads → Moderate load, realistic workload
50 threads → Heavy load, stress testing
100 threads → Maximum load, find breaking point
Plot threads vs throughput to find optimal concurrency.

Parameter Distribution Testing

When using the parameter query feature (LoadEngine.cs:210-272), design parameter queries to reflect production:
  • Include same distribution of values (not just SELECT TOP 100)
  • Include edge cases (NULLs, empty strings, large values)
  • Randomize order to avoid cache effects
Example:
-- Bad: Sequential IDs (unrealistic cache behavior)
SELECT TOP 1000 CustomerID FROM Customers ORDER BY CustomerID

-- Good: Random sample matching production distribution
-- (TABLESAMPLE is page-based and approximate; ORDER BY NEWID() shuffles the sampled rows)
SELECT TOP 1000 CustomerID
FROM Customers
TABLESAMPLE (1000 ROWS)
ORDER BY NEWID()
The ParamServer rotates through rows using Interlocked.Increment (LoadEngine.cs:223), cycling when reaching the end.

Endurance Testing

Test for degradation over time.
Configuration:
  • Low thread count (5-10)
  • Very high iteration count (10,000-100,000)
  • Monitor for:
    • Increasing execution times
    • Growing memory usage
    • Plan cache bloat
    • TempDB growth
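These symptoms can be watched from DMVs during the run. A sketch polling plan-cache size and TempDB free space; run it periodically and compare snapshots:

```sql
-- Plan cache size and entry count (steady growth suggests cache bloat)
SELECT COUNT(*)                       AS cached_plans,
       SUM(size_in_bytes) / 1048576   AS cache_mb
FROM sys.dm_exec_cached_plans;

-- TempDB free space (steady shrinkage suggests growth driven by the workload)
SELECT SUM(unallocated_extent_page_count) * 8 / 1024 AS tempdb_free_mb
FROM tempdb.sys.dm_db_file_space_usage;
```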

Cancellation Testing

Test that your queries handle cancellation gracefully by stopping tests mid-run with KillQueriesOnCancel = true (the default).
The LoadEngine gives threads 5 seconds to exit gracefully (LoadEngine.cs:190) before forcing interruption. Queries should respond to cancellation within this window.

Common Pitfalls to Avoid

  1. Testing SELECT * instead of actual column list: Column list affects IO and network transfer
  2. Forgetting to update statistics after data changes: Results will be misleading
  3. Testing with empty TempDB: Production TempDB is usually busy with other operations
  4. Not clearing plan cache between major changes: May test cached plan, not new query
  5. Using COUNT(*) to validate results: Adds overhead that skews timing
  6. Testing stored procedures without parameters: Real usage always includes parameters
  7. Assuming faster = better: Sometimes slower query with fewer reads is more scalable
  8. Not monitoring SQL Server during tests: Use Perfmon/DMVs to see resource utilization
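For pitfall 4, the plan cache and buffer pool can be flushed between major changes. These commands affect every session on the instance; run them only in a dedicated test environment:

```sql
-- Test environments only
CHECKPOINT;              -- flush dirty pages so DROPCLEANBUFFERS fully empties the cache
DBCC DROPCLEANBUFFERS;   -- cold buffer cache: next run pays physical IO
DBCC FREEPROCCACHE;      -- clear cached plans: next run pays compilation
```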
