Skip to main content

Connection Issues

Symptom

Tests fail to start or multiple threads fail immediately with connection timeout errors.

Root Cause

The ConnectionTimeout setting (default: 15 seconds in QueryStressSettings.cs:119) controls how long the client waits to establish a connection. When using many threads, SQL Server may not accept connections fast enough.

Solution

  1. Increase ConnectionTimeout:
    • Recommended: 30-60 seconds for high thread counts (>50 threads)
    • The setting applies to all connections in the test
  2. Verify SQL Server connection limits:
    SELECT @@MAX_CONNECTIONS -- Default is usually 32767
    
  3. Check connection pooling is enabled:
    • Verify EnableConnectionPooling = true (default in QueryStressSettings.cs:121)
    • Pooling is essential for multi-threaded testing
  4. Reduce initial thread count:
    • Start with fewer threads and gradually increase
    • LoadEngine pre-allocates MinPoolSize = threads (LoadEngine.cs:47)

Prevention

The LoadEngine initializes the connection pool before starting threads (LoadEngine.cs:112-117) to avoid allocation delays during testing.

Symptom

“Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool.”

Root Cause

The connection pool is configured with MinPoolSize = MaxPoolSize = threads (LoadEngine.cs:46-48). If connections aren’t being properly closed, the pool can be exhausted.

Solution

  1. Verify proper connection handling:
    • LoadEngine uses using statements (LoadEngine.cs:382) to ensure disposal
    • Connections are explicitly closed after each iteration (LoadEngine.cs:456)
  2. Check for blocking queries:
    • Long-running queries hold connections longer
    • Set appropriate CommandTimeout to prevent indefinite holds (default: 0 = infinite)
  3. Clear connection pools:
    • LoadEngine calls ClearAllPools() on cancellation (LoadEngine.cs:194)
    • Restart the application if pools are corrupted

Symptom

Intermittent severe errors, connection dropped, or attention errors during execution.

Root Cause

Most commonly caused by query timeouts or resource exhaustion on SQL Server.

Solution

  1. Set reasonable CommandTimeout:
    • Default is 0 (infinite), which can cause issues
    • Recommended: 30-120 seconds depending on expected query duration
  2. Enable KillQueriesOnCancel:
    • Default: true (QueryStressSettings.cs:125)
    • Ensures queries are cancelled when you stop the test
    • Uses timer to call SqlCommand.Cancel() (LoadEngine.cs:322-340)
  3. Check SQL Server resources:
    • Memory pressure
    • TempDB contention
    • Lock escalation

Thread and Cancellation Issues

Symptom

Clicking “Cancel” doesn’t stop the test immediately, or threads hang indefinitely.

Root Cause

Threads are running queries that don’t respond to cancellation tokens.

KillQueriesOnCancel Setting

This setting (QueryStressSettings.cs:54, default: true) controls aggressive query cancellation.When enabled (recommended):
  • A timer checks every 2 seconds if cancellation is pending (LoadEngine.cs:324)
  • Calls SqlCommand.Cancel() to request SQL Server terminate the query (LoadEngine.cs:334)
  • Threads are given 5 seconds to exit gracefully (LoadEngine.cs:190)
  • After 5 seconds, threads are forcefully interrupted (LoadEngine.cs:191)
When disabled:
  • Threads wait for queries to complete naturally
  • Can take a long time if queries are slow or blocked

Solution

  1. Ensure KillQueriesOnCancel is enabled (default)
  2. Wait up to 5 seconds for graceful thread termination
  3. If threads still hang: Check for SQL Server blocking
    -- Find blocking queries
    SELECT * FROM sys.dm_exec_requests WHERE blocking_session_id > 0
    
Even with KillQueriesOnCancel enabled, SQL Server may take time to roll back transactions. Long-running uncommitted transactions cannot be instantly cancelled.

Symptom

When using DelayBetweenQueries, tests take a long time to cancel.

Root Cause

The delay uses Task.Delay() which must respect the cancellation token (LoadEngine.cs:491).

How It Works

if (_queryDelay > 0)
    Task.Delay(_queryDelay, ctsToken).Wait();
The delay will be interrupted when cancellation is requested, wrapping the operation in proper exception handling (LoadEngine.cs:493-506).

Solution

This is working as designed. The cancellation token properly interrupts delays, so you should see responsive cancellation even with long delays.If experiencing issues:
  1. Ensure you’re using a recent version of SQL Query Stress
  2. Check that KillQueriesOnCancel = true

Query Execution Errors

Symptom

Your query executes successfully in SQL Server Management Studio but fails during stress testing.

Root Cause

SQL Query Stress applies SSMS-compatible connection settings (documented in README.md:33-43) that may expose issues not visible in isolation.

Common Issues

  1. Connection settings differences:
    • SQL Query Stress sets ARITHABORT ON (matches SSMS)
    • Sets QUOTED_IDENTIFIER ON
    • Sets ANSI_NULLS ON These settings can affect query plans and behavior.
  2. Timeout differences:
    • CommandTimeout default is 0 (infinite) in SQL Query Stress
    • SSMS default is typically 0 as well, but can be configured differently
  3. Connection context:
    • SQL Query Stress sets CurrentLanguage = "English" (LoadEngine.cs:49)
    • Date formats and error messages use English

Solution

  1. Review querysettings.sql in the application directory
  2. Test query with same SET options in SSMS:
    SET QUOTED_IDENTIFIER ON;
    SET ANSI_NULL_DFLT_ON ON;
    SET ANSI_PADDING ON;
    SET ANSI_WARNINGS ON;
    SET ANSI_NULLS ON;
    SET ARITHABORT ON;
    SET CONCAT_NULL_YIELDS_NULL ON;
    
    -- Your query here
    
  3. Check query plan differences using SET STATISTICS PROFILE ON

Symptom

  • Random deadlock errors (Error 1205)
  • Some iterations succeed, others fail
  • Results vary between test runs

Root Cause

Concurrent execution creating lock contention or race conditions.

Solution

  1. Review query isolation level:
    • Default is READ COMMITTED
    • Consider if your test should use snapshot isolation:
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    
  2. Analyze deadlock patterns:
    • Enable trace flag 1222 for deadlock details
    • Check SQL Server error log
    • Identify conflicting resources
  3. Reduce concurrency temporarily:
    • Lower thread count to isolate issue
    • If deadlocks disappear with fewer threads, this confirms lock contention
  4. Add query hints if appropriate:
    • WITH (NOLOCK) for read-only queries (be aware of dirty reads)
    • WITH (READPAST) to skip locked rows
    • WITH (ROWLOCK) to prevent lock escalation
Deadlocks during stress testing may indicate real production issues. Don’t just suppress them—understand and fix the underlying cause.

Statistics and Results Issues

Symptom

The statistics columns remain at 0 or blank during test execution.

Root Cause

Statistics collection is disabled or the InfoMessage event handler isn’t capturing output.

How Statistics Work

When enabled (LoadEngine.cs:139-144):
  1. SET STATISTICS IO ON and/or SET STATISTICS TIME ON is executed
  2. SQL Server sends statistics as InfoMessage events
  3. The InfoMessage handler parses messages using regex (LoadEngine.cs:278-287)
  4. Extracted values populate QueryOutput (LoadEngine.cs:352, 361-362)

Solution

  1. Enable statistics collection:
    • CollectIoStats = true (default in QueryStressSettings.cs:122)
    • CollectTimeStats = true (default in QueryStressSettings.cs:123)
  2. Verify query produces statistics:
    • Some queries (like SELECT @@VERSION) don’t generate IO statistics
    • DDL statements may not produce expected statistics
  3. Check for query errors:
    • If the query fails, statistics aren’t collected
    • Review the Exceptions column in results
  4. Test statistics manually in SSMS:
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    -- Your query
    

Symptom

Reported execution times don’t match expectations or SQL Server metrics.

What Gets Measured

The LoadEngine uses a Stopwatch (LoadEngine.cs:295) that starts immediately before query execution and stops after:
  • ForceDataRetrieval = false: After ExecuteNonQuery() returns (LoadEngine.cs:432)
  • ForceDataRetrieval = true: After all rows are read from all result sets (LoadEngine.cs:418-428)
This measures client-perceived time, including:
  • Query execution on server
  • Network transfer
  • Client-side data processing

Solution

  1. Compare with SQL Server statistics:
    • Enable CollectTimeStats to see server-side elapsed time
    • Client time will always be >= server time due to network overhead
  2. Consider ForceDataRetrieval setting:
    • If true, time includes reading all rows
    • Large result sets will show significantly longer times
  3. Check for network latency:
    • Run test from same machine as SQL Server
    • Use SELECT @@SERVERNAME to verify server
  4. Account for statistics collection overhead:
    • Parsing InfoMessages adds minimal time (<1% typically)
    • Disable to verify if you suspect overhead

Getting Help

If you’re still experiencing issues:
  1. Check GitHub issues: SQLQueryStress Issues
  2. Review source code: LoadEngine.cs contains the complete execution logic
  3. Submit detailed bug report: Include query, settings, error messages, and SQL Server version

Build docs developers (and LLMs) love