
Load Testing

The WispHub API includes a comprehensive load testing suite using Locust to validate concurrency thresholds, ensure caching decorators behave correctly, and measure performance under realistic load.

Why Load Testing?

Conversational bots create unique load patterns:
  • Burst traffic: Multiple users interact simultaneously
  • Repeated queries: Same endpoints called frequently
  • High read ratio: Mostly GET requests for client lookups
  • Low latency requirements: Users expect instant responses
Load testing validates:
  1. Cache effectiveness: Verify LRU cache reduces backend load
  2. Concurrency handling: Ensure async workers handle simultaneous requests
  3. Performance benchmarks: Measure response times under load
  4. Failure modes: Identify breaking points before production

Locust Overview

Locust is a modern, Python-based load testing tool that:
  • Simulates concurrent users
  • Provides real-time web UI
  • Supports distributed testing
  • Uses Python code (not config files)
requirements-dev.txt
locust==2.43.3

Load Test Configuration

The test suite is defined in locustfile.py:
locustfile.py
from locust import HttpUser, task, between

class WispHubAPIUser(HttpUser):
    # Wait between 1 and 3 seconds between tasks
    wait_time = between(1, 3)

    @task(3)
    def get_clients(self):
        """Simulate fetching the list of clients"""
        self.client.get("/api/v1/clients/")

    @task(2)
    def search_clients(self):
        """Simulate a flexible search"""
        self.client.get("/api/v1/clients/search?q=Esperanza")

    @task(1)
    def verify_client_identity(self):
        """Simulate verifying a client's identity"""
        payload = {
            "address": "BELLAVISTA",
            "internet_plan_price": 40000.0
        }
        # Uses real client data (Esperanza Benitez, ID 7)
        self.client.post("/api/v1/clients/7/verify", json=payload)

    @task(1)
    def get_internet_plans(self):
        """Simulate fetching internet plans"""
        self.client.get("/api/v1/internet-plans/")

Understanding the Test Suite

User Behavior Simulation

class WispHubAPIUser(HttpUser):
    wait_time = between(1, 3)
  • HttpUser: Base class for simulating a user
  • wait_time: Random delay between requests (1-3 seconds)
  • Purpose: Mimics realistic user interaction patterns

Task Weights

@task(3)  # 3x weight
def get_clients(self):
    ...

@task(2)  # 2x weight
def search_clients(self):
    ...

@task(1)  # 1x weight
def verify_client_identity(self):
    ...
Weight distribution:
  • get_clients: 3/7 = ~43% of requests
  • search_clients: 2/7 = ~29% of requests
  • verify_client_identity: 1/7 = ~14% of requests
  • get_internet_plans: 1/7 = ~14% of requests
Weights reflect real-world usage patterns: client lookups are more common than identity verification.
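
The percentages above follow directly from the weights: each task's weight divided by the total (7). A quick sanity check:

```python
# Expected request share per Locust task, derived from the @task weights above.
weights = {
    "get_clients": 3,
    "search_clients": 2,
    "verify_client_identity": 1,
    "get_internet_plans": 1,
}
total = sum(weights.values())  # 7
shares = {name: w / total for name, w in weights.items()}

for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Locust picks the next task randomly with these probabilities, so observed request counts converge to the shares over a long enough run.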

Real Data Usage

# Uses real client data (Esperanza Benitez, ID 7)
self.client.post("/api/v1/clients/7/verify", json=payload)
The test uses actual client data from the WispHub system for realistic validation.

Running Load Tests

Local Testing

1. Start the API Server

In one terminal, start the API:
uvicorn app.main:app --host 0.0.0.0 --port 8000
Or with Docker:
docker run -d -p 8000:8000 --env-file .env wisphubapi:latest
2. Run Locust

In another terminal:
locust -f locustfile.py --host=http://localhost:8000
Expected output:
[2024-03-04 10:00:00,000] INFO/locust.main: Starting web interface at http://0.0.0.0:8089
[2024-03-04 10:00:00,001] INFO/locust.main: Starting Locust 2.43.3
3. Access Web UI

Open your browser to:
http://localhost:8089
You’ll see the Locust web interface.
4. Configure Load Test

In the web UI:
  • Number of users: Start with 10-50
  • Spawn rate: 1-5 users per second
  • Host: Pre-filled from --host flag
Click “Start swarming” to begin.

Command-Line Mode (Headless)

For automated testing without the web UI:
locust -f locustfile.py \
  --host=http://localhost:8000 \
  --users 50 \
  --spawn-rate 5 \
  --run-time 5m \
  --headless
Flags:
  • --users 50: Simulate 50 concurrent users
  • --spawn-rate 5: Add 5 users per second until reaching 50
  • --run-time 5m: Run for 5 minutes then stop
  • --headless: No web UI, print stats to console
Example output:
 Type    Name                                  # reqs  # fails |  Avg  Min  Max  Med |  req/s  failures/s
--------|-------------------------------------|-------|--------|-----|-----|-----|-----|-------|-----------
 GET     /api/v1/clients/                        1234        0 |    4    2   12    4 |   41.1        0.00
 GET     /api/v1/clients/search?q=Esperanza       823        0 |    5    3   15    5 |   27.4        0.00
 POST    /api/v1/clients/7/verify                 411        0 |    6    3   18    5 |   13.7        0.00
 GET     /api/v1/internet-plans/                  411        0 |    3    2    9    3 |   13.7        0.00
--------|-------------------------------------|-------|--------|-----|-----|-----|-----|-------|-----------
         Aggregated                              2879        0 |    5    2   18    4 |   95.9        0.00

Response time percentiles (approximated)
 Type    Name                                  50%  66%  75%  80%  90%  95%  98%  99%  99.9%  99.99%  100%  # reqs
--------|-------------------------------------|----|----|----|----|----|----|----|----|------|-------|-----|-------
 GET     /api/v1/clients/                        4    5    5    6    7    8   10   11    12      12    12    1234
 GET     /api/v1/clients/search?q=Esperanza      5    6    6    7    8    9   12   14    15      15    15     823
 POST    /api/v1/clients/7/verify                5    6    7    8    9   11   14   16    18      18    18     411
 GET     /api/v1/internet-plans/                 3    3    4    4    5    6    7    8     9       9     9     411
--------|-------------------------------------|----|----|----|----|----|----|----|----|------|-------|-----|-------
         Aggregated                              4    5    6    6    8    9   11   13    15      18    18    2879

Interpreting Results

Key Metrics

Requests per Second (req/s)

Meaning: Number of requests the API handles per second.
Target: Over 40 RPS (documented benchmark).
Example:
req/s: 41.1
This indicates 41.1 requests/second for the /api/v1/clients/ endpoint.

Failure Rate

Meaning: Percentage of requests that failed (4xx/5xx errors).
Target: 0.00% on cached routes.
Example:
# fails: 0
failures/s: 0.00
Zero failures indicates stable performance.

Response Times

Meaning: Average and median response times in milliseconds.
Target:
  • Cached routes: Less than 10ms
  • Uncached routes: Less than 1000ms
Example:
Avg: 4ms
Med: 4ms
This shows the cache is working: 4ms is typical for cached responses.

Percentiles

Meaning: Response time at various percentiles.
  • P50 (median): 50% of requests are faster than this
  • P95: 95% of requests are faster than this
  • P99: 99% of requests are faster than this
Target: P95 under 10ms for cached routes.
Example:
50%: 4ms
95%: 8ms
99%: 11ms
This shows consistent performance with minimal outliers.
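
Percentiles like these can be reproduced from raw latencies with the standard library. The sample values below are illustrative, not real measurements:

```python
import statistics

# Illustrative per-request latencies in milliseconds (not real data)
latencies = [2, 3, 3, 4, 4, 4, 5, 5, 6, 8, 9, 11]

# statistics.quantiles with n=100 returns the 1st..99th percentile cut points;
# the "inclusive" method treats the data as the whole population.
pct = statistics.quantiles(sorted(latencies), n=100, method="inclusive")
p50, p95, p99 = pct[49], pct[94], pct[98]

print(f"P50={p50:.1f}ms  P95={p95:.1f}ms  P99={p99:.1f}ms")
```

Note that Locust approximates percentiles from binned response times, so its reported values can differ slightly from an exact calculation over the raw samples.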

Web UI Metrics

The Locust web UI provides real-time charts:
  1. Total Requests per Second: Overall throughput
  2. Response Times: P50/P95 over time
  3. Number of Users: Current user count
  4. Failures: Failed request count

Performance Benchmarks

Documented Performance: Empirical evaluation shows the server sustains over 40 requests per second (RPS) with a 0.00% failure rate on read-intensive cached routes under persistent load.

Typical Results

With proper caching:
Endpoint                                  RPS   Avg Response  P95   Failure Rate
GET /api/v1/clients/                      40+   4ms           8ms   0.00%
GET /api/v1/clients/search                30+   5ms           9ms   0.00%
POST /api/v1/clients/{client_id}/verify   15+   6ms           11ms  0.00%
GET /api/v1/internet-plans/               30+   3ms           6ms   0.00%

Cache Performance Validation

First load test run (cold cache):
GET /api/v1/clients/: Avg=750ms, P95=1200ms
Second load test run (warm cache):
GET /api/v1/clients/: Avg=4ms, P95=8ms
Improvement: ~187x faster with cache

Testing Cache Behavior

Test Cache TTL

1. Run Load Test

locust -f locustfile.py --host=http://localhost:8000 --users 20 --run-time 2m --headless
Observe response times (should be ~4ms).
2. Wait for Cache Expiry

Client cache TTL is 5 minutes. Wait 6 minutes.
3. Run Load Test Again

locust -f locustfile.py --host=http://localhost:8000 --users 20 --run-time 2m --headless
First few requests will be slower (~800ms) as cache refills, then drop to ~4ms.
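
The slow-then-fast pattern is exactly what a TTL-based cache produces. The project's actual caching decorator is not shown on this page; the sketch below is a hypothetical minimal TTL cache, included only to illustrate the behavior:

```python
import functools
import time

def ttl_cache(ttl_seconds):
    """Hypothetical minimal TTL cache decorator (positional args only).

    Not the project's real decorator; it illustrates why the first
    requests after expiry are slow while the cache refills.
    """
    def decorator(func):
        cache = {}  # args tuple -> (result, timestamp)

        @functools.wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            entry = cache.get(args)
            if entry is not None and now - entry[1] < ttl_seconds:
                return entry[0]       # fresh: serve from cache (the ~4ms path)
            result = func(*args)      # cold or expired: slow backend call (the ~800ms path)
            cache[args] = (result, now)
            return result

        return wrapper
    return decorator

@ttl_cache(ttl_seconds=300)  # 5-minute TTL, matching the documented client cache
def fetch_clients():
    # Stand-in for the slow WispHub backend call
    return ["client-1", "client-2"]
```

Once the TTL elapses, the next call misses the cache, pays the backend cost once, and refills the entry, which is why only the first few requests of the second run are slow.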

Test Concurrent Cache Access

Simulate many users hitting cached data simultaneously:
locust -f locustfile.py \
  --host=http://localhost:8000 \
  --users 100 \
  --spawn-rate 20 \
  --run-time 3m \
  --headless
Expected: No cache-related errors, consistent performance.
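
Beyond end-to-end load testing, the same property can be spot-checked in-process: many threads reading one cached key should all observe the same value with no errors. A small sketch using the stdlib `functools.lru_cache` (not the project's decorator):

```python
import functools
from concurrent.futures import ThreadPoolExecutor

call_count = 0

@functools.lru_cache(maxsize=128)
def cached_lookup(client_id: int) -> str:
    """Stand-in for a cached backend lookup."""
    global call_count
    call_count += 1
    return f"client-{client_id}"

# 100 concurrent readers all requesting the same cached key
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(cached_lookup, [7] * 100))

# Every reader saw the same value; only the initial misses hit the "backend"
assert len(set(results)) == 1
print(f"backend calls: {call_count}")
```

A handful of initial calls may miss concurrently before the cache fills, but after that all readers are served from the cache.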

Stress Testing

Find Breaking Point

Gradually increase load to find the failure threshold:
# Start conservative
locust -f locustfile.py --host=http://localhost:8000 --users 50 --run-time 2m --headless

# Increase load
locust -f locustfile.py --host=http://localhost:8000 --users 100 --run-time 2m --headless

# Push further
locust -f locustfile.py --host=http://localhost:8000 --users 200 --run-time 2m --headless

# Find limit
locust -f locustfile.py --host=http://localhost:8000 --users 500 --run-time 2m --headless
Watch for:
  • Increased error rates
  • Rising response times
  • Server resource exhaustion

Resource Monitoring

During stress tests, monitor server resources:
# CPU and memory
docker stats wisphub_api_server

# Or for local server
top -p $(pgrep -f uvicorn)

Distributed Load Testing

For testing beyond a single machine’s capacity:

Master Node

locust -f locustfile.py --host=http://localhost:8000 --master

Worker Nodes

On other machines:
locust -f locustfile.py --worker --master-host=<master-ip>
The master aggregates results from all workers.

Custom Load Test Scenarios

Create Custom Scenario

Add to locustfile.py:
class HeavyVerificationUser(HttpUser):
    """Simulates users doing mostly identity verification"""
    wait_time = between(0.5, 1.5)  # Faster paced
    
    @task(10)
    def verify_identity(self):
        self.client.post("/api/v1/clients/7/verify", json={
            "address": "BELLAVISTA",
            "internet_plan_price": 40000.0
        })
    
    @task(1)
    def search_client(self):
        self.client.get("/api/v1/clients/search?q=Test")
Run specific user class:
locust -f locustfile.py HeavyVerificationUser --host=http://localhost:8000

Test Specific Endpoints

import random

class ClientSearchUser(HttpUser):
    """Focus on the search endpoint"""
    wait_time = between(1, 2)
    
    @task
    def search_random(self):
        queries = ["Esperanza", "Rodriguez", "Martinez", "Lopez"]
        query = random.choice(queries)
        self.client.get(f"/api/v1/clients/search?q={query}")

Continuous Load Testing

Integrate load testing into CI/CD:
.github/workflows/load-test.yml
name: Load Test

on:
  schedule:
    - cron: '0 2 * * *'  # Run daily at 2 AM

jobs:
  load-test:
    runs-on: ubuntu-latest
    
    steps:
    - uses: actions/checkout@v4
    
    - name: Start API
      run: |
        docker-compose up -d
        sleep 10  # Wait for startup
    
    - name: Install Locust
      run: pip install locust==2.43.3
    
    - name: Run Load Test
      run: |
        locust -f locustfile.py \
          --host=http://localhost:8000 \
          --users 50 \
          --spawn-rate 5 \
          --run-time 5m \
          --headless \
          --csv=results/load_test
    
    - name: Check Results
      run: |
        # Fail if average response time exceeds 50ms
        python scripts/check_load_test_results.py results/load_test_stats.csv
    
    - name: Upload Results
      uses: actions/upload-artifact@v4
      with:
        name: load-test-results
        path: results/

Troubleshooting

Issue: High Failure Rate

Symptoms: Many 500 errors, high failure percentage.
Possible causes:
  • Server overloaded (reduce users)
  • WispHub Net timeout (increase httpx timeout)
  • Worker crash (check logs)
Solution: Check server logs and reduce concurrent users.

Issue: Slow Response Times

Symptoms: All requests over 100ms, even cached ones.
Possible causes:
  • Cache not working (check cache decorators)
  • CPU throttling (insufficient resources)
  • Network latency (test locally)
Solution: Verify cache is enabled and server has adequate resources.

Issue: Inconsistent Results

Symptoms: Wide variance in response times.
Possible causes:
  • Cache warming period
  • Background processes
  • Garbage collection pauses
Solution: Run longer tests (over 5 minutes) to see stable patterns.

Best Practices

Start Small

Begin with 10-20 users and gradually increase to find limits

Monitor Resources

Watch CPU, memory, and network during tests

Use Realistic Data

Test with production-like data volumes and patterns

Test Regularly

Run load tests before releases and on schedule

Testing Overview

Learn about unit and integration testing

Caching System

Understand the caching implementation
