Stress Testing Overview
ML Defender has been validated with 36,000 events across progressive stress tests, demonstrating production-grade stability and graceful degradation under extreme load.
- Events Processed: 36,000 total events
- Peak Throughput: 364.9 events/second
- Errors: 0 crypto/decompression errors
Test Design
Progressive Load Tests (Day 52)
From source/README.md:179-203:
ML Defender underwent 4 progressive stress tests on Day 52:
| Test | Events | Rate | CPU | Duration | Result |
|------|--------|------|-----|----------|--------|
| 1 | 1,000 | 42.6/sec | N/A | ~23 sec | ✅ PASS |
| 2 | 5,000 | 94.9/sec | N/A | ~53 sec | ✅ PASS |
| 3 | 10,000 | 176.1/sec | 41-45% | ~57 sec | ✅ PASS |
| 4 | 20,000 | 364.9/sec | 49-54% | ~55 sec | ✅ PASS |
Total Results (36K events):

```
crypto_errors: 0            ← Perfect ChaCha20-Poly1305 pipeline
decompression_errors: 0     ← Perfect LZ4 pipeline
protobuf_parse_errors: 0    ← Perfect message parsing
ipset_successes: 118        ← First ~1000 IPs blocked
ipset_failures: 16,681      ← Capacity limit (not a bug)
max_queue_depth: 16,690     ← Backpressure handled gracefully
```
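As a quick sanity check, the insert-failure rate implied by the counters above can be recomputed from the 118/16,681 figures:

```shell
# Recompute the ipset insert-failure rate from the counters above.
awk 'BEGIN {
    successes = 118; failures = 16681
    printf "insert failure rate: %.1f%%\n", 100 * failures / (successes + failures)
}'
# → insert failure rate: 99.3%
```

This confirms that almost all insert attempts past the capacity limit failed, while the pipeline itself kept running error-free.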
Key Discoveries
Production-Ready Components:
✅ Crypto pipeline: 0 errors at 36K events
✅ CPU efficiency: 54% max under extreme load
✅ Memory stable: 127 MB RSS
✅ Graceful degradation: No crashes when capacity exceeded
Capacity Planning Insights:
- IPSet capacity is finite (realistic max: 500K IPs)
- After ~1,000 IPs, insertions fail (expected behavior)
- System exhibits backpressure without crashing
8-Hour Stability Test
Test Configuration
From source/stress_test_8h.sh:
```shell
# Test parameters
TEST_DURATION_MINUTES=480   # 8 hours
TRAFFIC_RATE_PPS=75         # 75 packets/second target
MONITORING_INTERVAL=60      # monitor every 60 s
```

Components tested:
- sniffer (eBPF/XDP packet capture)
- ml-detector (RandomForest inference)
- Synthetic traffic generator
- Resource monitor
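From these parameters, the target packet volume for a full run can be estimated (a sketch; actual counts depend on generator jitter and phase-specific rates):

```shell
# Expected packet volume for the full 8-hour run at the target rate.
TEST_DURATION_MINUTES=480
TRAFFIC_RATE_PPS=75
echo $(( TEST_DURATION_MINUTES * 60 * TRAFFIC_RATE_PPS ))  # target packets
# → 2160000
```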
Test Phases
From source/stress_test_8h.sh:575-613 (traffic_generator_full.sh):
Phase 1: Warm-up (30 min)
- Low load, gradual increase
- HTTP/HTTPS requests (5/interval)
- DNS queries (10/interval)
- ICMP ping (3/interval)

Phase 2: Normal Load (2 hours)
- Mixed protocols
- HTTP/HTTPS: 10 requests/interval
- DNS queries: 15/interval
- Ping traffic: 5/interval
- Simulates typical network behavior

Phase 3: Stress Testing (1.5 hours)
- High sustained load (3-min cycles)
- HTTP/HTTPS: 20 requests/interval
- DNS: 30/interval
- Stress bursts: 50 concurrent requests/minute
- High-entropy traffic: 10/interval

Phase 4: Ransomware Simulation (1 hour)
- Fake C2 connections: 10/interval
- SMB lateral movement: 15/interval
- Encrypted payloads: 20/interval
- Tests detection accuracy under attack

Phase 5: Sustained Load (3 hours)
- Continuous moderate traffic
- HTTP/HTTPS: 12/interval
- DNS: 20/interval
- Tests long-term stability

Phase 6: Cool Down (30 min)
- Gradual traffic reduction
- Verify clean shutdown
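The schedule above can be sketched as a helper that maps elapsed test time to the active phase (a hypothetical function, not part of stress_test_8h.sh; boundaries follow the durations listed above):

```shell
# Map elapsed test minutes to a phase name (illustrative sketch).
phase_for_minute() {
    local m=$1
    if   (( m < 30  )); then echo "warmup"      # Phase 1: first 30 min
    elif (( m < 150 )); then echo "normal"      # Phase 2: +2 h
    elif (( m < 240 )); then echo "stress"      # Phase 3: +1.5 h
    elif (( m < 300 )); then echo "ransomware"  # Phase 4: +1 h
    elif (( m < 480 )); then echo "sustained"   # Phase 5: +3 h
    else                     echo "cooldown"    # Phase 6
    fi
}

phase_for_minute 10    # → warmup
phase_for_minute 260   # → ransomware
```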
Running Stress Tests
Quick Stress Test (10 min)
```shell
# Start all components
make run-lab-dev

# In another terminal, run the stress test
cd tools/build
./synthetic_ml_output_injector 10000 200   # 10,000 events at 200 events/sec

# Monitor firewall logs
tail -f /vagrant/logs/lab/firewall-agent.log | grep "events_processed"
```
Full 8-Hour Test
Automated Script
```shell
# Run full 8-hour stress test
./stress_test_8h.sh
```

Test artifacts saved to:

```
/vagrant/stress_test_<timestamp>/
├── logs/
│   ├── sniffer.log
│   ├── ml_detector.log
│   ├── traffic.log
│   └── monitor.log
├── monitoring/
│   ├── cpu_usage.csv
│   ├── memory_usage.csv
│   └── network_stats.csv
└── REPORT.md
```
Load Profiles
Hospital Benchmark
From source/scripts/day11_hospital_benchmark/:
ML Defender includes realistic hospital traffic profiles:
Electronic Health Records

Simulates typical EHR system traffic (`scripts/day11_hospital_benchmark/traffic_profiles/ehr_load.sh`).

Characteristics:
- HTTP/HTTPS requests to the EHR API
- Database queries (simulated)
- Average: 50 requests/sec
- Peak: 120 requests/sec

Emergency Department Traffic

Simulates emergency room rush hours (`scripts/day11_hospital_benchmark/traffic_profiles/emergency_test.sh`).

Characteristics:
- Burst traffic (200+ req/sec)
- Multiple concurrent admissions
- High database load
- Duration: 15-30 minutes

Medical Imaging (PACS)

Simulates large file transfers such as X-rays and MRIs (`scripts/day11_hospital_benchmark/traffic_profiles/pacs_burst.sh`).

Characteristics:
- Large payload sizes (10-100 MB)
- Sustained throughput
- Network bandwidth test
- Duration: 1-2 hours
Custom Traffic Mix
From source/stress_test_8h.sh:616-629:
```
# Traffic distribution
Protocol Distribution:
  HTTP/HTTPS: 40%
  DNS: 30%
  ICMP: 15%
  SMB (TCP 445): 10%
  Other: 5%

Payload Types:
  Normal text: 25%
  Encrypted: 50%
  Random: 20%
  PE executable: 5%
```
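One way to realize this distribution in a generator is cumulative-weight selection over a random percentile (a sketch; the actual traffic_generator_full.sh implementation may differ):

```shell
# Pick a protocol for r in [0, 100) using the cumulative weights above.
pick_protocol() {
    local r=$1
    if   (( r < 40 )); then echo "http"   # 40% HTTP/HTTPS
    elif (( r < 70 )); then echo "dns"    # 30% DNS
    elif (( r < 85 )); then echo "icmp"   # 15% ICMP
    elif (( r < 95 )); then echo "smb"    # 10% SMB (TCP 445)
    else                    echo "other"  # 5% everything else
    fi
}

pick_protocol $(( RANDOM % 100 ))
```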
Metrics Collection
Monitoring Script
From source/stress_test_monitor.sh:
```shell
#!/bin/bash
# Resource monitoring for stress tests

SNIFFER_PID=$1
ML_DETECTOR_PID=$2
OUTPUT_DIR=$3
INTERVAL=$4  # seconds

mkdir -p "$OUTPUT_DIR"
echo "timestamp,cpu_sniffer,cpu_detector,mem_sniffer_mb,mem_detector_mb" > "$OUTPUT_DIR/resources.csv"

while true; do
    TIMESTAMP=$(date +%s)

    # CPU usage
    CPU_SNIFFER=$(ps -p "$SNIFFER_PID" -o %cpu= | tr -d ' ')
    CPU_DETECTOR=$(ps -p "$ML_DETECTOR_PID" -o %cpu= | tr -d ' ')

    # Memory usage (RSS in MB)
    MEM_SNIFFER=$(ps -p "$SNIFFER_PID" -o rss= | awk '{print $1/1024}')
    MEM_DETECTOR=$(ps -p "$ML_DETECTOR_PID" -o rss= | awk '{print $1/1024}')

    echo "$TIMESTAMP,$CPU_SNIFFER,$CPU_DETECTOR,$MEM_SNIFFER,$MEM_DETECTOR" >> "$OUTPUT_DIR/resources.csv"

    # Network stats
    ifconfig eth0 | grep -E 'RX packets|TX packets' > "$OUTPUT_DIR/network_$TIMESTAMP.txt"

    sleep "$INTERVAL"
done
```
Collected Metrics
- CPU usage per component (%)
- Memory usage (RSS, MB)
- Network throughput (packets/sec, bytes/sec)
- Disk I/O (reads/writes)
- Context switches
- Events processed (total count)
- Processing rate (events/sec)
- Queue depth (current backlog)
- Error counts (crypto, decompression, parse)
- Latency (p50, p95, p99)
- IPSet insertions (successes/failures)
- IPs blocked (total count)
- Iptables rule evaluations
- Batch processing stats
17-Hour Sniffer Stability Test
From source/TESTING.md:164-182:
```
╔═══════════════════════════════════════════════════════════════╗
║           17-HOUR STABILITY TEST - FINAL RESULTS              ║
╚═══════════════════════════════════════════════════════════════╝

Total Runtime:           17h 2m 10s (61,343 seconds)
Total Packets Processed: 2,080,549
Payloads Analyzed:       1,550,375 (74.5%)
Peak Throughput:         82.35 events/second
Average Throughput:      33.92 events/second
Memory Footprint:        4.5 MB (STABLE)
CPU Usage (load):        5-10%
CPU Usage (idle):        0%
Kernel Panics:           0
Segmentation Faults:     0
Memory Leaks:            0
Process Restarts:        0

Status: ✅ PRODUCTION-READY
```
Component Latency Breakdown
From source/TESTING.md:282-296:
| Component | Latency | Cumulative |
|-----------|---------|------------|
| eBPF capture | <1 μs | 1 μs |
| Ring buffer | <1 μs | 2 μs |
| PayloadAnalyzer (fast) | 1 μs | 3 μs |
| FastDetector | <1 μs | 4 μs |
| Protobuf serialize | ~10 μs | 14 μs |
| ZMQ PUSH | ~50 μs | 64 μs |

End-to-end latency: ~64 μs (normal path)
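The cumulative figure can be checked by summing the per-component values (using 1 μs for the sub-microsecond entries, as the cumulative column does):

```shell
# Sum the per-component latencies from the table above (μs).
printf '%s\n' 1 1 1 1 10 50 | awk '{ sum += $1 } END { print sum " us total" }'
# → 64 us total
```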
Capacity Planning
Resource Recommendations
Based on 36K event stress testing and 17-hour stability validation:
Small Deployment (Home/SMB):
- CPU: 2 cores @ 2.5 GHz
- RAM: 4 GB
- Network: 100 Mbps
- Capacity: 50-100 events/sec

Medium Deployment (Enterprise):
- CPU: 4 cores @ 3.0 GHz
- RAM: 8 GB
- Network: 1 Gbps
- Capacity: 200-500 events/sec

Large Deployment (Hospital/ISP):
- CPU: 8 cores @ 3.5 GHz
- RAM: 16 GB
- Network: 10 Gbps
- Capacity: 1000+ events/sec
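For provisioning scripts, the tiers above can be collapsed into a simple lookup (a sketch; thresholds are taken from the capacity figures above and are illustrative):

```shell
# Map an expected peak event rate (events/sec) to a deployment tier.
tier_for_rate() {
    local eps=$1
    if   (( eps <= 100 )); then echo "small (2 cores, 4 GB)"
    elif (( eps <= 500 )); then echo "medium (4 cores, 8 GB)"
    else                        echo "large (8 cores, 16 GB)"
    fi
}

tier_for_rate 365   # the 364.9 events/sec peak from the 36K-event test
# → medium (4 cores, 8 GB)
```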
IPSet Capacity Planning
From source/README.md:196-203:
```
IPSet Limits:
- Default max: 65,536 IPs
- Realistic max: 500,000 IPs
- Tested capacity: 1,000 IPs (before failures)

Recommendations:
- Implement multi-tier storage (IPSet → SQLite → Parquet)
- Auto-eviction policy (LRU, time-based)
- Capacity monitoring and alerts
```
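A time-based eviction policy can be approximated with ipset's built-in timeout support (a configuration sketch, assuming root and a set named `ml_defender_blocklist`; the actual set name used by the firewall agent may differ):

```shell
# Create a blocklist with per-entry expiry instead of unbounded growth.
# Entries are evicted automatically once their timeout elapses.
ipset create ml_defender_blocklist hash:ip maxelem 500000 timeout 3600

# Add an IP that expires after the default timeout (1 hour).
ipset add ml_defender_blocklist 203.0.113.7

# Inspect set headers (entry count, memory) for capacity alerts.
ipset list ml_defender_blocklist -terse
```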
Stress Test Analysis
Automated Report Generation
From source/stress_test_8h.sh:138-200:
```shell
# Generate test report
generate_report() {
    REPORT="${TEST_DIR}/REPORT.md"

    cat > "${REPORT}" << EOF
# ML Defender - 8 Hour Stress Test Report

## Test Summary
- Start: $(grep "Start time" "${TEST_DIR}/test_info.txt" | cut -d: -f2-)
- Duration: ${HOURS}h ${MINUTES}m ${SECONDS}s
- Components: Sniffer + ML-Detector

## Results

### Event Processing
\`\`\`
Total events: $(grep "events_processed" "${LOGS_DIR}/ml_detector.log" | tail -1 | awk '{print $NF}')
Average rate: $(echo "scale=2; $(grep "events_processed" "${LOGS_DIR}/ml_detector.log" | tail -1 | awk '{print $NF}') / ${ACTUAL_RUNTIME}" | bc) events/sec
Peak rate: $(grep "events/sec" "${LOGS_DIR}/ml_detector.log" | awk '{print $NF}' | sort -n | tail -1) events/sec
\`\`\`

### Resource Usage
\`\`\`
Peak CPU: $(awk -F ',' '{print $2}' "${MONITORING_DIR}/resources.csv" | sort -n | tail -1)%
Peak Memory: $(awk -F ',' '{print $4}' "${MONITORING_DIR}/resources.csv" | sort -n | tail -1) MB
Average CPU: $(awk -F ',' '{sum+=$2; count++} END {print sum/count}' "${MONITORING_DIR}/resources.csv")%
Average Memory: $(awk -F ',' '{sum+=$4; count++} END {print sum/count}' "${MONITORING_DIR}/resources.csv") MB
\`\`\`

### Errors
\`\`\`
Crashes: 0
Segfaults: 0
Memory leaks: 0
Crypto errors: $(grep -c "crypto_error" "${LOGS_DIR}/ml_detector.log")
\`\`\`

## Conclusion
✅ Test PASSED - System stable under 8-hour stress test
EOF

    echo "✅ Report generated: ${REPORT}"
}
```
Next Steps
- Testing Guide: run unit and integration tests
- Performance Tuning: optimize system performance
- Monitoring: set up production monitoring
- Troubleshooting: debug performance issues