## Overview
The server deployment configuration is optimized for high-throughput batch processing on machines with abundant resources. This mode prioritizes speed and model quality over memory conservation.

## Resource Configuration
### Default Constraints
Server deployments leverage available compute and memory:

- Memory limit: 4096 MB (4 GB)
- Compute units: 1.0 (100% utilization)
- Chunk size: 256 rows per chunk
- Batch size: 512 rows
- Parallel jobs: 4 (concurrent processing)
### Performance Characteristics
- Higher throughput: 500-2000+ rows/second depending on hardware
- Lower latency: Larger chunks reduce overhead
- Better model quality: Full compute capacity enables more sophisticated features
- Parallel benchmarking: Multiple constraint experiments run concurrently
## Configuration

### Server Configuration Template
Location: `configs/pipeline.server.template.json`
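Its resource-related keys mirror the parameter table in this section; a representative fragment (the actual template may contain additional keys) looks like:

```json
{
  "chunk_size": 256,
  "batch_size": 512,
  "n_jobs": 4,
  "max_memory_mb": 4096,
  "max_compute_units": 1.0,
  "benchmark_runs": 5,
  "spill_to_disk": false
}
```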
### Configuration Parameters
| Parameter | Value | Purpose |
|---|---|---|
| `chunk_size` | 256 | Larger chunks for better throughput |
| `batch_size` | 512 | Larger batches reduce model update overhead |
| `n_jobs` | 4 | Parallel execution for constraint experiments |
| `max_memory_mb` | 4096 | Generous memory allocation |
| `max_compute_units` | 1.0 | Full CPU utilization |
| `benchmark_runs` | 5 | More runs for statistical significance |
| `spill_to_disk` | false | Memory is sufficient; avoids I/O overhead |
## Running on Servers

### Basic Deployment
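A typical invocation (the `--config` flag name is an assumption; adjust to the pipeline's actual CLI):

```shell
python run_pipeline.py --config configs/pipeline.server.template.json
```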
### High-Memory Configuration
For servers with 16+ GB RAM, raise `max_memory_mb` (for example, to 8192) in the configuration.

### Maximum Parallelism
Utilize all CPU cores:
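For example (the `--config` flag is an assumption about the CLI; `--n-jobs` is from this guide):

```shell
python run_pipeline.py --config configs/pipeline.server.template.json --n-jobs -1
```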
`--n-jobs -1` uses all available CPU cores. This significantly accelerates constraint experiments but may increase timing variance.

## Performance Optimization
### Scaling Parameters
The pipeline automatically adjusts batch and chunk sizes based on resource constraints:

- Memory factor: 1.0 (4096 / 1024, capped at 1.0)
- Compute factor: 1.0
- No downscaling applied; full performance
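A simplified sketch of this scaling rule (the 1024 MB baseline and the capping behavior are inferred from the factors above, not taken from the pipeline source):

```python
def scaled_sizes(base_chunk: int, base_batch: int,
                 max_memory_mb: int, max_compute_units: float) -> tuple[int, int]:
    """Downscale chunk/batch sizes when resources are scarce; never upscale."""
    memory_factor = min(1.0, max_memory_mb / 1024)   # 4096 MB -> capped at 1.0
    compute_factor = min(1.0, max_compute_units)
    factor = min(memory_factor, compute_factor)
    return max(1, int(base_chunk * factor)), max(1, int(base_batch * factor))

print(scaled_sizes(256, 512, 4096, 1.0))  # server profile: (256, 512)
print(scaled_sizes(256, 512, 512, 1.0))   # a 512 MB budget halves both sizes
```

With the server profile both factors are 1.0, so the configured sizes pass through unchanged.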
### Parallel Benchmark Execution
Constraint experiments run in parallel when `n_jobs > 1`.
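The dispatch logic can be sketched as follows (a hypothetical shape; `run_experiment` is a stand-in for one constraint experiment, not the pipeline's actual API):

```python
from concurrent.futures import ProcessPoolExecutor

def run_experiment(constraint: dict) -> dict:
    # Stand-in for one constraint experiment (configure, run, benchmark).
    return {"chunk_size": constraint["chunk_size"], "status": "ok"}

def run_all(constraints: list[dict], n_jobs: int = 4) -> list[dict]:
    if n_jobs == 1:
        # Serial path: slower, but with the most stable timing measurements.
        return [run_experiment(c) for c in constraints]
    with ProcessPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(run_experiment, constraints))

results = run_all([{"chunk_size": s} for s in (64, 128, 256)], n_jobs=1)
print([r["chunk_size"] for r in results])  # [64, 128, 256]
```

Process-based parallelism avoids GIL contention, at the cost of cross-run timing interference noted above.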
## Deployment Architecture

### Recommended Server Specifications
Minimum:

- 4 CPU cores
- 8 GB RAM
- 10 GB free disk space
- SSD for faster CSV ingestion

Recommended:

- 8+ CPU cores
- 16+ GB RAM
- 50 GB free disk space
- NVMe SSD for optimal I/O
### Docker Deployment
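A minimal example Dockerfile (the base image, file names, and `--config` flag are assumptions about the project layout):

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENTRYPOINT ["python", "run_pipeline.py", "--config", "configs/pipeline.server.template.json"]
```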
When containerizing, mount a volume for `reports/` and `benchmarks/` so artifacts persist outside the container.

## Cloud Deployment
### AWS EC2
Recommended instance: m5.xlarge (4 vCPUs, 16 GB RAM)

### GCP Compute Engine
Recommended machine: n2-standard-4 (4 vCPUs, 16 GB RAM)

## Energy and Telemetry
### RAPL Energy Monitoring
On Intel/AMD servers, the pipeline uses RAPL (Running Average Power Limit) for accurate energy measurements:

- Batch mode: ~45J per run
- Streaming mode: ~30J per run
### Hardware Telemetry
Each run captures:

- CPU utilization snapshots
- Memory usage before/after
- Energy consumption (RAPL or estimate)
- Operator-level profiling
## Deployment Checklist

### Analyze Performance
Review artifacts:

- `reports/pipeline_report.json` - Overall metrics
- `benchmarks/constraint_experiment.csv` - Performance sweep
- `benchmarks/*.png` - Visualization plots
### Tune for Production
Based on baseline:

- Increase `chunk_size` if memory allows
- Increase `n_jobs` for faster sweeps
- Adjust `benchmark_runs` for desired confidence
## Troubleshooting

### Low Throughput
Symptom: Throughput below 500 rows/second on capable hardware

Solutions:

- Increase `chunk_size` to 512 or higher
- Verify SSD storage (not HDD) for CSV ingestion
- Check for background processes consuming CPU/memory
- Disable `spill_to_disk` if accidentally enabled
### Memory Pressure

Symptom: Unexpected memory-exceeded warnings

Solutions:

- Verify actual available memory: `free -m`
- Check for memory leaks in other processes
- Increase `max_memory_mb` to 8192 or higher
- Review chunk metrics in `reports/streaming_chunks.jsonl`
### Timing Variance in Benchmarks

Symptom: High standard deviation in latency measurements

Solutions:

- Increase `benchmark_runs` to 10+ for more stable estimates
- Reduce `n_jobs` to 1 for strict timing stability
- Pin CPU affinity to avoid scheduler interference
- Run during off-peak hours to reduce contention
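On Linux, CPU pinning can be done with `taskset` (the `--config` flag is an assumption about the CLI):

```shell
# Pin the pipeline to cores 0-3 to reduce scheduler-induced variance
taskset -c 0-3 python run_pipeline.py --config configs/pipeline.server.template.json
```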
### RAPL Unavailable

Symptom: Energy measurements show fallback estimates

Solutions:

- Run with elevated privileges: `sudo python run_pipeline.py ...`
- In Docker, use the `--privileged` flag
- In VMs, enable MSR access in hypervisor settings
- Accept fallback estimates (still useful for relative comparisons)
## Best Practices
### Disable disk spill on servers

Set `spill_to_disk: false` to avoid unnecessary I/O overhead. Server memory should be sufficient to hold all intermediate results.

### Use parallel execution for experiments
Set `n_jobs: 4` or higher to accelerate constraint experiments. This reduces total benchmark time by 3-4x.

### Persist artifacts to network storage
Configure `output_dir` to point to NFS/S3 for long-term retention:
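For example, with artifacts on an NFS mount (the path is illustrative):

```json
{
  "output_dir": "/mnt/nfs/pipeline-artifacts"
}
```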
### Monitor with external tools
For production deployments, integrate with monitoring:
- Prometheus for metrics collection
- Grafana for visualization
- Parse `pipeline_report.json` for custom dashboards

### Run reproducibility checks
Validate deterministic behavior:
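One way to check (the `--config` flag and fixed-seed behavior are assumptions; timing fields in the report will naturally differ between runs, so compare only the deterministic fields):

```shell
python run_pipeline.py --config configs/pipeline.server.template.json
cp reports/pipeline_report.json /tmp/run1.json
python run_pipeline.py --config configs/pipeline.server.template.json
diff /tmp/run1.json reports/pipeline_report.json  # compare non-timing fields
```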
## Related Documentation
- Edge Device Deployment - Resource-constrained configuration
- Output Artifacts - Understanding pipeline outputs
- Hardware Profiling - Detailed telemetry analysis