## Overview
The server deployment configuration is optimized for high-throughput batch processing on machines with abundant resources. This mode prioritizes speed and model quality over memory conservation.

## Resource Configuration
### Default Constraints
Server deployments leverage available compute and memory:

- Memory limit: 4096 MB (4 GB)
- Compute units: 1.0 (100% utilization)
- Chunk size: 256 rows per chunk
- Batch size: 512 rows
- Parallel jobs: 4 (concurrent processing)
### Performance Characteristics
- Higher throughput: 500-2000+ rows/second depending on hardware
- Lower latency: Larger chunks reduce overhead
- Better model quality: Full compute capacity enables more sophisticated features
- Parallel benchmarking: Multiple constraint experiments run concurrently
## Configuration

### Server Configuration Template
Location: `configs/pipeline.server.template.json`
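Its resource-related keys mirror the parameter table in this section; a representative fragment (the actual template may contain additional keys) looks like:

```json
{
  "chunk_size": 256,
  "batch_size": 512,
  "n_jobs": 4,
  "max_memory_mb": 4096,
  "max_compute_units": 1.0,
  "benchmark_runs": 5,
  "spill_to_disk": false
}
```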
### Configuration Parameters
| Parameter | Value | Purpose |
|---|---|---|
| `chunk_size` | 256 | Larger chunks for better throughput |
| `batch_size` | 512 | Larger batches reduce model update overhead |
| `n_jobs` | 4 | Parallel execution for constraint experiments |
| `max_memory_mb` | 4096 | Generous memory allocation |
| `max_compute_units` | 1.0 | Full CPU utilization |
| `benchmark_runs` | 5 | More runs for statistical significance |
| `spill_to_disk` | false | Memory is sufficient; avoids I/O overhead |
## Running on Servers

### Basic Deployment
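A typical invocation (the `--config` flag name is an assumption; adjust to the pipeline's actual CLI):

```shell
python run_pipeline.py --config configs/pipeline.server.template.json
```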
### High-Memory Configuration
For servers with 16+ GB RAM, raise `max_memory_mb` (for example, to 8192) in the configuration.

### Maximum Parallelism
Utilize all CPU cores:
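For example (the `--config` flag is an assumption about the CLI; `--n-jobs` is from this guide):

```shell
python run_pipeline.py --config configs/pipeline.server.template.json --n-jobs -1
```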
`--n-jobs -1` uses all available CPU cores. This significantly accelerates constraint experiments but may increase timing variance.

## Performance Optimization
### Scaling Parameters
The pipeline automatically adjusts batch and chunk sizes based on resource constraints:

- Memory factor: 1.0 (4096 / 1024, capped at 1.0)
- Compute factor: 1.0
- No downscaling applied; full performance
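A simplified sketch of this scaling rule (the 1024 MB baseline and the capping behavior are inferred from the factors above, not taken from the pipeline source):

```python
def scaled_sizes(base_chunk: int, base_batch: int,
                 max_memory_mb: int, max_compute_units: float) -> tuple[int, int]:
    """Downscale chunk/batch sizes when resources are scarce; never upscale."""
    memory_factor = min(1.0, max_memory_mb / 1024)   # 4096 MB -> capped at 1.0
    compute_factor = min(1.0, max_compute_units)
    factor = min(memory_factor, compute_factor)
    return max(1, int(base_chunk * factor)), max(1, int(base_batch * factor))

print(scaled_sizes(256, 512, 4096, 1.0))  # server profile: (256, 512)
print(scaled_sizes(256, 512, 512, 1.0))   # a 512 MB budget halves both sizes
```

With the server profile both factors are 1.0, so the configured sizes pass through unchanged.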
### Parallel Benchmark Execution
Constraint experiments run in parallel when `n_jobs > 1`.
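The dispatch logic can be sketched as follows (a hypothetical shape; `run_experiment` is a stand-in for one constraint experiment, not the pipeline's actual API):

```python
from concurrent.futures import ProcessPoolExecutor

def run_experiment(constraint: dict) -> dict:
    # Stand-in for one constraint experiment (configure, run, benchmark).
    return {"chunk_size": constraint["chunk_size"], "status": "ok"}

def run_all(constraints: list[dict], n_jobs: int = 4) -> list[dict]:
    if n_jobs == 1:
        # Serial path: slower, but with the most stable timing measurements.
        return [run_experiment(c) for c in constraints]
    with ProcessPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(run_experiment, constraints))

results = run_all([{"chunk_size": s} for s in (64, 128, 256)], n_jobs=1)
print([r["chunk_size"] for r in results])  # [64, 128, 256]
```

Process-based parallelism avoids GIL contention, at the cost of cross-run timing interference noted above.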
## Deployment Architecture

### Recommended Server Specifications
Minimum:

- 4 CPU cores
- 8 GB RAM
- 10 GB free disk space
- SSD for faster CSV ingestion

Recommended:

- 8+ CPU cores
- 16+ GB RAM
- 50 GB free disk space
- NVMe SSD for optimal I/O
### Docker Deployment
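A minimal example Dockerfile (the base image, file names, and `--config` flag are assumptions about the project layout):

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
ENTRYPOINT ["python", "run_pipeline.py", "--config", "configs/pipeline.server.template.json"]
```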
When containerizing, mount a volume for `reports/` and `benchmarks/` so artifacts persist outside the container.

## Cloud Deployment
### AWS EC2
Recommended instance: m5.xlarge (4 vCPUs, 16 GB RAM)

### GCP Compute Engine
Recommended machine: n2-standard-4 (4 vCPUs, 16 GB RAM)

## Energy and Telemetry
### RAPL Energy Monitoring
On Intel/AMD servers, the pipeline uses RAPL (Running Average Power Limit) for accurate energy measurements:

- Batch mode: ~45J per run
- Streaming mode: ~30J per run
### Hardware Telemetry
Each run captures:

- CPU utilization snapshots
- Memory usage before/after
- Energy consumption (RAPL or estimate)
- Operator-level profiling
## Deployment Checklist

### Analyze Performance
Review artifacts:

- `reports/pipeline_report.json` - Overall metrics
- `benchmarks/constraint_experiment.csv` - Performance sweep
- `benchmarks/*.png` - Visualization plots
### Tune for Production
Based on baseline:

- Increase `chunk_size` if memory allows
- Increase `n_jobs` for faster sweeps
- Adjust `benchmark_runs` for desired confidence
## Troubleshooting

### Low Throughput
Symptom: Throughput below 500 rows/second on capable hardware

Solutions:

- Increase `chunk_size` to 512 or higher
- Verify SSD storage (not HDD) for CSV ingestion
- Check for background processes consuming CPU/memory
- Disable `spill_to_disk` if accidentally enabled
### Memory Pressure

Symptom: Unexpected memory-exceeded warnings

Solutions:

- Verify actual available memory: `free -m`
- Check for memory leaks in other processes
- Increase `max_memory_mb` to 8192 or higher
- Review chunk metrics in `reports/streaming_chunks.jsonl`
### Timing Variance in Benchmarks

Symptom: High standard deviation in latency measurements

Solutions:

- Increase `benchmark_runs` to 10+ for more stable estimates
- Reduce `n_jobs` to 1 for strict timing stability
- Pin CPU affinity to avoid scheduler interference
- Run during off-peak hours to reduce contention
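On Linux, CPU pinning can be done with `taskset` (the `--config` flag is an assumption about the CLI):

```shell
# Pin the pipeline to cores 0-3 to reduce scheduler-induced variance
taskset -c 0-3 python run_pipeline.py --config configs/pipeline.server.template.json
```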
### RAPL Unavailable

Symptom: Energy measurements show fallback estimates

Solutions:

- Run with elevated privileges: `sudo python run_pipeline.py ...`
- In Docker, use the `--privileged` flag
- In VMs, enable MSR access in hypervisor settings
- Accept fallback estimates (still useful for relative comparisons)
## Best Practices
### Disable disk spill on servers

Set `spill_to_disk: false` to avoid unnecessary I/O overhead. Server memory should be sufficient to hold all intermediate results.

### Use parallel execution for experiments
Set `n_jobs: 4` or higher to accelerate constraint experiments. This reduces total benchmark time by 3-4x.

### Persist artifacts to network storage
Configure `output_dir` to point to NFS/S3 for long-term retention:
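For example, with artifacts on an NFS mount (the path is illustrative):

```json
{
  "output_dir": "/mnt/nfs/pipeline-artifacts"
}
```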
### Monitor with external tools
For production deployments, integrate with monitoring:
- Prometheus for metrics collection
- Grafana for visualization
- Parse `pipeline_report.json` for custom dashboards

### Run reproducibility checks
Validate deterministic behavior:
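One way to check (the `--config` flag and fixed-seed behavior are assumptions; timing fields in the report will naturally differ between runs, so compare only the deterministic fields):

```shell
python run_pipeline.py --config configs/pipeline.server.template.json
cp reports/pipeline_report.json /tmp/run1.json
python run_pipeline.py --config configs/pipeline.server.template.json
diff /tmp/run1.json reports/pipeline_report.json  # compare non-timing fields
```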
## Related Documentation
- Edge Device Deployment - Resource-constrained configuration
- Output Artifacts - Understanding pipeline outputs
- Hardware Profiling - Detailed telemetry analysis