Edge Device Deployment

Overview

The edge device deployment configuration is optimized for resource-constrained environments such as embedded systems, IoT devices, or low-power compute nodes. This mode prioritizes memory safety and resilience over maximum throughput.

Memory and Compute Constraints

Default Constraints

Edge deployments operate under strict resource limits:

Memory limit: 256 MB maximum
Compute units: 0.4 (40% of available compute capacity)
Chunk size: 64 rows per chunk
Batch size: 96 rows
Parallel jobs: 1 (sequential processing)

Adaptive Resource Management

The pipeline automatically adjusts to available resources:

Adaptive chunk resizing: Enabled by default
Maximum chunk retries: 3 attempts with exponential backoff
Spill to disk: Enabled to prevent out-of-memory failures
Memory monitoring: Per-chunk memory tracking with automatic downscaling

Configuration

Edge Configuration Template

Location: configs/pipeline.edge.template.json

{
  "random_seed": 42,
  "chunk_size": 64,
  "batch_size": 96,
  "n_jobs": 1,
  "max_memory_mb": 256,
  "max_compute_units": 0.4,
  "benchmark_runs": 3,
  "adaptive_chunk_resize": true,
  "max_chunk_retries": 3,
  "spill_to_disk": true,
  "output_dir": "artifacts_edge"
}

Configuration Parameters

Parameter	Value	Purpose
`chunk_size`	64	Small chunks reduce memory pressure
`batch_size`	96	Conservative batch size for model updates
`n_jobs`	1	Sequential processing avoids contention
`max_memory_mb`	256	Hard memory cap for safety
`max_compute_units`	0.4	Leave headroom for system processes
`adaptive_chunk_resize`	true	Automatic downscaling on memory pressure
`spill_to_disk`	true	Persist intermediate results to storage

Running on Edge Devices

Basic Deployment

cd "NBA Data Preprocessing/task"
python run_pipeline.py \
  --input ../data/nba2k-full.csv \
  --config-template ../../configs/pipeline.edge.template.json

Custom Memory Limit

python run_pipeline.py \
  --input ../data/nba2k-full.csv \
  --config-template ../../configs/pipeline.edge.template.json \
  --max-memory-mb 128

Disable Disk Spill (Risk)

python run_pipeline.py \
  --input ../data/nba2k-full.csv \
  --config-template ../../configs/pipeline.edge.template.json \
  --no-spill

Performance Characteristics

Expected Behavior

Latency: Higher per-row latency due to small chunks and retry logic
Throughput: 50-200 rows/second depending on hardware
Memory: Stays within 256 MB limit with adaptive resizing
Energy: ~30J estimated per run on typical ARM devices

Failure Modes

Common edge device failures:

Out of memory: Occurs when adaptive_chunk_resize is disabled
Slow I/O: CSV ingestion becomes bottleneck on SD cards
RAPL unavailable: Energy counters not available; fallback estimate used
Model quality degradation: Aggressive downscaling can reduce R² scores

Adaptive Chunk Resizing

When memory exceeds the limit, the pipeline automatically:

Detects memory pressure after processing each chunk
Splits pending chunk in half (minimum 16 rows)
Retries up to 3 times with exponential backoff (50ms, 100ms, 200ms)
Processes smaller chunk with reduced memory footprint

From engine.py:251-258:

if memory_exceeded and self.config.adaptive_chunk_resize and retries < self.config.max_chunk_retries and len(chunk) > 16:
    retries += 1
    split = max(16, len(chunk) // 2)
    pending_chunks.insert(0, chunk.iloc[split:].copy())
    chunk = chunk.iloc[:split].copy()
    current_chunk_size = split
    time.sleep(min(0.05 * retries, 0.2))
    continue

Disk Spill Strategy

When spill_to_disk is enabled, intermediate results are persisted:

X features: intermediate/stream_chunk_{id}_X.csv
y targets: intermediate/stream_chunk_{id}_y.csv

This prevents data loss during memory pressure but increases I/O amplification.

Disk spill adds 10-50% latency overhead on slow storage (e.g., SD cards) but provides resilience against OOM failures.

Deployment Checklist

Verify System Requirements

Python 3.8+ installed
Required dependencies from requirements.txt
At least 256 MB available RAM
Storage space for input data + artifacts (typically 50-100 MB)

Test Memory Constraints

# Run with strict memory limit
python run_pipeline.py \
  --input ../data/nba2k-full.csv \
  --max-memory-mb 128 \
  --benchmark-runs 1

Monitor First Run

Check for:

Memory exceeded warnings in reports/streaming_chunks.jsonl
Retry count in chunk metrics
Spill file creation in intermediate/ directory

Tune Configuration

Adjust based on first run:

Reduce chunk_size if frequent retries occur
Reduce max_memory_mb if system becomes unstable
Disable spill_to_disk if storage is limited

Validate Reproducibility

# Run twice with same seed
python run_pipeline.py --input ../data/nba2k-full.csv --random-seed 42
python run_pipeline.py --input ../data/nba2k-full.csv --random-seed 42

# Compare fingerprints
diff artifacts_edge/metadata/run_manifest.json artifacts_edge_2/metadata/run_manifest.json

Set Up Persistent Storage

Configure external storage for long-term artifact retention:

--output-dir /mnt/external/nba_artifacts

Troubleshooting

High Memory Usage

Symptom: Memory exceeds limit frequently Solutions:

Reduce chunk_size to 32 or 16
Lower max_memory_mb to trigger adaptive resizing earlier
Enable spill_to_disk if disabled

Slow Processing

Symptom: Throughput below 50 rows/second Solutions:

Check storage I/O (disable spill_to_disk if possible)
Increase chunk_size slightly if memory allows
Verify no background processes consuming CPU

Model Quality Degradation

Symptom: R² score significantly lower than server deployment Solutions:

Increase max_compute_units to 0.5 or 0.6
Increase benchmark_runs for more stable estimates
Review chunk metrics for excessive retries

Best Practices

Always enable adaptive resizing

Keep adaptive_chunk_resize: true to prevent out-of-memory failures. Only disable if you have strict latency requirements and have validated memory safety.

Use disk spill on production devices

Enable spill_to_disk: true for production deployments to ensure resilience. The latency overhead is acceptable compared to pipeline failures.

Run validation tests first

Before deploying to edge devices, run unit tests:

cd "NBA Data Preprocessing/task"
python -m unittest discover -s test -p 'test_*.py'

Monitor energy consumption

On ARM devices without RAPL, the pipeline uses a fallback estimate (~30J). For precise measurements, use external power meters.

Server Deployment - High-throughput configuration
Output Artifacts - Understanding pipeline outputs
Hardware Profiling - Detailed telemetry analysis

Get Started

Core Concepts

Pipeline Stages

Configuration

Performance

Deployment

Overview

Memory and Compute Constraints

Default Constraints

Adaptive Resource Management

Configuration

Edge Configuration Template

Configuration Parameters

Running on Edge Devices

Basic Deployment

Custom Memory Limit

Disable Disk Spill (Risk)

Performance Characteristics

Expected Behavior

Failure Modes

Adaptive Chunk Resizing

Disk Spill Strategy

Deployment Checklist

Troubleshooting

High Memory Usage

Slow Processing

Model Quality Degradation

Best Practices

Build docs developers (and LLMs) love

Get Started

Core Concepts

Pipeline Stages

Configuration

Performance

Deployment

​Overview

​Memory and Compute Constraints

​Default Constraints

​Adaptive Resource Management

​Configuration

​Edge Configuration Template

​Configuration Parameters

​Running on Edge Devices

​Basic Deployment

​Custom Memory Limit

​Disable Disk Spill (Risk)

​Performance Characteristics

​Expected Behavior

​Failure Modes

​Adaptive Chunk Resizing

​Disk Spill Strategy

​Deployment Checklist

​Troubleshooting

​High Memory Usage

​Slow Processing

​Model Quality Degradation

​Best Practices

​Related Documentation

Build docs developers (and LLMs) love

Overview

Memory and Compute Constraints

Default Constraints

Adaptive Resource Management

Configuration

Edge Configuration Template

Configuration Parameters

Running on Edge Devices

Basic Deployment

Custom Memory Limit

Disable Disk Spill (Risk)

Performance Characteristics

Expected Behavior

Failure Modes

Adaptive Chunk Resizing

Disk Spill Strategy

Deployment Checklist

Troubleshooting

High Memory Usage

Slow Processing

Model Quality Degradation

Best Practices

Related Documentation