Overview

The edge device deployment configuration is optimized for resource-constrained environments such as embedded systems, IoT devices, or low-power compute nodes. This mode prioritizes memory safety and resilience over maximum throughput.

Memory and Compute Constraints

Default Constraints

Edge deployments operate under strict resource limits:
  • Memory limit: 256 MB maximum
  • Compute units: 0.4 (40% of available compute capacity)
  • Chunk size: 64 rows per chunk
  • Batch size: 96 rows
  • Parallel jobs: 1 (sequential processing)

Adaptive Resource Management

The pipeline automatically adjusts to available resources:
  • Adaptive chunk resizing: Enabled by default
  • Maximum chunk retries: 3 attempts with exponential backoff
  • Spill to disk: Enabled to prevent out-of-memory failures
  • Memory monitoring: Per-chunk memory tracking with automatic downscaling
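
The monitoring-and-downscaling loop can be sketched as follows. This is an illustrative sketch using only the standard library, not the pipeline's actual implementation; the function names are hypothetical, and `resource` is Unix-only:

```python
import resource
import sys

def peak_memory_mb() -> float:
    """Peak resident set size of this process in MB (Linux/macOS only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak / (1024 * 1024) if sys.platform == "darwin" else peak / 1024

def next_chunk_size(current: int, limit_mb: float, minimum: int = 16) -> int:
    """Halve the chunk size when peak memory exceeds the configured limit."""
    if peak_memory_mb() > limit_mb and current > minimum:
        return max(minimum, current // 2)
    return current
```

With the default 256 MB cap, a check like this after each chunk is what drives the automatic downscaling described above.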

Configuration

Edge Configuration Template

Location: configs/pipeline.edge.template.json
{
  "random_seed": 42,
  "chunk_size": 64,
  "batch_size": 96,
  "n_jobs": 1,
  "max_memory_mb": 256,
  "max_compute_units": 0.4,
  "benchmark_runs": 3,
  "adaptive_chunk_resize": true,
  "max_chunk_retries": 3,
  "spill_to_disk": true,
  "output_dir": "artifacts_edge"
}

Configuration Parameters

Parameter               Value   Purpose
chunk_size              64      Small chunks reduce memory pressure
batch_size              96      Conservative batch size for model updates
n_jobs                  1       Sequential processing avoids contention
max_memory_mb           256     Hard memory cap for safety
max_compute_units       0.4     Leave headroom for system processes
adaptive_chunk_resize   true    Automatic downscaling on memory pressure
spill_to_disk           true    Persist intermediate results to storage

Running on Edge Devices

Basic Deployment

cd "NBA Data Preprocessing/task"
python run_pipeline.py \
  --input ../data/nba2k-full.csv \
  --config-template ../../configs/pipeline.edge.template.json

Custom Memory Limit

python run_pipeline.py \
  --input ../data/nba2k-full.csv \
  --config-template ../../configs/pipeline.edge.template.json \
  --max-memory-mb 128

Disable Disk Spill (Risk)

python run_pipeline.py \
  --input ../data/nba2k-full.csv \
  --config-template ../../configs/pipeline.edge.template.json \
  --no-spill
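
The override semantics of these flags can be modeled with argparse: a flag left unspecified yields None, so the template value wins; a flag that is given overrides it. This is a sketch of how such a parser could be built, not run_pipeline.py's actual argument handling:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of a CLI where omitted flags defer to the config template."""
    p = argparse.ArgumentParser()
    p.add_argument("--input", required=True)
    p.add_argument("--config-template")
    # default=None means "not provided"; only explicit values override.
    p.add_argument("--max-memory-mb", type=int, default=None)
    # --no-spill stores False; absent, the template's spill_to_disk applies.
    p.add_argument("--no-spill", dest="spill_to_disk",
                   action="store_false", default=None)
    return p
```

Merging is then a matter of applying every non-None argument on top of the loaded template.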

Performance Characteristics

Expected Behavior

  • Latency: Higher per-row latency due to small chunks and retry logic
  • Throughput: 50-200 rows/second depending on hardware
  • Memory: Stays within 256 MB limit with adaptive resizing
  • Energy: ~30 J estimated per run on typical ARM devices

Failure Modes

Common edge device failures:
  • Out of memory: Occurs when adaptive_chunk_resize is disabled
  • Slow I/O: CSV ingestion becomes bottleneck on SD cards
  • RAPL unavailable: Energy counters not available; fallback estimate used
  • Model quality degradation: Aggressive downscaling can reduce R² scores

Adaptive Chunk Resizing

When memory exceeds the limit, the pipeline automatically:
  1. Detects memory pressure after processing each chunk
  2. Splits pending chunk in half (minimum 16 rows)
  3. Retries up to 3 times with exponential backoff (50ms, 100ms, 200ms)
  4. Processes smaller chunk with reduced memory footprint
From engine.py:251-258:
if memory_exceeded and self.config.adaptive_chunk_resize and retries < self.config.max_chunk_retries and len(chunk) > 16:
    retries += 1
    split = max(16, len(chunk) // 2)
    pending_chunks.insert(0, chunk.iloc[split:].copy())
    chunk = chunk.iloc[:split].copy()
    current_chunk_size = split
    time.sleep(min(0.05 * retries, 0.2))
    continue
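
The halving schedule implied by this logic can be simulated standalone: starting from the default 64-row chunk, two rounds of memory pressure reach the 16-row floor (a sketch, independent of engine.py):

```python
def split_schedule(chunk_len: int, max_retries: int = 3, floor: int = 16):
    """Return the sequence of chunk sizes processed after repeated splits."""
    sizes = []
    retries = 0
    while retries < max_retries and chunk_len > floor:
        retries += 1
        split = max(floor, chunk_len // 2)  # mirrors max(16, len(chunk) // 2)
        sizes.append(split)
        chunk_len = split
    return sizes

print(split_schedule(64))  # → [32, 16]
```

Note that the retry budget (3) and the floor (16 rows) bound how far a chunk can shrink; the deferred half of each split is requeued, not dropped.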

Disk Spill Strategy

When spill_to_disk is enabled, intermediate results are persisted:
  • X features: intermediate/stream_chunk_{id}_X.csv
  • y targets: intermediate/stream_chunk_{id}_y.csv
This prevents data loss during memory pressure but increases I/O amplification.
Disk spill adds 10-50% latency overhead on slow storage (e.g., SD cards) but provides resilience against OOM failures.

Deployment Checklist

1. Verify System Requirements

  • Python 3.8+ installed
  • Required dependencies from requirements.txt
  • At least 256 MB available RAM
  • Storage space for input data + artifacts (typically 50-100 MB)

2. Test Memory Constraints

# Run with strict memory limit
python run_pipeline.py \
  --input ../data/nba2k-full.csv \
  --max-memory-mb 128 \
  --benchmark-runs 1

3. Monitor First Run

Check for:
  • Memory exceeded warnings in reports/streaming_chunks.jsonl
  • Retry count in chunk metrics
  • Spill file creation in intermediate/ directory

4. Tune Configuration

Adjust based on first run:
  • Reduce chunk_size if frequent retries occur
  • Reduce max_memory_mb if system becomes unstable
  • Disable spill_to_disk if storage is limited

5. Validate Reproducibility

# Run twice with the same seed, writing to separate output directories
python run_pipeline.py --input ../data/nba2k-full.csv --random-seed 42
python run_pipeline.py --input ../data/nba2k-full.csv --random-seed 42 --output-dir artifacts_edge_2

# Compare fingerprints
diff artifacts_edge/metadata/run_manifest.json artifacts_edge_2/metadata/run_manifest.json

6. Set Up Persistent Storage

Configure external storage for long-term artifact retention:
--output-dir /mnt/external/nba_artifacts

Troubleshooting

High Memory Usage

Symptom: Memory exceeds limit frequently
Solutions:
  • Reduce chunk_size to 32 or 16
  • Lower max_memory_mb to trigger adaptive resizing earlier
  • Enable spill_to_disk if disabled

Slow Processing

Symptom: Throughput below 50 rows/second
Solutions:
  • Check storage I/O (disable spill_to_disk if possible)
  • Increase chunk_size slightly if memory allows
  • Verify no background processes consuming CPU

Model Quality Degradation

Symptom: R² score significantly lower than server deployment
Solutions:
  • Increase max_compute_units to 0.5 or 0.6
  • Increase benchmark_runs for more stable estimates
  • Review chunk metrics for excessive retries

Best Practices

Keep adaptive_chunk_resize: true to prevent out-of-memory failures. Only disable if you have strict latency requirements and have validated memory safety.
Enable spill_to_disk: true for production deployments to ensure resilience. The latency overhead is acceptable compared to pipeline failures.
Before deploying to edge devices, run unit tests:
cd "NBA Data Preprocessing/task"
python -m unittest discover -s test -p 'test_*.py'
On ARM devices without RAPL, the pipeline uses a fallback estimate (~30 J). For precise measurements, use external power meters.
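
The read-with-fallback strategy can be sketched as follows. The powercap path is the conventional Linux RAPL location and the 30 J constant mirrors the documented fallback, but the pipeline's actual energy probe may differ:

```python
from pathlib import Path

# Conventional Linux RAPL package-energy counter (cumulative microjoules).
RAPL_ENERGY = Path("/sys/class/powercap/intel-rapl:0/energy_uj")
FALLBACK_JOULES = 30.0  # documented per-run fallback estimate

def read_energy_joules() -> float:
    """Read the RAPL counter, falling back to the estimate when unavailable."""
    try:
        return int(RAPL_ENERGY.read_text()) / 1e6  # microjoules → joules
    except (OSError, ValueError):
        return FALLBACK_JOULES
```
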
