The Problem with Single-Connection Downloads
Most browsers open one HTTP connection per download. Servers typically limit the bandwidth allocated to a single connection to ensure fair distribution among users. If a server limits each connection to 5 MB/s, a single-connection download will never exceed that speed, even if your network supports 100 MB/s.

Surge overcomes this limit by:
- Opening up to 32 parallel connections to the same server
- Splitting the file into chunks and downloading them concurrently
- Dynamically optimizing these connections during the download
Parallel Connections (Up to 32)
Surge uses HTTP range requests to download different parts of a file simultaneously.

How It Works
- Probe the server to check if it supports range requests (`Accept-Ranges: bytes`)
- Calculate optimal worker count using a square root heuristic
- Split the file into large chunks (fileSize / numWorkers)
- Spawn workers (goroutines) that download their assigned chunks in parallel
Examples:

- 100 MB file → √100 = 10 connections
- 1 GB file → √1024 ≈ 32 connections (capped at max)
- 10 MB file → √10 ≈ 3 connections
The square root heuristic balances connection overhead with parallelism. Too many connections create overhead; too few leave bandwidth unused.
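The heuristic above can be sketched in Go. The function name and exact rounding are assumptions; only the square-root rule, the cap of 32, and the fileSize / numWorkers chunk split come from the text.

```go
package main

import (
	"fmt"
	"math"
)

const maxWorkers = 32

// workerCount grows the connection count with the square root of the
// file size in MB, capped at maxWorkers.
func workerCount(fileSize int64) int {
	sizeMB := float64(fileSize) / (1 << 20)
	n := int(math.Sqrt(sizeMB))
	if n < 1 {
		n = 1
	}
	if n > maxWorkers {
		n = maxWorkers
	}
	return n
}

func main() {
	fmt.Println(workerCount(100 << 20)) // 100 MB -> 10
	fmt.Println(workerCount(1 << 30))   // 1 GB -> 32 (capped)
	fmt.Println(workerCount(10 << 20))  // 10 MB -> 3

	// Chunk size is then fileSize / numWorkers (step 3 above).
	n := workerCount(1 << 30)
	fmt.Println(int64(1<<30) / int64(n) >> 20) // 32 MB chunks
}
```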
Large Chunks Strategy
Unlike some download managers that use tiny chunks (e.g., 1 MB), Surge creates large initial chunks:

- Reduces per-request overhead (fewer HTTP handshakes)
- Minimizes file I/O operations
- Enables effective work stealing (see below)
Work Stealing
Near the end of a download, fast workers finish early while slow workers continue. Surge reallocates idle workers to steal work from busy ones.

The Problem
Imagine a download with 10 workers:

- Workers 1-9 finish in 30 seconds
- Worker 10 is stuck on a slow connection, taking 2 minutes
- Result: 9 idle workers while 1 struggles
Surge’s Solution
- Worker A has 20 MB remaining
- Worker B is idle
- Surge splits Worker A’s task: A keeps first 10 MB, B steals last 10 MB
- Both workers now work in parallel
How does the split avoid conflicts?
Workers use atomic operations to update their `StopAt` boundary. This ensures workers never write the same bytes twice.

When Work Stealing Happens
Surge runs a balancer goroutine every 200 milliseconds. Work stealing only occurs when chunks are large enough to split (minimum 1 MB). If chunks are too small, Surge uses hedge requests instead (see below).
Health Checks for Slow Workers
Not all HTTP connections are equal. CDNs, load balancers, and network routing can cause some connections to be significantly slower than others.

Detection Algorithm
Surge monitors each worker’s speed and compares it to the mean:

- Worker speed < 0.3× mean speed → Cancel and restart
- No data received for 15 seconds → Cancel as stalled
Health checks run every 5 seconds but only after a grace period (first 3 seconds) to allow TCP slow-start to stabilize.
What Happens to Cancelled Workers?
When a slow worker is cancelled:

- Remaining bytes are calculated from `CurrentOffset`
- Task is re-queued with updated offset
- Next idle worker picks up the task
- New connection may get a faster route to the server
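The re-queue step above can be sketched as follows. Only `CurrentOffset` comes from the text; the `Task` shape, channel queue, and `requeue` helper are assumptions.

```go
package main

import "fmt"

// Task is a plain byte-range job for this sketch.
type Task struct {
	Start, End, CurrentOffset int64
}

// requeue rebuilds a cancelled worker's task so the next idle worker
// resumes from the last confirmed byte rather than restarting.
func requeue(q chan Task, t Task) {
	q <- Task{Start: t.CurrentOffset, End: t.End, CurrentOffset: t.CurrentOffset}
}

func main() {
	q := make(chan Task, 1)
	requeue(q, Task{Start: 0, End: 100 << 20, CurrentOffset: 60 << 20})
	nt := <-q
	fmt.Println(nt.Start>>20, nt.End>>20) // resumes at 60 MB, ends at 100 MB
}
```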
Hedge Requests
When work stealing isn’t possible (chunks too small), Surge uses hedged requests: duplicate tasks on fresh connections.

The Problem
Toward the end of a download:

- Remaining chunks are too small to split (e.g., 500 KB)
- Work stealing doesn’t help
- One slow worker blocks completion
Surge’s Solution
- Original worker (Worker A) continues downloading the 900-1000 MB range
- Idle worker (Worker B) starts a new connection for the same bytes
- Both workers race; whichever finishes first wins
- Progress is deduplicated using the `SharedMaxOffset` atomic counter
Deduplication
Both workers write to the same file offsets, but Surge ensures bytes are only counted once.

Hedge requests are inspired by Google’s “The Tail at Scale” paper. By racing redundant requests on fresh connections, we eliminate tail latency from slow connections.
Won't this waste bandwidth?
Minimally. Hedge requests only trigger:
- Near the end of the download (>90% complete)
- When idle workers exist (no cost if all workers busy)
- Only once per task (marked with the `Hedged` flag)
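The three trigger conditions above can be combined into one predicate. The 90% threshold, idle-worker check, and `Hedged` flag come from the text; the function and parameter names are assumptions.

```go
package main

import "fmt"

// shouldHedge reports whether a task qualifies for a hedge request:
// near the end of the download, with idle capacity, and not yet hedged.
func shouldHedge(progress float64, idleWorkers int, alreadyHedged bool) bool {
	return progress > 0.9 && idleWorkers > 0 && !alreadyHedged
}

func main() {
	fmt.Println(shouldHedge(0.95, 2, false)) // true: near the end, idle capacity
	fmt.Println(shouldHedge(0.50, 2, false)) // false: too early
	fmt.Println(shouldHedge(0.95, 0, false)) // false: no idle workers
	fmt.Println(shouldHedge(0.95, 2, true))  // false: already hedged once
}
```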
Multiple Mirror Support
Surge can download from multiple sources simultaneously, distributing workers across all available mirrors.

Adding Mirrors
Worker Distribution
Workers are assigned to mirrors using round-robin:

- Workers 0, 3, 6, 9 → Mirror 1
- Workers 1, 4, 7, 10 → Mirror 2
- Workers 2, 5, 8, 11 → Mirror 3
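The round-robin assignment above is a simple modulo. This is a sketch; the `mirrorFor` name is an assumption.

```go
package main

import "fmt"

// mirrorFor assigns a worker to a mirror round-robin, matching the
// table above (workers 0, 3, 6, 9 all land on mirror 0, and so on).
func mirrorFor(workerID, numMirrors int) int {
	return workerID % numMirrors
}

func main() {
	for w := 0; w < 6; w++ {
		fmt.Printf("worker %d -> mirror %d\n", w, mirrorFor(w, 3)+1)
	}
}
```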
Automatic Failover
If a worker fails on one mirror, it automatically switches to the next:

- Redundancy: If one mirror goes down, others continue
- Speed: Aggregate bandwidth from multiple sources
- Load balancing: Distributes load across CDN endpoints
Surge probes all mirrors before starting the download. Invalid mirrors are automatically filtered out.
Sequential Download (Streaming Mode)
For media files, Surge offers a sequential download option that preserves strict byte order.

Enabling Sequential Mode
How It Works
- Small chunks (2 MB) instead of large shards
- Strict queue order (FIFO task queue)
- No work stealing across chunk boundaries
- Still parallel (multiple workers, but ordered completion)
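The FIFO task queue above can be sketched as a channel of ordered 2 MB ranges. The chunk size comes from the text; the `sequentialTasks` helper and range representation are assumptions.

```go
package main

import "fmt"

const chunkSize = 2 << 20 // 2 MB chunks, per the text

// sequentialTasks emits [start, end) byte ranges in strict file order
// on a channel; workers consuming it get tasks FIFO.
func sequentialTasks(fileSize int64) <-chan [2]int64 {
	ch := make(chan [2]int64)
	go func() {
		defer close(ch)
		for off := int64(0); off < fileSize; off += chunkSize {
			end := off + chunkSize
			if end > fileSize {
				end = fileSize // final partial chunk
			}
			ch <- [2]int64{off, end}
		}
	}()
	return ch
}

func main() {
	for r := range sequentialTasks(5 << 20) { // 5 MB file
		fmt.Println(r[0]>>20, r[1]>>20) // prints ranges in MB: 0-2, 2-4, 4-5
	}
}
```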
Summary: Why Surge is Fast
Parallel Connections
Up to 32 simultaneous connections aggregate bandwidth beyond per-connection limits
Work Stealing
Idle workers steal tasks from slow workers, eliminating bottlenecks
Health Checks
Slow connections are cancelled and retried, ensuring all workers are fast
Hedge Requests
Race duplicate tasks on fresh connections to eliminate tail latency
Multiple Mirrors
Distribute workers across mirrors for redundancy and aggregated bandwidth
Large Chunks
Minimize overhead with large initial chunks, split dynamically as needed
Benchmarks
These optimizations result in significant speedups:

| Tool | Time | Speed | vs Surge |
|---|---|---|---|
| Surge | 28.93s | 35.40 MB/s | — |
| aria2c | 40.04s | 25.57 MB/s | 1.38× slower |
| curl | 57.57s | 17.79 MB/s | 1.99× slower |
| wget | 61.81s | 16.57 MB/s | 2.14× slower |
Test conditions: 1 GB file, Windows 11, Ryzen 5 5600X, 360 Mbps network. Results averaged over 5 runs.