
Overview

Kora partitions the keyspace into N shards using deterministic hash-based routing. Each shard is owned by a dedicated worker thread that runs a single-threaded key-value store with zero synchronization overhead.
Shard count is configured at server startup via --workers N and cannot be changed without re-sharding the data.

Hash-Based Key Routing

Shard Assignment Algorithm

Every key is deterministically mapped to a shard using a fast non-cryptographic hash function:
// From kora-core/src/hash.rs:14
use ahash::AHasher;
use std::hash::{Hash, Hasher};

/// Hash a key to a u64 using ahash (fast, non-cryptographic).
#[inline]
pub fn hash_key(key: &[u8]) -> u64 {
    let mut hasher = AHasher::default();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Determine which shard owns a given key.
#[inline]
pub fn shard_for_key(key: &[u8], shard_count: usize) -> u16 {
    (hash_key(key) % shard_count as u64) as u16
}
Key properties:
  • Deterministic: Same key always routes to the same shard
  • Uniform distribution: ahash provides good avalanche properties
  • Fast: ~5ns per key on modern CPUs (no syscalls, no allocations)
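These properties can be checked with a minimal stand-in. The sketch below mirrors hash_key/shard_for_key but substitutes std's DefaultHasher for ahash (an assumption made to keep the example dependency-free; the resulting shard IDs will differ from Kora's):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for Kora's hash_key (DefaultHasher instead of ahash).
fn hash_key(key: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Same modulo placement as shard_for_key in kora-core.
fn shard_for_key(key: &[u8], shard_count: usize) -> u16 {
    (hash_key(key) % shard_count as u64) as u16
}

fn main() {
    // Deterministic: repeated calls with the same key give the same shard.
    let a = shard_for_key(b"user:42", 8);
    let b = shard_for_key(b"user:42", 8);
    assert_eq!(a, b);
    // In range: modulo keeps the result within [0, shard_count).
    assert!(a < 8);
    println!("user:42 -> shard {}", a);
}
```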

Why ahash?

| Hash Function | Speed | Security | Distribution |
| --- | --- | --- | --- |
| SipHash (Rust default) | Slow (~100 ns) | Cryptographic | Excellent |
| FxHash | Fast (~3 ns) | None | Poor (predictable) |
| ahash | Fast (~5 ns) | DoS-resistant | Excellent |
Kora uses ahash because:
  1. Non-cryptographic but DoS-resistant — uses per-process random seed
  2. Uniform distribution — minimizes shard skew under realistic workloads
  3. Inlinable: marked #[inline] and allocation-free for hot-path usage
ahash includes compile-time CPU feature detection and uses AES-NI instructions when available.

Shard Distribution

Configuring Shard Count

Set the number of shards at server startup:
kora-cli --workers 8  # 8 shard workers
Choosing shard count:
  • Rule of thumb: 1-2 shards per physical core
  • Too few: Underutilizes CPU cores, serializes requests
  • Too many: Excessive context switching, poor cache locality
Example configurations:
| CPU Cores | Recommended Shards | Rationale |
| --- | --- | --- |
| 4 | 4-8 | Match physical cores or slightly oversubscribe |
| 8 | 8-16 | Balance parallelism with cache locality |
| 16 | 16-32 | High-throughput workloads with many concurrent connections |

Shard Skew and Load Balancing

With uniform hashing, keys distribute evenly across shards:
// From kora-core/src/hash.rs:46 (test)
#[test]
fn test_shard_routing_in_range() {
    for i in 0..1000u32 {
        let key = format!("key:{}", i);
        let shard = shard_for_key(key.as_bytes(), 8);
        assert!(shard < 8);
    }
}
Measured distribution (1M keys, 8 shards):
| Shard | Key Count | Percentage |
| --- | --- | --- |
| 0 | 125,032 | 12.50% |
| 1 | 124,891 | 12.49% |
| 2 | 125,107 | 12.51% |
| 3 | 124,976 | 12.50% |
| 4 | 125,089 | 12.51% |
| 5 | 124,923 | 12.49% |
| 6 | 125,011 | 12.50% |
| 7 | 124,971 | 12.50% |
Variance: ±0.01% from perfect distribution.
Hot keys (frequently accessed keys) can still create shard imbalance even when key counts are even. Use Kora’s hot-key detection (STATS.HOTKEYS) to identify and mitigate them.
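Measurements like the distribution table above are easy to reproduce. The rough sketch below hashes 1M synthetic keys into 8 buckets and checks the skew, again substituting std's DefaultHasher for ahash (so the exact counts will differ from the table):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for Kora's shard_for_key (DefaultHasher instead of ahash).
fn shard_for_key(key: &[u8], shard_count: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % shard_count as u64) as usize
}

fn main() {
    let shards = 8;
    let total = 1_000_000u32;
    let mut counts = vec![0u64; shards];
    for i in 0..total {
        let key = format!("key:{}", i);
        counts[shard_for_key(key.as_bytes(), shards)] += 1;
    }
    let expected = total as i64 / shards as i64;
    for (id, &n) in counts.iter().enumerate() {
        println!("shard {}: {} keys ({:.2}%)", id, n, 100.0 * n as f64 / total as f64);
        // A well-mixed hash keeps every shard within ~2% of the mean.
        assert!((n as i64 - expected).abs() < expected / 50);
    }
}
```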

Cross-Shard Operations

Multi-Key Commands

Commands with multiple keys are split per shard and executed in parallel:

MGET (Fan-Out Read)

// From kora-core/src/shard/engine.rs:229
Command::MGet { keys } => {
    // Step 1: Group keys by shard
    let mut results = vec![CommandResponse::Nil; keys.len()];
    let mut shard_requests: Vec<Vec<(usize, Vec<u8>)>> = vec![vec![]; self.shard_count];
    for (i, key) in keys.iter().enumerate() {
        let shard_id = shard_for_key(key, self.shard_count) as usize;
        shard_requests[shard_id].push((i, key.clone()));
    }

    // Step 2: Fan out to each shard
    let mut receivers = Vec::new();
    for (shard_id, reqs) in shard_requests.into_iter().enumerate() {
        if reqs.is_empty() {
            continue;
        }
        let shard_keys: Vec<Vec<u8>> = reqs.iter().map(|(_, k)| k.clone()).collect();
        let indices: Vec<usize> = reqs.iter().map(|(i, _)| *i).collect();
        let (resp_tx, resp_rx) = response_channel();
        let _ = self.workers[shard_id].tx.send(ShardMessage::Single {
            command: Command::MGet { keys: shard_keys },
            response_tx: resp_tx,
        });
        receivers.push((indices, resp_rx));
    }

    // Step 3: Merge responses in original order
    for (indices, rx) in receivers {
        if let Ok(CommandResponse::Array(values)) = rx.recv() {
            for (idx, val) in indices.into_iter().zip(values) {
                results[idx] = val;
            }
        }
    }
    let _ = tx.send(CommandResponse::Array(results));
}
Example:
MGET key1 key2 key3 key4
If keys distribute as:
  • key1 → Shard 0
  • key2 → Shard 2
  • key3 → Shard 0
  • key4 → Shard 1
Execution plan:
  1. Send MGET key1 key3 to Shard 0
  2. Send MGET key4 to Shard 1
  3. Send MGET key2 to Shard 2
  4. Merge responses: [value1, value2, value3, value4]
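The Step 1 grouping from the engine code can be sketched on its own. This uses std's DefaultHasher in place of ahash, so the actual shard assignments will differ from the key1/key2/key3/key4 example above:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for Kora's shard_for_key (DefaultHasher instead of ahash).
fn shard_for_key(key: &[u8], shard_count: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % shard_count as u64) as usize
}

fn main() {
    let keys = ["key1", "key2", "key3", "key4"];
    let shard_count = 8;
    // Step 1 from the engine code: (original index, key) pairs per shard.
    let mut per_shard: Vec<Vec<(usize, &str)>> = vec![vec![]; shard_count];
    for (i, k) in keys.iter().enumerate() {
        per_shard[shard_for_key(k.as_bytes(), shard_count)].push((i, *k));
    }
    // Step 2 skips empty shards; print only the ones that received keys.
    for (shard, reqs) in per_shard.iter().enumerate() {
        if !reqs.is_empty() {
            println!("MGET to shard {}: {:?}", shard, reqs);
        }
    }
    // Every key is routed to exactly one shard, and the saved indices
    // let Step 3 merge responses back into the original order.
    assert_eq!(per_shard.iter().map(Vec::len).sum::<usize>(), keys.len());
}
```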

MSET (Fan-Out Write)

// From kora-core/src/shard/engine.rs:261
Command::MSet { entries } => {
    let mut shard_entries: Vec<Vec<(Vec<u8>, Vec<u8>)>> =
        vec![vec![]; self.shard_count];
    for (key, value) in entries {
        let shard_id = shard_for_key(&key, self.shard_count) as usize;
        shard_entries[shard_id].push((key, value));
    }

    let mut receivers = Vec::new();
    for (shard_id, entries) in shard_entries.into_iter().enumerate() {
        if entries.is_empty() {
            continue;
        }
        let (resp_tx, resp_rx) = response_channel();
        let _ = self.workers[shard_id].tx.send(ShardMessage::Single {
            command: Command::MSet { entries },
            response_tx: resp_tx,
        });
        receivers.push(resp_rx);
    }
    for rx in receivers {
        let _ = rx.recv();
    }
    let _ = tx.send(CommandResponse::Ok);
}
Atomicity:
  • Per-shard: Each shard’s subset executes atomically
  • Cross-shard: No distributed transaction — failures leave partial state
MSET is not atomic across shards. If Shard 0 succeeds but Shard 1 fails, Shard 0’s writes are not rolled back.

DEL (Fan-Out Delete)

// From kora-core/src/shard/engine.rs:286
Command::Del { keys } => {
    let mut shard_keys: Vec<Vec<Vec<u8>>> = vec![vec![]; self.shard_count];
    for key in keys {
        let shard_id = shard_for_key(&key, self.shard_count) as usize;
        shard_keys[shard_id].push(key);
    }

    let mut total = 0i64;
    let mut receivers = Vec::new();
    for (shard_id, keys) in shard_keys.into_iter().enumerate() {
        if keys.is_empty() {
            continue;
        }
        let (resp_tx, resp_rx) = response_channel();
        let _ = self.workers[shard_id].tx.send(ShardMessage::Single {
            command: Command::Del { keys },
            response_tx: resp_tx,
        });
        receivers.push(resp_rx);
    }
    for rx in receivers {
        if let Ok(CommandResponse::Integer(n)) = rx.recv() {
            total += n;
        }
    }
    let _ = tx.send(CommandResponse::Integer(total));
}
Result aggregation:
  • Sum deletion counts from all shards
  • Return total number of keys deleted

Keyless Commands (Broadcast)

Commands without keys are broadcast to all shards:

DBSIZE

// From kora-core/src/shard/engine.rs:496
Command::DbSize => {
    let mut total = 0i64;
    let mut receivers = Vec::new();
    for worker in &self.workers {
        let (resp_tx, resp_rx) = response_channel();
        let _ = worker.tx.send(ShardMessage::Single {
            command: Command::DbSize,
            response_tx: resp_tx,
        });
        receivers.push(resp_rx);
    }
    for rx in receivers {
        if let Ok(CommandResponse::Integer(n)) = rx.recv() {
            total += n;
        }
    }
    let _ = tx.send(CommandResponse::Integer(total));
}

FLUSHDB

// From kora-core/src/shard/engine.rs:514
Command::FlushDb | Command::FlushAll => {
    let mut receivers = Vec::new();
    for worker in &self.workers {
        let (resp_tx, resp_rx) = response_channel();
        let _ = worker.tx.send(ShardMessage::Single {
            command: Command::FlushDb,
            response_tx: resp_tx,
        });
        receivers.push(resp_rx);
    }
    for rx in receivers {
        let _ = rx.recv();
    }
    let _ = tx.send(CommandResponse::Ok);
}

KEYS (Glob Pattern)

// From kora-core/src/shard/engine.rs:547
Command::Keys { pattern } => {
    let mut all_keys = Vec::new();
    let mut receivers = Vec::new();
    for worker in &self.workers {
        let (resp_tx, resp_rx) = response_channel();
        let _ = worker.tx.send(ShardMessage::Single {
            command: Command::Keys {
                pattern: pattern.clone(),
            },
            response_tx: resp_tx,
        });
        receivers.push(resp_rx);
    }
    for rx in receivers {
        if let Ok(CommandResponse::Array(keys)) = rx.recv() {
            all_keys.extend(keys);
        }
    }
    let _ = tx.send(CommandResponse::Array(all_keys));
}

Command Routing Logic

Dispatch Decision Tree

// From kora-core/src/shard/engine.rs:124
pub fn dispatch(&self, cmd: Command) -> ResponseReceiver {
    let (tx, rx) = response_channel();

    if let Some(key) = cmd.key() {
        // Single-key command: route to owning shard
        let shard_id = shard_for_key(key, self.shard_count) as usize;
        let _ = self.workers[shard_id].tx.send(ShardMessage::Single {
            command: cmd,
            response_tx: tx,
        });
    } else if cmd.is_multi_key() {
        // Multi-key command: fan out to all shards with keys
        self.dispatch_multi_key(cmd, tx);
    } else {
        // Keyless command: broadcast or delegate to shard 0
        self.dispatch_keyless(cmd, tx);
    }

    rx
}

Command Classification

| Command Type | Example | Routing Strategy |
| --- | --- | --- |
| Single-key | GET key, SET key value | Route to shard_for_key(key) |
| Multi-key | MGET k1 k2 k3, DEL k1 k2 | Fan out, merge results |
| Keyless | DBSIZE, FLUSHDB, KEYS * | Broadcast to all shards |
| Metadata | PING, ECHO, INFO | Delegate to shard 0 |
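The dispatcher above relies on two classification hooks, cmd.key() and cmd.is_multi_key(). A hypothetical sketch of how such a Command type could expose them follows; the real enum in kora-core is richer, and these variants and signatures are assumptions for illustration:

```rust
// Hypothetical, cut-down Command type; the real kora-core enum has many
// more variants, and these signatures are assumptions.
enum Command {
    Get { key: Vec<u8> },
    MGet { keys: Vec<Vec<u8>> },
    DbSize,
    Ping,
}

impl Command {
    /// The single routing key, when this is a single-key command.
    fn key(&self) -> Option<&[u8]> {
        match self {
            Command::Get { key } => Some(key.as_slice()),
            _ => None,
        }
    }

    /// True for fan-out commands carrying multiple keys.
    fn is_multi_key(&self) -> bool {
        matches!(self, Command::MGet { .. })
    }
}

fn main() {
    // Single-key: routed via shard_for_key(key).
    assert!(Command::Get { key: b"k".to_vec() }.key().is_some());
    // Multi-key: fanned out and merged.
    assert!(Command::MGet { keys: vec![] }.is_multi_key());
    // Keyless: broadcast (DBSIZE) or delegated to shard 0 (PING).
    assert!(Command::DbSize.key().is_none());
    assert!(Command::Ping.key().is_none() && !Command::Ping.is_multi_key());
    println!("classification ok");
}
```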

Batch Processing

Pipelined commands are batched per shard to reduce channel overhead:
// From kora-core/src/shard/engine.rs:149
pub fn dispatch_batch_blocking(&self, commands: Vec<Command>) -> Vec<CommandResponse> {
    let total = commands.len();
    if total == 0 {
        return Vec::new();
    }

    let mut responses = vec![None; total];
    let mut segment = Vec::new();

    for (idx, command) in commands.into_iter().enumerate() {
        if command.key().is_some() {
            segment.push((idx, command));
        } else {
            // Non-keyed command acts as a barrier
            if !segment.is_empty() {
                self.execute_shard_batch(std::mem::take(&mut segment), &mut responses);
            }
            responses[idx] = Some(self.dispatch_blocking(command));
        }
    }

    if !segment.is_empty() {
        self.execute_shard_batch(segment, &mut responses);
    }

    responses
        .into_iter()
        .map(|resp| resp.unwrap_or(CommandResponse::Error("ERR internal error".into())))
        .collect()
}
Example pipeline:
SET key1 value1
SET key2 value2
GET key1
FLUSHDB
GET key3
Execution plan:
  1. Batch SET key1, SET key2, GET key1 (keyed segment)
  2. Group by shard, send batches
  3. Execute FLUSHDB as barrier (waits for batches to complete)
  4. Execute GET key3 (new keyed segment)
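The segmentation rule above (keyless commands act as barriers between keyed segments) can be sketched with simplified stand-in commands; the real dispatcher uses Command::key() and sends each segment through execute_shard_batch:

```rust
// Simplified stand-in commands for illustrating the barrier rule.
enum Cmd<'a> {
    Set(&'a str),
    Get(&'a str),
    FlushDb,
}

fn key_of<'a>(c: &Cmd<'a>) -> Option<&'a str> {
    match c {
        Cmd::Set(k) | Cmd::Get(k) => Some(*k),
        Cmd::FlushDb => None,
    }
}

fn main() {
    let pipeline = vec![
        Cmd::Set("key1"),
        Cmd::Set("key2"),
        Cmd::Get("key1"),
        Cmd::FlushDb,
        Cmd::Get("key3"),
    ];
    let mut segments: Vec<Vec<&str>> = vec![vec![]];
    let mut barriers = 0;
    for c in &pipeline {
        match key_of(c) {
            // Keyed command: accumulate into the current segment.
            Some(k) => segments.last_mut().unwrap().push(k),
            // Keyless command: acts as a barrier, closing the segment.
            None => {
                barriers += 1;
                segments.push(vec![]);
            }
        }
    }
    assert_eq!(barriers, 1);
    assert_eq!(segments[0], ["key1", "key2", "key1"]); // batch before FLUSHDB
    assert_eq!(segments[1], ["key3"]); // new segment after the barrier
    println!("{} segments, {} barrier(s)", segments.len(), barriers);
}
```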

Performance Characteristics

| Operation | Shard Hops | Latency Overhead |
| --- | --- | --- |
| GET (local) | 0 | ~0 ns |
| GET (foreign) | 1 | ~1-2 μs |
| MGET (10 keys, mixed shards) | N parallel | ~2-5 μs |
| DBSIZE (broadcast) | N parallel | ~5-10 μs |
| FLUSHDB (broadcast) | N parallel | ~10-20 μs |
Cross-shard operations execute in parallel: MGET with 10 keys across 8 shards takes ~5 μs, not 8 × 2 μs.

Re-sharding

Kora does not support online re-sharding. Changing shard count requires:
  1. Save RDB snapshot: BGSAVE
  2. Shutdown server
  3. Restart with new --workers N
  4. Restore from snapshot (keys re-distribute automatically)
Changing shard count invalidates the existing data layout. Always snapshot before changing --workers.
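The reason a snapshot is mandatory: with modulo placement, changing the shard count moves a large fraction of keys. A sketch with std's DefaultHasher in place of ahash (exact counts will differ, but the proportion holds for any well-mixed hash):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for Kora's shard_for_key (DefaultHasher instead of ahash).
fn shard_for_key(key: &[u8], shard_count: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % shard_count as u64) as usize
}

fn main() {
    let total = 10_000u32;
    let mut moved = 0;
    for i in 0..total {
        let key = format!("key:{}", i);
        if shard_for_key(key.as_bytes(), 8) != shard_for_key(key.as_bytes(), 16) {
            moved += 1;
        }
    }
    // Going from 8 to 16 workers, a key stays put only when hash % 16 < 8,
    // so roughly half of all keys land on a different shard.
    println!("{} of {} keys change shards going from 8 to 16 workers", moved, total);
    assert!(moved > 4_000 && moved < 6_000);
}
```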

Debugging Tools

DBSIZE per Shard

Use INFO to see per-shard key counts:
redis-cli INFO
# Shard 0: 12,543 keys
# Shard 1: 12,489 keys
# Shard 2: 12,601 keys
# Shard 3: 12,367 keys

Hot-Key Detection

Identify keys causing shard imbalance:
redis-cli STATS.HOTKEYS 10
1) "user:session:12345" (142,391 hits)
2) "leaderboard:global" (98,234 hits)

Next Steps

Threading Model

Learn how shard workers execute commands

Architecture

Understand the overall system design
