Kora provides a comprehensive persistence layer that combines write-ahead logging for crash recovery, point-in-time snapshots for fast restarts, and tiered storage with LZ4 compression for cold data.

Overview

From kora-storage/src/lib.rs:1-43:
Tiered persistence engine for Kōra. This crate provides the durable storage stack that sits beneath Kōra’s in-memory shard engine. It implements a two-tier hierarchy — hot RAM and cold LZ4-compressed disk — with write-ahead logging and point-in-time snapshots for crash recovery.

Architecture Layers

WAL

Write-Ahead Log: Append-only log with CRC-32C integrity. Every mutation is logged before acknowledgement.

RDB

Snapshots: Compact binary format for complete database images. Atomic writes with CRC verification.

Cold Tier

LZ4 Backend: Compressed storage for evicted data with an in-memory hash index.

Write-Ahead Log (WAL)

WAL Architecture

From kora-storage/src/wal.rs:1-17:
Wire format (per entry):
[len: u32][type: u8][payload...][crc32: u32]

- len: total byte length of type + payload (excludes len and crc fields)
- type: discriminant of WalEntry
- payload: entry-specific data
- crc32: CRC-32C of type + payload
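The framing above can be sketched in a few lines. Note that the bitwise CRC-32C implementation and little-endian field encoding here are illustrative assumptions, not the crate's actual code:

```rust
// Illustrative only: bitwise CRC-32C (Castagnoli) and little-endian
// field encoding are assumptions, not kora-storage's implementation.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 {
                (crc >> 1) ^ 0x82F6_3B78 // reflected CRC-32C polynomial
            } else {
                crc >> 1
            };
        }
    }
    !crc
}

/// Frame one entry as [len: u32][type: u8][payload...][crc32: u32].
fn frame_entry(entry_type: u8, payload: &[u8]) -> Vec<u8> {
    let len = (1 + payload.len()) as u32; // type + payload, per the format note
    let mut buf = Vec::with_capacity(4 + len as usize + 4);
    buf.extend_from_slice(&len.to_le_bytes());
    buf.push(entry_type);
    buf.extend_from_slice(payload);
    let crc = crc32c(&buf[4..]); // CRC-32C over type + payload only
    buf.extend_from_slice(&crc.to_le_bytes());
    buf
}
```

On replay, recomputing the CRC over `type + payload` and comparing it against the stored trailer is what lets corrupt tail entries be detected and skipped.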

Supported Operations

From kora-storage/src/wal.rs:36-89:
pub enum WalEntry {
    Set { key, value, ttl_ms },
    Del { key },
    Expire { key, ttl_ms },
    LPush { key, values },
    RPush { key, values },
    HSet { key, fields },
    SAdd { key, members },
    FlushDb,
}

Sync Policies

From kora-storage/src/wal.rs:24-33:
SyncPolicy::EveryWrite
Fsync after every write
  • Durability: Maximum (no data loss)
  • Performance: Slowest (~1000 writes/sec on SSD)
  • Use case: Financial transactions, critical data

SyncPolicy::EverySecond
Fsync at most once per second
  • Durability: Up to one second of writes may be lost on crash
  • Use case: User profiles and similar data that tolerate brief loss

SyncPolicy::OsManaged
Flushing left to the operating system
  • Durability: Weakest; unflushed data is lost on OS crash
  • Use case: Ephemeral data such as session caches
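The policy semantics can be sketched as a per-append decision. The enum below is a hypothetical mirror of the documented policy names, not the crate's actual `SyncPolicy` type:

```rust
use std::time::{Duration, Instant};

// Hypothetical mirror of the three documented policies; the real
// SyncPolicy lives in kora-storage/src/wal.rs.
enum SyncPolicy {
    EveryWrite,
    EverySecond,
    OsManaged,
}

struct WalWriter {
    policy: SyncPolicy,
    last_sync: Instant,
}

impl WalWriter {
    // Decide whether this append should be followed by an fsync.
    fn should_sync(&mut self) -> bool {
        match self.policy {
            SyncPolicy::EveryWrite => true,
            SyncPolicy::EverySecond => {
                if self.last_sync.elapsed() >= Duration::from_secs(1) {
                    self.last_sync = Instant::now();
                    true
                } else {
                    false
                }
            }
            // Leave flushing entirely to the OS page cache.
            SyncPolicy::OsManaged => false,
        }
    }
}
```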

WAL Operations

// Open or create WAL
let mut wal = WriteAheadLog::open("kora.wal", SyncPolicy::EverySecond)?;

// Append entries
wal.append(&WalEntry::Set {
    key: b"user:1000".to_vec(),
    value: b"Alice".to_vec(),
    ttl_ms: None,
})?;

// Manual sync (for EverySecond/OsManaged)
wal.sync()?;

// Replay on restart
WriteAheadLog::replay("kora.wal", |entry| {
    match entry {
        WalEntry::Set { key, value, ttl_ms } => {
            // Restore key-value
        }
        WalEntry::Del { key } => {
            // Remove key
        }
        // ...
    }
})?;

WAL Rotation

WAL files are rotated when they exceed a size threshold or after snapshots:
// Check WAL size
if wal.bytes_written() > max_wal_size {
    // Trigger snapshot
    create_snapshot()?;
    
    // Truncate WAL
    wal = WriteAheadLog::open("kora.wal", policy)?;
}
After a successful RDB snapshot, the WAL can be truncated since the snapshot contains all data up to that point.

RDB Snapshots

RDB Format

From kora-storage/src/rdb.rs:7-19:
File format:
[magic: 8 bytes "KORA_RDB"][version: u32][num_entries: u64]
[entry]*
[checksum: u32 CRC-32C of everything before this]

Each entry:
[key_len: u32][key_bytes][ttl_flag: u8][ttl_ms: u64 if flag=1]
[value_type: u8][value_data...]
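The per-entry layout can be sketched as a small encoder. Field order follows the format comment above; the little-endian encoding and the function name are assumptions for illustration:

```rust
// Sketch of the per-entry layout quoted above. Field order follows the
// format comment; little-endian encoding is an assumption on my part.
fn encode_entry(key: &[u8], ttl_ms: Option<u64>, value_type: u8, value_data: &[u8]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&(key.len() as u32).to_le_bytes()); // key_len
    buf.extend_from_slice(key);                               // key_bytes
    match ttl_ms {
        Some(ms) => {
            buf.push(1);                                      // ttl_flag = 1
            buf.extend_from_slice(&ms.to_le_bytes());         // ttl_ms
        }
        None => buf.push(0),                                  // ttl_flag = 0
    }
    buf.push(value_type);                                     // value_type
    buf.extend_from_slice(value_data);                        // value_data
    buf
}
```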

Supported Types

From kora-storage/src/rdb.rs:51-64:
pub enum RdbValue {
    String(Vec<u8>),              // Byte string
    Int(i64),                     // Integer
    List(Vec<Vec<u8>>),           // List of byte strings
    Set(Vec<Vec<u8>>),            // Set of byte strings
    Hash(Vec<(Vec<u8>, Vec<u8>)>), // Hash field→value pairs
}

Creating Snapshots

use kora_storage::rdb::{save, load, RdbEntry, RdbValue};

// Collect all entries
let entries: Vec<RdbEntry> = shard_store
    .entries_iter()
    .map(|(key, entry)| RdbEntry {
        key: key.as_bytes().to_vec(),
        value: match &entry.value {
            Value::HeapStr(s) => RdbValue::String(s.to_vec()),
            Value::Int(i) => RdbValue::Int(*i),
            Value::List(l) => RdbValue::List(l.clone()),
            // ...
        },
        ttl_ms: entry.ttl.map(|instant| /* compute remaining TTL */),
    })
    .collect();

// Save to disk (atomic write)
save("dump.rdb", &entries)?;

Loading Snapshots

// Load RDB file
let entries = load("dump.rdb")?;

// Restore to shard
for entry in entries {
    let value = match entry.value {
        RdbValue::String(s) => Value::from_raw_bytes(&s),
        RdbValue::Int(i) => Value::Int(i),
        // ...
    };
    
    let mut key_entry = KeyEntry::new(entry.key.clone(), value);
    if let Some(ttl_ms) = entry.ttl_ms {
        key_entry.set_ttl(Duration::from_millis(ttl_ms));
    }

    shard_store.insert_entry(entry.key, key_entry);
}

Atomic Writes

From kora-storage/src/rdb.rs:67-98:
pub fn save(path: impl AsRef<Path>, entries: &[RdbEntry]) -> Result<()> {
    let tmp_path = path.with_extension("rdb.tmp");
    
    // Write to temporary file
    let file = File::create(&tmp_path)?;
    let mut writer = BufWriter::new(file);
    
    // ... write data with CRC ...
    
    writer.flush()?;
    writer.get_ref().sync_all()?;  // Fsync
    
    // Atomic rename
    fs::rename(&tmp_path, path)?;
    Ok(())
}
The temp-file-then-rename pattern ensures:
  1. No partial snapshots: Crash during write leaves old snapshot intact
  2. No corruption: CRC verified before rename
  3. Atomic replacement: Rename is atomic on POSIX systems
This guarantees you always have a valid snapshot to recover from.
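The same pattern can be demonstrated standalone with nothing but the standard library; this is a minimal sketch, not the crate's actual `save()`:

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

// Standalone demonstration of the temp-file-then-rename pattern,
// using only std (not kora-storage's actual save()).
fn atomic_write(path: &Path, data: &[u8]) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut file = File::create(&tmp)?;
    file.write_all(data)?;
    file.sync_all()?; // make the temp file durable before it becomes visible
    fs::rename(&tmp, path)?; // atomic replacement on POSIX filesystems
    Ok(())
}
```

A crash at any point before the `rename` leaves the previous file untouched; readers never observe a half-written file.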

LZ4 Compression

From kora-storage/src/compressor.rs, cold-tier storage uses LZ4 compression:

Compression

use kora_storage::compressor::{compress, decompress};

// Compress data
let data = b"This is my value that will be compressed";
let compressed = compress(data)?;

// Compression ratio
let ratio = compressed.len() as f64 / data.len() as f64;
println!("Compressed to {:.1}% of original", ratio * 100.0);

// Decompress
let decompressed = decompress(&compressed)?;
assert_eq!(data, &decompressed[..]);

Benefits

Typical compression ratios for cache data:
  • JSON: 60-80% reduction
  • Text: 50-70% reduction
  • Numbers: 30-50% reduction
  • Binary: 10-30% reduction
LZ4 benchmarks:
  • Compression: ~400 MB/s per core
  • Decompression: ~2 GB/s per core
  • Minimal CPU overhead compared to I/O
Compression is applied per-value:
  • Small values (< 128 bytes): stored uncompressed
  • Large values: compressed automatically
  • Incompressible data: detected and stored raw
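The per-value decision can be sketched as follows. This shows the decision logic only: the real compressor is LZ4, injected here as a closure so the sketch stays self-contained; the names are illustrative, and `MIN_SIZE` mirrors the documented 128-byte threshold:

```rust
// Illustrative per-value storage decision; names are assumptions.
// MIN_SIZE mirrors the documented 128-byte threshold.
const MIN_SIZE: usize = 128;

enum Stored {
    Raw(Vec<u8>),
    Compressed(Vec<u8>),
}

fn store_value(raw: &[u8], compress: impl Fn(&[u8]) -> Vec<u8>) -> Stored {
    // Small values: not worth the header and CPU cost.
    if raw.len() < MIN_SIZE {
        return Stored::Raw(raw.to_vec());
    }
    let compressed = compress(raw);
    // Incompressible data: keep the raw bytes if compression saved nothing.
    if compressed.len() < raw.len() {
        Stored::Compressed(compressed)
    } else {
        Stored::Raw(raw.to_vec())
    }
}
```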

Storage Manager

From kora-storage/src/manager.rs, the StorageManager coordinates all persistence layers:
use kora_storage::manager::StorageManager;

let manager = StorageManager::new(
    "data_dir",
    SyncPolicy::EverySecond,
)?;

// Write path: WAL + memory
manager.log_set(key, value, ttl)?;

// Background snapshot
manager.snapshot()?;

// Load on restart
manager.restore(|entry| {
    // Restore entry to shard
})?;

Per-Shard Storage

From kora-storage/src/shard_storage.rs, each shard has isolated WAL and RDB:
data/
├── shard-0/
│   ├── kora.wal
│   └── dump.rdb
├── shard-1/
│   ├── kora.wal
│   └── dump.rdb
├── shard-2/
│   ├── kora.wal
│   └── dump.rdb
└── ...
Benefits:
  • Parallel I/O (no contention between shards)
  • Independent snapshots (per-shard recovery)
  • Easier debugging (inspect individual shard files)

Configuration

Server Config (TOML)

[storage]
data_dir = "./data"
sync_policy = "EverySecond"  # EveryWrite | EverySecond | OsManaged
max_wal_size = 67108864      # 64 MB
snapshot_interval = 3600     # 1 hour

[compression]
profile = "balanced"         # none | fast | balanced | high
min_size = 128              # Don't compress values < 128 bytes

Programmatic Config

use kora_storage::{
    manager::StorageManager,
    wal::SyncPolicy,
};

let manager = StorageManager::builder()
    .data_dir("./kora-data")
    .sync_policy(SyncPolicy::EverySecond)
    .max_wal_size(64 * 1024 * 1024)  // 64 MB
    .snapshot_interval(Duration::from_secs(3600))
    .build()?;

Recovery Flow

1. Load RDB Snapshot: If dump.rdb exists, load all entries into memory. This is the fastest path to recovery.
2. Replay WAL: Replay kora.wal to apply any mutations that occurred after the snapshot. Corrupt entries are skipped with warnings.
3. Start Accepting Traffic: Once both RDB and WAL are loaded, the shard is ready to serve requests.
4. Background Snapshot: Schedule the next snapshot based on snapshot_interval or max_wal_size.

Crash Recovery Example

fn recover_shard(shard_id: u16, data_dir: &Path) -> Result<ShardStore> {
    let mut store = ShardStore::new(shard_id);
    let rdb_path = data_dir.join(format!("shard-{}/dump.rdb", shard_id));
    let wal_path = data_dir.join(format!("shard-{}/kora.wal", shard_id));
    
    // 1. Load snapshot
    if rdb_path.exists() {
        let entries = rdb::load(&rdb_path)?;
        for entry in entries {
            store.insert_entry(/* ... */);
        }
    }
    
    // 2. Replay WAL
    if wal_path.exists() {
        WriteAheadLog::replay(&wal_path, |entry| {
            match entry {
                WalEntry::Set { key, value, ttl_ms } => {
                    store.set_bytes(&key, &value, /* ... */);
                }
                // Handle other operations
                _ => {}
            }
        })?;
    }
    
    Ok(store)
}

Performance Tuning

# Session cache (ephemeral)
sync_policy = "OsManaged"

# User profiles (can lose 1 sec)
sync_policy = "EverySecond"

# Financial data (zero loss)
sync_policy = "EveryWrite"

# Small WAL, frequent snapshots (lower recovery time)
max_wal_size = 10485760       # 10 MB
snapshot_interval = 300       # 5 minutes

# Large WAL, infrequent snapshots (lower I/O overhead)
max_wal_size = 1073741824     # 1 GB
snapshot_interval = 3600      # 1 hour

# Fast compression (low CPU, less savings)
[compression]
profile = "fast"

# Balanced (default)
[compression]
profile = "balanced"

# Maximum compression (high CPU, max savings)
[compression]
profile = "high"

On Linux, an io_uring backend can be selected where available:
#[cfg(target_os = "linux")]
use kora_storage::uring_backend::UringBackend;

// io_uring on Linux (if available)
let backend = UringBackend::new()?;

// Fallback to standard file I/O
let backend = FileBackend::new();

Monitoring

Key Metrics

// WAL metrics
let wal_size = wal.bytes_written();
let wal_entries = wal.entry_count();

// RDB metrics
let snapshot_size = fs::metadata("dump.rdb")?.len();
let snapshot_age = SystemTime::now().duration_since(snapshot_mtime)?;

// Compression metrics
let compression_ratio = compressed_bytes as f64 / original_bytes as f64;

Health Checks

1. WAL size within limit: Alert if WAL exceeds max_wal_size without snapshot rotation.
2. Recent snapshot exists: Alert if the last snapshot is older than 2× snapshot_interval.
3. CRC validation passes: Monitor WAL replay errors during startup.
4. Disk space available: Ensure data_dir has space for WAL + RDB + headroom.
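These four checks can be combined into a single predicate. The struct and field names below are illustrative, not part of the kora-storage API:

```rust
use std::time::Duration;

// Hypothetical snapshot of the metrics above; names are illustrative.
struct StorageHealth {
    wal_bytes: u64,
    max_wal_size: u64,
    snapshot_age: Duration,
    snapshot_interval: Duration,
    replay_errors: u64,
    free_disk_bytes: u64,
    needed_headroom: u64,
}

fn is_healthy(h: &StorageHealth) -> bool {
    h.wal_bytes <= h.max_wal_size                    // 1. WAL within limit
        && h.snapshot_age <= h.snapshot_interval * 2 // 2. recent snapshot
        && h.replay_errors == 0                      // 3. clean WAL replay
        && h.free_disk_bytes >= h.needed_headroom    // 4. disk headroom
}
```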

Next Steps

- Configuration: Complete storage configuration reference
- Operations: Backup, restore, and disaster recovery
- Benchmarks: Persistence performance metrics
- Architecture: Deep dive into storage internals
