
Overview

Instead of using a traditional database or in-memory queue, nrvna-ai uses the filesystem itself as the job queue. Job directories move between subdirectories to represent state changes, leveraging POSIX atomic rename semantics for thread-safe, lock-free operation.
The filesystem queue survives crashes, requires no database, and is trivially inspectable with standard Unix tools like ls and tree.

Design Principles

Directory = Job

Each job is represented as a directory containing:
job_1736700000_12345_0/
├── prompt.txt      ← input (always present)
├── result.txt      ← output (on success)
└── error.txt       ← error message (on failure)
All job state is self-contained in the directory. You can move jobs between workspaces by simply copying directories.

Location = State

A job’s location in the directory tree is the job’s state:
WORKSPACE/
├── input/
│   ├── writing/         ← STAGING: being created
│   └── ready/           ← QUEUED: waiting for worker
├── processing/          ← RUNNING: inference in progress  
├── output/              ← DONE: completed successfully
└── failed/              ← FAILED: error occurred
No need for:
  • Database with state column
  • In-memory state tracking
  • Distributed consensus
  • Lock files or semaphores
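Because location encodes state, every lifecycle transition is visible to plain shell commands. A sketch (the workspace path and job id are hypothetical):

```shell
# Create the workspace layout shown above
mkdir -p workspace/input/writing workspace/input/ready \
         workspace/processing workspace/output workspace/failed

# Stage a job (STAGING), then publish it with a single mv (QUEUED)
mkdir -p workspace/input/writing/job_123
echo "Summarize this file" > workspace/input/writing/job_123/prompt.txt
mv workspace/input/writing/job_123 workspace/input/ready/

# Inspect state with nothing but ls
ls workspace/input/ready/
```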

Atomic Rename = State Transition

POSIX guarantees that rename() is atomic - the directory is never in an intermediate state:
std::filesystem::rename(
    "workspace/input/ready/job_123",
    "workspace/processing/job_123"
);
Atomicity guarantees:
  • Either the rename succeeds completely
  • Or it fails completely and the job stays in original location
  • No partial state
  • No race conditions between threads
  • No locks needed
Atomic rename only works on the same filesystem. Don’t use network filesystems (NFS, SMB) or span the workspace across mount points.
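One way to guard against accidentally spanning mount points is to compare device IDs at startup. A sketch using POSIX stat(2) (the function name is an assumption, not part of nrvna-ai):

```cpp
#include <sys/stat.h>
#include <string>

// Returns true if both paths live on the same filesystem (same st_dev),
// which is required for rename() to remain atomic.
bool sameFilesystem(const std::string& a, const std::string& b) {
    struct stat sa{}, sb{};
    if (stat(a.c_str(), &sa) != 0 || stat(b.c_str(), &sb) != 0) return false;
    return sa.st_dev == sb.st_dev;
}
```

Running this check once when the workspace is opened fails fast instead of failing later with EXDEV on the first job claim.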

Queue Operations

Enqueue (Job Submission)

Step 1: Create in staging area

auto staging = workspace / "input/writing" / job_id;
std::filesystem::create_directories(staging);
The writing/ directory hides incomplete jobs from workers.
Step 2: Write job data

writeFile(staging / "prompt.txt", prompt);
Job is still invisible to the Scanner.
Step 3: Atomically publish

auto ready = workspace / "input/ready" / job_id;
std::filesystem::rename(staging, ready);
The atomic rename makes the job visible to Scanner in one operation. Workers will never see a half-written job.
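The three steps combine into a single submit helper. A sketch (std::ofstream stands in for the docs' writeFile; error handling is omitted):

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Sketch of an enqueue helper combining the three steps above.
void submitJob(const fs::path& workspace,
               const std::string& job_id,
               const std::string& prompt) {
    auto staging = workspace / "input/writing" / job_id;
    fs::create_directories(staging);                  // step 1: invisible to Scanner

    std::ofstream(staging / "prompt.txt") << prompt;  // step 2: write job data

    auto ready = workspace / "input/ready" / job_id;
    fs::rename(staging, ready);                       // step 3: atomic publish
}
```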

Dequeue (Job Processing)

Step 1: Scanner discovers job

for (auto& entry : fs::directory_iterator(workspace / "input/ready")) {
    auto job_id = entry.path().filename().string();
    pool->submit(job_id);
}
Scanner finds jobs by listing the ready/ directory once per second.
Step 2: Worker claims job

auto ready = workspace / "input/ready" / job_id;
auto processing = workspace / "processing" / job_id;

try {
    std::filesystem::rename(ready, processing);
} catch (const std::filesystem::filesystem_error&) {
    // Another worker already claimed this job
    return;
}
The atomic rename acts as a lock. Only one worker can successfully rename the directory.
Step 3: Process and finalize

// Read prompt
auto prompt = readFile(processing / "prompt.txt");

// Run inference
auto result = runner->run(prompt);

// Write result and move to output
writeFile(processing / "result.txt", result);
std::filesystem::rename(
    processing,
    workspace / "output" / job_id
);
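The failure path mirrors the success path: write error.txt (per the job layout above) and rename into failed/ instead of output/. A sketch (the helper name is an assumption):

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// On failure, record the error message and move the job to failed/.
void failJob(const fs::path& workspace,
             const std::string& job_id,
             const std::string& message) {
    auto processing = workspace / "processing" / job_id;
    std::ofstream(processing / "error.txt") << message;  // error.txt per the layout
    fs::rename(processing, workspace / "failed" / job_id);
}
```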

Concurrency Safety

Lock-Free Dequeue

Multiple workers can try to claim the same job simultaneously:
Thread 1: rename(ready/job_X, processing/job_X)  ← succeeds
Thread 2: rename(ready/job_X, processing/job_X)  ← fails (source doesn't exist)
Thread 3: rename(ready/job_X, processing/job_X)  ← fails (source doesn't exist)
Only one thread succeeds. The others get a filesystem_error exception and move on to the next job.
No mutexes, no spinlocks, no atomic variables needed. The filesystem provides the synchronization primitive.
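The race can be demonstrated directly: N threads attempt the same rename and exactly one wins. A sketch (the function is illustrative, not part of nrvna-ai; the non-throwing rename overload replaces try/catch):

```cpp
#include <atomic>
#include <filesystem>
#include <system_error>
#include <thread>
#include <vector>

namespace fs = std::filesystem;

// N threads race to claim one job directory; returns how many succeeded.
// With an atomic rename, this is always exactly 1.
int claimRace(const fs::path& ready_job, const fs::path& processing_job,
              int n_threads) {
    std::atomic<int> winners{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) {
        threads.emplace_back([&] {
            std::error_code ec;  // non-throwing overload: losers just set ec
            fs::rename(ready_job, processing_job, ec);
            if (!ec) winners.fetch_add(1);
        });
    }
    for (auto& t : threads) t.join();
    return winners.load();
}
```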

Race Condition Prevention

Problem: What if Scanner submits a job to Pool while a worker is already processing it?
Solution: Workers attempt an atomic rename from ready/ to processing/. If the job is already gone, the rename fails harmlessly.
void Processor::process(const JobId& job_id) {
    auto ready = workspace_ / "input/ready" / job_id;
    auto processing = workspace_ / "processing" / job_id;
    
    try {
        // This is the critical section - only one thread succeeds
        std::filesystem::rename(ready, processing);
    } catch (const std::filesystem::filesystem_error& e) {
        // Job already claimed by another worker, or doesn't exist
        LOG_DEBUG("Job already claimed: " + job_id);
        return;
    }
    
    // We own the job now, proceed with processing...
}

Scanner-Worker Coordination

No explicit coordination needed:
Time    Scanner Thread              Worker Thread
----    --------------              -------------
T0      scan ready/
T1      found: job_A, job_B
T2      submit job_A to pool
T3      submit job_B to pool        rename job_A ready→processing ✓
T4      scan ready/                 processing job_A...
T5      found: job_B
T6      submit job_B to pool        rename job_A processing→output ✓
T7      scan ready/
T8      submit job_B to pool        rename job_B ready→processing ✓
Notice:
  • Scanner may submit the same job multiple times (T3, T6, and T8 for job_B)
  • Only one claim attempt succeeds (T8); duplicate attempts fail harmlessly
  • No locks, no coordination protocol needed

Crash Resistance

Jobs Survive Crashes

Because jobs are stored on disk:
# Server crashes mid-processing
$ ls workspace/processing/
job_1736700000_12345_0/
job_1736700001_12345_1/

# Restart server
$ nrvnad model.gguf workspace
[INFO] Recovering orphaned jobs...
[WARN] Recovered orphaned job: job_1736700000_12345_0
[WARN] Recovered orphaned job: job_1736700001_12345_1
[INFO] Server started

# Orphaned jobs moved back to ready queue
$ ls workspace/input/ready/
job_1736700000_12345_0/
job_1736700001_12345_1/
The server’s recoverOrphanedJobs() function moves jobs from processing/ back to input/ready/ on startup.
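A minimal version of that recovery pass might look like the following sketch (the real implementation may differ, e.g. by logging a warning per job as shown in the transcript above):

```cpp
#include <cstddef>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Move every job left in processing/ back to input/ready/ at startup.
// Paths are collected first so the iterator never observes its own renames.
std::size_t recoverOrphanedJobs(const fs::path& workspace) {
    std::vector<fs::path> orphans;
    for (auto& entry : fs::directory_iterator(workspace / "processing")) {
        orphans.push_back(entry.path());
    }
    for (auto& orphan : orphans) {
        fs::rename(orphan, workspace / "input/ready" / orphan.filename());
    }
    return orphans.size();
}
```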

No Data Loss

Even in the worst case:
  • Jobs in writing/ - Client can retry submission
  • Jobs in ready/ - Will be processed normally
  • Jobs in processing/ - Recovered and reprocessed
  • Jobs in output/ - Results already available
  • Jobs in failed/ - Errors already recorded

Checkpoint-Free Recovery

Recovered jobs are reprocessed from scratch. Partial inference progress is not saved. For long-running jobs, consider implementing streaming results.

Queue Metrics

Queue Depth

Easily measurable by counting directories:
# Jobs waiting
$ ls workspace/input/ready/ | wc -l
42

# Jobs processing
$ ls workspace/processing/ | wc -l
4

# Jobs completed
$ ls workspace/output/ | wc -l
1337
std::size_t queueDepth() const {
    std::size_t count = 0;
    for (auto& _ : fs::directory_iterator(workspace_ / "input/ready")) {
        ++count;
    }
    return count;
}
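The same counting pattern extends to every state directory. A sketch (the QueueStats aggregate and collectStats name are assumptions):

```cpp
#include <cstddef>
#include <filesystem>

namespace fs = std::filesystem;

struct QueueStats {  // hypothetical aggregate: one count per job state
    std::size_t ready = 0, processing = 0, done = 0, failed = 0;
};

QueueStats collectStats(const fs::path& workspace) {
    auto count = [](const fs::path& dir) {
        std::size_t n = 0;
        for (auto& e : fs::directory_iterator(dir)) { (void)e; ++n; }
        return n;
    };
    return QueueStats{
        count(workspace / "input/ready"),
        count(workspace / "processing"),
        count(workspace / "output"),
        count(workspace / "failed"),
    };
}
```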

Throughput

Measure by monitoring output directory:
# Jobs completed in last hour
$ find workspace/output/ -mindepth 1 -maxdepth 1 -type d -mmin -60 | wc -l
156

# Average: 2.6 jobs/minute

Comparison to Traditional Queues

Feature          nrvna (Filesystem)   Redis Queue          RabbitMQ         Database Queue
Setup            Create directories   Redis server         Broker setup     Schema + migrations
Crash recovery   Automatic            Persistence config   Durable queues   Transaction logs
Inspection       ls, cat              Redis CLI            Management UI    SQL queries
Atomicity        POSIX rename         Redis transactions   AMQP confirm     DB transactions
Distribution     No                   Yes                  Yes              Yes
Performance      Good (local FS)      Excellent            Good             Depends
Dependencies     None                 Redis                Broker           Database
Filesystem queues are perfect for local, single-node systems. For distributed systems, use traditional message queues.

Performance Characteristics

Scalability Limits

Directory entries: Most filesystems handle thousands of entries efficiently:
  • ext4: ~10 million files per directory
  • APFS: billions of files
  • XFS: billions of files
Practical limits:
  • 1-10K queued jobs: Excellent performance
  • 10K-100K queued jobs: Good performance
  • 100K+ queued jobs: Consider sharding or traditional queue

Scanner Performance

Directory listing is O(n) where n = number of queued jobs:
// Scanner runs every 1 second
void Server::scanLoop() {
    while (!shutdown_.load()) {
        auto jobs = scanner_->scan();  // O(n) directory listing
        for (const auto& job_id : jobs) {
            pool_->submit(job_id);
        }
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}
Optimization: For high-volume scenarios, consider:
  • Sharding by job ID prefix
  • Using filesystem notification APIs (inotify, FSEvents)
  • Reducing scan interval when queue is empty
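On Linux, the polling loop can be replaced with inotify: because jobs are published via rename, an IN_MOVED_TO event fires the moment a job lands in ready/. A sketch (event reading and dispatch are omitted; the helper name is an assumption):

```cpp
#include <sys/inotify.h>
#include <unistd.h>
#include <string>

// Watch ready/ for jobs published via atomic rename from writing/.
// Returns the inotify fd (poll/read it for events), or -1 on error.
int watchReadyDir(const std::string& ready_dir) {
    int fd = inotify_init1(IN_NONBLOCK);
    if (fd < 0) return -1;
    if (inotify_add_watch(fd, ready_dir.c_str(), IN_MOVED_TO) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

This removes the up-to-one-second discovery latency, at the cost of Linux-only code; macOS would need FSEvents instead.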

Filesystem Recommendations

Best: Local SSD

NVMe or SATA SSD on local machine. Fast atomic operations.

Good: Local HDD

Traditional spinning disk. Slower but still atomic.

Avoid: Network FS

NFS, SMB, etc. May not guarantee atomic renames.

Avoid: Distributed FS

GlusterFS, CephFS. Atomic semantics vary.

Directory Structure Best Practices

Cleanup Strategy

The queue grows indefinitely in output/ and failed/ directories:
# Manual cleanup of old jobs
find workspace/output/ -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +
find workspace/failed/ -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +
Implement a cleanup cron job or background thread to prevent unbounded disk growth.
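The same retention policy can run in-process as a background pass. A sketch (the function name and any retention window are assumptions):

```cpp
#include <chrono>
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Delete job directories older than max_age, judged by directory mtime.
// Paths are collected first so removal never races the iterator.
void cleanupOldJobs(const fs::path& dir, std::chrono::hours max_age) {
    auto now = fs::file_time_type::clock::now();
    std::vector<fs::path> stale;
    for (auto& entry : fs::directory_iterator(dir)) {
        if (entry.is_directory() && now - fs::last_write_time(entry) > max_age) {
            stale.push_back(entry.path());
        }
    }
    for (auto& p : stale) fs::remove_all(p);
}
```

Calling this periodically on output/ and failed/ (e.g. with a 7-day window) keeps disk usage bounded without an external cron job.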

Archival

For long-term storage:
# Archive completed jobs
tar -czf archive_$(date +%Y%m%d).tar.gz workspace/output/
rm -rf workspace/output/*

Monitoring Disk Usage

# Check workspace size
du -sh workspace/

# Breakdown by directory
du -sh workspace/*/

Implementation Details

Directory Creation

bool Server::createWorkspace() noexcept {
    namespace fs = std::filesystem;
    
    try {
        fs::create_directories(workspace_ / "input/writing");
        fs::create_directories(workspace_ / "input/ready");
        fs::create_directories(workspace_ / "processing");
        fs::create_directories(workspace_ / "output");
        fs::create_directories(workspace_ / "failed");
        return true;
    } catch (const fs::filesystem_error& e) {
        LOG_ERROR("Failed to create workspace: " + std::string(e.what()));
        return false;
    }
}

Job Directory Validation

bool isValidJobDirectory(const fs::path& job_dir) {
    // Must be a directory
    if (!fs::is_directory(job_dir)) return false;
    
    // Must contain prompt.txt
    if (!fs::exists(job_dir / "prompt.txt")) return false;
    
    // Prompt must be non-empty
    auto size = fs::file_size(job_dir / "prompt.txt");
    if (size == 0) return false;
    
    return true;
}

See Also

Job Lifecycle

State machine and transitions

Threading Model

Concurrency and worker threads

Architecture

Overall system design

Work API

Job submission API
