
System Architecture

Jolt is structured as a multi-crate Rust workspace, with each crate handling a specific aspect of the proving system.

Crate Structure

jolt-core

Core proving system containing:
  • host/: Guest ELF compilation and program analysis (feature-gated behind host)
  • zkvm/: Jolt PIOP (Polynomial Interactive Oracle Proof) - prover, verifier, R1CS/Spartan, memory checking, instruction lookups
  • poly/: Polynomial types, commitment schemes (Dory, HyperKZG), opening proofs
  • field/: JoltField trait and BN254 scalar field implementation
  • subprotocols/: Sumcheck (batched, streaming, univariate skip), booleanity checks, BlindFold ZK protocol
  • msm/: Multi-scalar multiplication
  • transcripts/: Fiat-Shamir transcripts (Blake2b, Keccak)

tracer

RISC-V emulator that executes guest programs and produces execution traces. Each instruction generates a Cycle struct containing:
  • Program counter (PC)
  • Instruction opcode and flags
  • Register read/write operations
  • Memory read/write operations
  • Intermediate computation values

jolt-sdk

Provides the #[jolt::provable] macro that automatically generates:
  • prove() - Generate a proof for the function
  • verify() - Verify a proof
  • analyze() - Analyze execution without proving (for debugging)
  • preprocess() - Generate proving/verifying keys

jolt-inlines

Optimized cryptographic primitives that replace guest-side computation with efficient constraint-native implementations:
  • SHA-2, Blake3, Keccak
  • BigInt arithmetic
  • secp256k1 elliptic curve operations

common

Shared constants and types:
  • XLEN, REGISTER_COUNT - Architecture parameters
  • JoltDevice - I/O device abstraction
  • MemoryLayout - Memory map configuration

Type Parameters

Most core types are generic over three parameters:
F: JoltField                       // Scalar field (BN254 Fr)
PCS: CommitmentScheme<Field = F>   // Polynomial commitment (DoryCommitmentScheme)
ProofTranscript: Transcript        // Fiat-Shamir transcript (Blake2bTranscript)
This abstraction allows swapping commitment schemes or field implementations if needed.
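To make the generic plumbing concrete, here is a minimal sketch of how code written against these abstractions stays agnostic to the concrete scheme. The trait and type names mirror the ones above, but the bodies are illustrative simplifications: a toy prime field (integers mod 2³¹−1) stands in for BN254 Fr, and an insecure "sum of evaluations" commitment stands in for Dory.

```rust
// Toy field: integers mod a small prime, standing in for BN254 Fr.
#[derive(Copy, Clone, PartialEq, Debug)]
pub struct Fp(pub u64);
const P: u64 = 2_147_483_647; // 2^31 - 1

pub trait JoltField: Copy {
    fn add(self, rhs: Self) -> Self;
    fn mul(self, rhs: Self) -> Self;
}

impl JoltField for Fp {
    fn add(self, rhs: Self) -> Self {
        Fp((self.0 + rhs.0) % P)
    }
    fn mul(self, rhs: Self) -> Self {
        Fp((self.0 as u128 * rhs.0 as u128 % P as u128) as u64)
    }
}

pub trait CommitmentScheme {
    type Field: JoltField;
    type Commitment: PartialEq + std::fmt::Debug;
    fn commit(evals: &[Self::Field]) -> Self::Commitment;
}

// A trivially insecure "commitment" (sum of evaluations), used only to
// show how the generic bounds fit together.
pub struct ToyScheme;
impl CommitmentScheme for ToyScheme {
    type Field = Fp;
    type Commitment = Fp;
    fn commit(evals: &[Fp]) -> Fp {
        evals.iter().fold(Fp(0), |acc, &e| acc.add(e))
    }
}

// Code written against the abstraction works with any conforming scheme:
// swapping ToyScheme for another CommitmentScheme changes nothing here.
pub fn commit_witness<PCS: CommitmentScheme>(evals: &[PCS::Field]) -> PCS::Commitment {
    PCS::commit(evals)
}
```

Swapping the commitment scheme then amounts to changing one type argument at the call site.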

The 7-Stage Prover Pipeline

Jolt’s prover operates in seven sequential stages. Each stage builds on the previous one, progressively reducing the proof burden until only polynomial openings remain.

Stage 1: Trace Execution

Input: Guest program ELF binary
Process:
  1. Load the RISC-V ELF into the tracer emulator
  2. Execute every instruction, recording full CPU state
  3. Capture all memory and register accesses
  4. Collect program inputs/outputs via JoltDevice
Output:
  • Vec<Cycle> - Complete execution trace
  • JoltDevice - I/O state
Key insight: The tracer is a simple RISC-V emulator, not a constraint system. It just records what happened.
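The record-what-happened loop can be sketched as below. This is a deliberately tiny emulator with a hypothetical two-instruction ISA; the real tracer's Cycle carries far more (flags, memory operations, intermediate values), but the control flow is the same: execute, record, repeat.

```rust
// Stripped-down Cycle record (illustrative fields only).
#[derive(Debug, Clone, PartialEq)]
pub struct Cycle {
    pub pc: u64,
    pub opcode: u8,
    pub rd_write: Option<(u8, u64)>, // (register index, value written)
}

// Toy "ISA": opcode 0 increments register x1; anything else halts.
// The emulator is not a constraint system: it just runs the program
// and records one Cycle per executed instruction.
pub fn trace(program: &[u8]) -> Vec<Cycle> {
    let mut regs = [0u64; 32];
    let mut pc = 0u64;
    let mut out = Vec::new();
    loop {
        let opcode = program[pc as usize];
        let mut cycle = Cycle { pc, opcode, rd_write: None };
        let halt = match opcode {
            0 => {
                regs[1] += 1;
                cycle.rd_write = Some((1, regs[1]));
                pc += 1;
                false
            }
            _ => true,
        };
        out.push(cycle);
        if halt {
            break;
        }
    }
    out
}
```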

Stage 2: Witness Generation

Input: Execution trace from Stage 1
Process:
  1. Transform trace into polynomial representations
  2. Generate committed witness polynomials:
    • RdInc - Register destination increments
    • RamInc - RAM address increments
    • InstructionRa(d) - Instruction lookup indices
    • BytecodeRa(d) - Bytecode lookup indices
    • RamRa(d) - RAM lookup indices
    • TrustedAdvice - Public auxiliary data
    • UntrustedAdvice - Private auxiliary data
  3. Prepare virtual polynomials (computed on-demand during proving):
    • PC, register values, RAM values
    • Instruction flags
    • Lookup operands and outputs
Output: Witness polynomials ready for commitment
Memory optimization: CompactPolynomial stores small integers (u8-i128) instead of full field elements, reducing memory usage by 8-32x.
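The CompactPolynomial idea can be sketched as follows: keep witness values in their natural small-integer form and lift them to field elements only when a field operation actually needs them. The struct below is a hypothetical simplification (u8 coefficients, a toy u64 field), not Jolt's actual API.

```rust
const P: u64 = 2_147_483_647; // toy prime field modulus

// One byte per entry, versus 32 bytes for a BN254 Fr element.
pub struct CompactPolynomial {
    pub coeffs: Vec<u8>,
}

impl CompactPolynomial {
    // Lift a stored small integer into the field on demand.
    pub fn field_element(&self, i: usize) -> u64 {
        self.coeffs[i] as u64 % P
    }

    // Footprint ratio versus storing 32-byte field elements densely.
    pub fn savings_factor(&self) -> usize {
        32 / std::mem::size_of::<u8>()
    }
}
```

Storing u8 values instead of 32-byte field elements is where the 32x figure at the top of this range comes from; wider integer types (u16, u32, ...) give proportionally smaller savings.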

Stage 3: Streaming Commitment

Input: Witness polynomials from Stage 2
Process:
  1. Split large polynomials into chunks (tier-1)
  2. Commit to each chunk using Dory polynomial commitments
  3. Aggregate chunk commitments (tier-2)
  4. Generate final commitment for each witness polynomial
Output:
  • Polynomial commitments sent to verifier
  • Commitment metadata cached for opening proofs
Why streaming? Large traces (millions of cycles) produce polynomials too big to fit in memory as dense field elements. Streaming commits to chunks without materializing the full polynomial.
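The chunk-then-aggregate control flow can be sketched with a toy additively homomorphic "commitment" (sum mod p). Dory's real tiers use elliptic-curve MSMs, so this shows only the streaming structure: tier-1 commits one chunk at a time, and tier-2 aggregates, so at most one chunk of the polynomial is ever materialized.

```rust
const P: u64 = 2_147_483_647; // toy prime field modulus

// Tier-1: commit a single chunk (toy: sum of entries mod p).
fn commit_chunk(chunk: &[u64]) -> u64 {
    chunk.iter().fold(0, |acc, &x| (acc + x) % P)
}

// Stream the polynomial's evaluations through fixed-size chunks, committing
// each chunk as it fills, then aggregate the chunk commitments (tier-2).
pub fn streaming_commit(stream: impl Iterator<Item = u64>, chunk_size: usize) -> u64 {
    let mut tier1 = Vec::new();
    let mut chunk = Vec::with_capacity(chunk_size);
    for x in stream {
        chunk.push(x);
        if chunk.len() == chunk_size {
            tier1.push(commit_chunk(&chunk)); // never hold more than one chunk
            chunk.clear();
        }
    }
    if !chunk.is_empty() {
        tier1.push(commit_chunk(&chunk));
    }
    tier1.iter().fold(0, |acc, &c| (acc + c) % P) // tier-2 aggregation
}
```

The memory high-water mark is one chunk plus the (much smaller) list of tier-1 commitments, regardless of total trace length.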

Stage 4: Spartan R1CS

Input: Committed witness polynomials
Process:
  1. Build R1CS (Rank-1 Constraint System) encoding instruction constraints
  2. Run Spartan protocol to prove R1CS satisfaction:
    • Outer sumcheck: Reduces R1CS check to product sumcheck
    • Product virtual sumcheck: Proves Az ∘ Bz = Cz
    • Univariate skip: Optimized sumcheck for sparse polynomials
  3. Apply shift arguments for temporal constraints
  4. Prove instruction input constraints
Output: Spartan proof of instruction correctness
Key insight: R1CS encodes the instruction semantics - "if this opcode is set, these register/memory values must satisfy this relationship."
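The relation Spartan proves can be checked directly on a small example. An R1CS instance (A, B, C) is satisfied by a witness vector z exactly when (A·z) ∘ (B·z) = C·z entrywise; the sumchecks above let the verifier be convinced of this without recomputing the matrix products. Toy prime field, dense matrices for clarity (real instances are sparse).

```rust
const P: u64 = 2_147_483_647; // toy prime field modulus

// Dense matrix-vector product over the toy field.
fn matvec(m: &[Vec<u64>], z: &[u64]) -> Vec<u64> {
    m.iter()
        .map(|row| {
            row.iter()
                .zip(z)
                .fold(0u64, |acc, (&a, &zi)| (acc + a * zi % P) % P)
        })
        .collect()
}

// R1CS satisfaction: Az ∘ Bz = Cz, checked entry by entry.
pub fn r1cs_satisfied(a: &[Vec<u64>], b: &[Vec<u64>], c: &[Vec<u64>], z: &[u64]) -> bool {
    let (az, bz, cz) = (matvec(a, z), matvec(b, z), matvec(c, z));
    az.iter()
        .zip(&bz)
        .zip(&cz)
        .all(|((&x, &y), &w)| x * y % P == w)
}
```

For example, the single constraint x·x = y with z = (1, x, y) uses one row each of A, B, C selecting x, x, and y respectively.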

Stage 5: Sumcheck Rounds

Input: Spartan proof from Stage 4
Process: Run batched sumcheck protocols for:
  1. Instruction lookups (RA virtual sumcheck)
    • Prove each instruction’s operands/results match lookup table entries
  2. Bytecode checking
    • Prove PC values correspond to correct instruction opcodes
  3. RAM read-write checking
    • Prove all RAM reads return previously written values
    • Verify initialization values
    • Check final memory state
  4. Register read-write checking
    • Prove register reads return correct values
    • Verify register updates
  5. Hamming booleanity
    • Prove Hamming weight bits are actually 0 or 1
  6. Claim reductions
    • Reduce all sumcheck claims to polynomial openings
Output: Reduced claims (polynomial opening requests)
Batching: Multiple sumcheck instances run together to share randomness and reduce proof size.
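The round structure shared by all of these sumchecks can be sketched end to end for the simplest case: a multilinear polynomial given by its evaluation table over the boolean hypercube, so each round's univariate polynomial has degree 1 and is determined by its values at 0 and 1. Toy prime field; the challenges would really come from the Fiat-Shamir transcript.

```rust
const P: u64 = 2_147_483_647; // toy prime field modulus

// Fix the first variable to r: e'(x) = (1-r)*e(0,x) + r*e(1,x).
fn bind(evals: &[u64], r: u64) -> Vec<u64> {
    let half = evals.len() / 2;
    (0..half)
        .map(|i| ((1 + P - r) % P * evals[i] % P + r * evals[half + i] % P) % P)
        .collect()
}

// Run the full protocol: each round the prover sends g_j(0) and g_j(1),
// the verifier checks g_j(0) + g_j(1) equals the running claim, then both
// sides reduce the claim to g_j(r) for the round challenge r.
pub fn sumcheck(mut evals: Vec<u64>, challenges: &[u64]) -> bool {
    let mut claim: u64 = evals.iter().fold(0, |a, &e| (a + e) % P);
    for &r in challenges {
        let half = evals.len() / 2;
        let g0: u64 = evals[..half].iter().fold(0, |a, &e| (a + e) % P);
        let g1: u64 = evals[half..].iter().fold(0, |a, &e| (a + e) % P);
        if (g0 + g1) % P != claim {
            return false; // verifier's round check fails
        }
        claim = ((1 + P - r) % P * g0 % P + r * g1 % P) % P;
        evals = bind(&evals, r); // prover binds the variable to the challenge
    }
    // What remains is exactly an opening claim: the polynomial's value
    // at the random point must equal the final reduced claim.
    claim == evals[0]
}
```

Note where the protocol ends: not with a verified sum, but with a single evaluation claim at a random point - which is precisely what Stage 6's opening proofs discharge.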

Stage 6: Opening Proofs

Input: Opening claims from Stage 5
Process:
  1. Collect all polynomial opening requests into ProverOpeningAccumulator
  2. Batch openings for the same polynomial at different points
  3. Generate Dory opening proofs proving:
    • Commitment C commits to polynomial P
    • P evaluates to y at point x
  4. In ZK mode: Use committed evaluations (y_com) instead of revealing y
Output: Batched opening proofs
Optimization: Opening the same polynomial at multiple points shares most of the proof cost.
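The accumulation step can be sketched as a map from polynomial to its pending (point, evaluation) claims; one batched proof then covers all points for a given polynomial. This is a hypothetical simplification of ProverOpeningAccumulator, which additionally combines claims via a random linear combination.

```rust
use std::collections::HashMap;

#[derive(Default)]
pub struct OpeningAccumulator {
    // polynomial name -> list of (opening point, claimed evaluation)
    claims: HashMap<&'static str, Vec<(Vec<u64>, u64)>>,
}

impl OpeningAccumulator {
    // Record an opening request produced by a sumcheck.
    pub fn append(&mut self, poly: &'static str, point: Vec<u64>, eval: u64) {
        self.claims.entry(poly).or_default().push((point, eval));
    }

    // One batched proof per distinct polynomial, however many points
    // each polynomial is opened at.
    pub fn num_batches(&self) -> usize {
        self.claims.len()
    }
}
```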

Stage 7: BlindFold Zero-Knowledge (Optional)

Input: All sumcheck proofs from previous stages
Process (only in ZK mode):
  1. Build VerifierR1CS encoding all sumcheck verification checks
  2. Construct witness assignment from sumcheck round polynomials
  3. Nova folding:
    • Fold real R1CS instance with random satisfying instance
    • Hides witness values via randomization
  4. Spartan proof over folded R1CS:
    • Outer sumcheck proves relaxed R1CS satisfaction
    • Inner sumcheck reduces to single witness evaluation
  5. Hyrax-style openings:
    • Prove W(ry) and E(rx) against folded row commitments
Output: BlindFoldProof (replaces cleartext opening claims)
Why BlindFold? Standard mode reveals sumcheck round polynomials, leaking witness information. BlindFold proves the verifier would accept without revealing these values.
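The hiding effect of step 3 comes down to one line of arithmetic: folding the real witness with a random satisfying witness under a random challenge r produces a masked vector from which the original values cannot be read off. This sketch shows only that folding operation on witness vectors, over the toy field; the real protocol also folds the public instance and tracks an error term E.

```rust
const P: u64 = 2_147_483_647; // toy prime field modulus

// Fold two witness vectors: folded_i = real_i + r * random_i (mod p).
// Because `random` is uniformly random and satisfies the same relaxed
// R1CS, the folded vector masks the real witness values.
pub fn fold(real: &[u64], random: &[u64], r: u64) -> Vec<u64> {
    real.iter()
        .zip(random)
        .map(|(&w, &m)| (w + r * m % P) % P)
        .collect()
}
```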

Data Flow

The pipeline forms a linear data flow: the tracer's execution trace becomes witness polynomials, which are committed and then consumed by Spartan and the batched sumchecks; their reduced claims are discharged by opening proofs, wrapped by BlindFold in ZK mode.

Key Invariants

Polynomial Consistency

All parties must agree on polynomial encodings:
  • Trace index i maps to evaluation point (binary representation of i)
  • Multilinear extension is unique for a given basis
  • Commitment bindings are deterministic
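The first two bullets can be made concrete with a small multilinear-extension evaluator: trace index i maps to the boolean point given by the bits of i (here, low bit first - the bit-ordering convention is an assumption of this sketch), and the MLE agrees with the table at every such point. Toy prime field; real evaluation uses the far faster bind-based algorithm.

```rust
const P: u64 = 2_147_483_647; // toy prime field modulus

// Evaluate the multilinear extension of `table` at `point` via
// MLE(x) = sum_i eq(bits(i), x) * table[i], where
// eq(b, x) = prod_j (b_j*x_j + (1-b_j)*(1-x_j)).
pub fn mle_eval(table: &[u64], point: &[u64]) -> u64 {
    table.iter().enumerate().fold(0, |acc, (i, &v)| {
        let eq = point.iter().enumerate().fold(1u64, |e, (j, &x)| {
            let bit = (i >> j) & 1; // convention: bit j of i is variable j
            let term = if bit == 1 { x } else { (1 + P - x) % P };
            e * term % P
        });
        (acc + eq * v % P) % P
    })
}
```

Because the multilinear extension through these points is unique, any party that agrees on the index-to-point convention reconstructs the same polynomial - which is what makes the commitments comparable at all.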

Sumcheck Claim Synchronization

Every sumcheck instance must maintain synchronized claim computation and constraints:
  • input_claim() - Computes claim value from openings
  • input_claim_constraint() - Encodes same formula for BlindFold R1CS
  • output_claim_constraint() - Encodes output claim check
Mismatch between these causes BlindFold R1CS unsatisfiability.

Transcript Consistency

Prover and verifier must maintain identical Fiat-Shamir transcript state:
  • Same message ordering
  • Same challenge derivation
  • In ZK mode: Opening claims excluded from transcript
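The synchronization requirement boils down to: same appends in the same order produce the same challenges. A sketch of the absorb/squeeze pattern, using std's DefaultHasher purely as a placeholder (it is NOT cryptographic; the real transcripts use Blake2b or Keccak):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

pub struct Transcript {
    state: u64, // running digest of everything appended so far
}

impl Transcript {
    pub fn new() -> Self {
        Transcript { state: 0 }
    }

    // Absorb a prover message into the transcript state.
    pub fn append(&mut self, msg: &[u8]) {
        let mut h = DefaultHasher::new();
        self.state.hash(&mut h);
        msg.hash(&mut h);
        self.state = h.finish();
    }

    // Derive a challenge from (and fold it back into) the state.
    pub fn challenge(&mut self) -> u64 {
        let mut h = DefaultHasher::new();
        self.state.hash(&mut h);
        self.state = h.finish();
        self.state
    }
}
```

Any divergence - a skipped message, a reordered append, an opening claim absorbed on one side but not the other - yields different challenges and a proof that fails to verify.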

Performance Characteristics

Hot Paths

Proving time is dominated by:
  1. Sumcheck inner loop (60-70% of time)
    • Polynomial bind operations
    • Evaluation at random points
    • Equality polynomial evaluation
  2. Polynomial commitments (20-30% of time)
    • Multi-scalar multiplication
    • Dory tier aggregation
  3. Memory checking (5-10% of time)
    • RAM/register read-write verification

Scaling

| Component       | Prover Time | Proof Size  | Verifier Time |
|-----------------|-------------|-------------|---------------|
| Trace execution | O(T)        | -           | -             |
| Commitments     | O(T)        | O(log T)    | O(log T)      |
| Sumcheck        | O(T log T)  | O(log² T)   | O(log² T)     |
| Opening proofs  | O(N log T)  | O(N log T)  | O(N log T)    |
| BlindFold       | O(S²)       | O(S log S)  | O(S log S)    |
Where:
  • T = trace length (number of cycles)
  • N = number of distinct polynomials
  • S = number of sumcheck rounds (≈ log T per instance)

Memory Usage

Memory optimization strategies:
  • CompactPolynomial: Store small integers instead of field elements
  • SharedRaPolynomials: Share equality tables across multiple polynomials
  • Streaming commitment: Process chunks without materializing full dense polynomial
  • Lazy materialization: RaPolynomial computes values on-demand

Feature Flags

Feature Hierarchy

host ⊃ prover ⊃ minimal
  • minimal: Core types, verifier only
  • prover: Adds proving capability
  • host: Adds ELF compilation, trace analysis (requires standard library)
Orthogonal flags (not part of the hierarchy):
  • zk: Enables BlindFold zero-knowledge mode
  • allocative: Memory profiling via the allocative crate
  • monitor: CPU/memory monitoring during profiling
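The hierarchy above corresponds to chained Cargo feature dependencies. A hypothetical manifest sketch (not Jolt's actual Cargo.toml):

```toml
[features]
minimal = []         # core types, verifier only
prover = ["minimal"] # proving capability implies the minimal set
host = ["prover"]    # ELF compilation and analysis imply proving
zk = []              # orthogonal: BlindFold zero-knowledge mode
```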

Compilation Modes

# Standard mode (faster, reveals witness)
cargo build --features host

# ZK mode (slower, hides witness)
cargo build --features host,zk
Both modes must pass all tests for a valid implementation.
