System Architecture
Jolt is structured as a multi-crate Rust workspace, with each crate handling a specific aspect of the proving system.Crate Structure
jolt-core
Core proving system containing:host/: Guest ELF compilation and program analysis (feature-gated behindhost)zkvm/: Jolt PIOP (Polynomial Interactive Oracle Proof) - prover, verifier, R1CS/Spartan, memory checking, instruction lookupspoly/: Polynomial types, commitment schemes (Dory, HyperKZG), opening proofsfield/:JoltFieldtrait and BN254 scalar field implementationsubprotocols/: Sumcheck (batched, streaming, univariate skip), booleanity checks, BlindFold ZK protocolmsm/: Multi-scalar multiplicationtranscripts/: Fiat-Shamir transcripts (Blake2b, Keccak)
tracer
RISC-V emulator that executes guest programs and produces execution traces. Each instruction generates aCycle struct containing:
- Program counter (PC)
- Instruction opcode and flags
- Register read/write operations
- Memory read/write operations
- Intermediate computation values
jolt-sdk
Provides the#[jolt::provable] macro that automatically generates:
prove()- Generate a proof for the functionverify()- Verify a proofanalyze()- Analyze execution without proving (for debugging)preprocess()- Generate proving/verifying keys
jolt-inlines
Optimized cryptographic primitives that replace guest-side computation with efficient constraint-native implementations:- SHA-2, Blake3, Keccak
- BigInt arithmetic
- secp256k1 elliptic curve operations
common
Shared constants and types:XLEN,REGISTER_COUNT- Architecture parametersJoltDevice- I/O device abstractionMemoryLayout- Memory map configuration
Type Parameters
Most core types are generic over three parameters:The 7-Stage Prover Pipeline
Jolt’s prover operates in seven sequential stages. Each stage builds on the previous one, progressively reducing the proof burden until only polynomial openings remain.Stage 1: Trace Execution
Input: Guest program ELF binary Process:- Load the RISC-V ELF into the tracer emulator
- Execute every instruction, recording full CPU state
- Capture all memory and register accesses
- Collect program inputs/outputs via
JoltDevice
Vec<Cycle>- Complete execution traceJoltDevice- I/O state
Stage 2: Witness Generation
Input: Execution trace from Stage 1 Process:- Transform trace into polynomial representations
- Generate committed witness polynomials:
RdInc- Register destination incrementsRamInc- RAM address incrementsInstructionRa(d)- Instruction lookup indicesBytecodeRa(d)- Bytecode lookup indicesRamRa(d)- RAM lookup indicesTrustedAdvice- Public auxiliary dataUntrustedAdvice- Private auxiliary data
- Prepare virtual polynomials (computed on-demand during proving):
- PC, register values, RAM values
- Instruction flags
- Lookup operands and outputs
CompactPolynomial stores small integers (u8-i128) instead of full field elements, reducing memory usage by 8-32x.
Stage 3: Streaming Commitment
Input: Witness polynomials from Stage 2 Process:- Split large polynomials into chunks (tier-1)
- Commit to each chunk using Dory polynomial commitments
- Aggregate chunk commitments (tier-2)
- Generate final commitment for each witness polynomial
- Polynomial commitments sent to verifier
- Commitment metadata cached for opening proofs
Stage 4: Spartan R1CS
Input: Committed witness polynomials Process:- Build R1CS (Rank-1 Constraint System) encoding instruction constraints
- Run Spartan protocol to prove R1CS satisfaction:
- Outer sumcheck: Reduces R1CS check to product sumcheck
- Product virtual sumcheck: Proves Az ∘ Bz = Cz
- Univariate skip: Optimized sumcheck for sparse polynomials
- Apply shift arguments for temporal constraints
- Prove instruction input constraints
Stage 5: Sumcheck Rounds
Input: Spartan proof from Stage 4 Process: Run batched sumcheck protocols for:- Instruction lookups (RA virtual sumcheck)
- Prove each instruction’s operands/results match lookup table entries
- Bytecode checking
- Prove PC values correspond to correct instruction opcodes
- RAM read-write checking
- Prove all RAM reads return previously written values
- Verify initialization values
- Check final memory state
- Register read-write checking
- Prove register reads return correct values
- Verify register updates
- Hamming booleanity
- Prove Hamming weight bits are actually 0 or 1
- Claim reductions
- Reduce all sumcheck claims to polynomial openings
Stage 6: Opening Proofs
Input: Opening claims from Stage 5 Process:- Collect all polynomial opening requests into
ProverOpeningAccumulator - Batch openings for the same polynomial at different points
- Generate Dory opening proofs proving:
- Commitment C commits to polynomial P
- P evaluates to y at point x
- In ZK mode: Use committed evaluations (y_com) instead of revealing y
Stage 7: BlindFold Zero-Knowledge (Optional)
Input: All sumcheck proofs from previous stages Process (only in ZK mode):- Build
VerifierR1CSencoding all sumcheck verification checks - Construct witness assignment from sumcheck round polynomials
- Nova folding:
- Fold real R1CS instance with random satisfying instance
- Hides witness values via randomization
- Spartan proof over folded R1CS:
- Outer sumcheck proves relaxed R1CS satisfaction
- Inner sumcheck reduces to single witness evaluation
- Hyrax-style openings:
- Prove W(ry) and E(rx) against folded row commitments
BlindFoldProof (replaces cleartext opening claims)
Why BlindFold? Standard mode reveals sumcheck round polynomials, leaking witness information. BlindFold proves the verifier would accept without revealing these values.
Data Flow
Key Invariants
Polynomial Consistency
All parties must agree on polynomial encodings:- Trace index i maps to evaluation point (binary representation of i)
- Multilinear extension is unique for a given basis
- Commitment bindings are deterministic
Sumcheck Claim Synchronization
Every sumcheck instance must maintain synchronized claim computation and constraints:input_claim()- Computes claim value from openingsinput_claim_constraint()- Encodes same formula for BlindFold R1CSoutput_claim_constraint()- Encodes output claim check
Transcript Consistency
Prover and verifier must maintain identical Fiat-Shamir transcript state:- Same message ordering
- Same challenge derivation
- In ZK mode: Opening claims excluded from transcript
Performance Characteristics
Hot Paths
Proving time is dominated by:- Sumcheck inner loop (60-70% of time)
- Polynomial bind operations
- Evaluation at random points
- Equality polynomial evaluation
- Polynomial commitments (20-30% of time)
- Multi-scalar multiplication
- Dory tier aggregation
- Memory checking (5-10% of time)
- RAM/register read-write verification
Scaling
| Component | Prover Time | Proof Size | Verifier Time |
|---|---|---|---|
| Trace execution | O(T) | - | - |
| Commitments | O(T) | O(log T) | O(log T) |
| Sumcheck | O(T log T) | O(log² T) | O(log² T) |
| Opening proofs | O(N log T) | O(N log T) | O(N log T) |
| BlindFold | O(S²) | O(S log S) | O(S log S) |
- T = trace length (number of cycles)
- N = number of distinct polynomials
- S = number of sumcheck rounds (≈ log T per instance)
Memory Usage
Memory optimization strategies:CompactPolynomial: Store small integers instead of field elementsSharedRaPolynomials: Share equality tables across multiple polynomials- Streaming commitment: Process chunks without materializing full dense polynomial
- Lazy materialization:
RaPolynomialcomputes values on-demand
Feature Flags
Feature Hierarchy
minimal: Core types, verifier onlyprover: Adds proving capabilityhost: Adds ELF compilation, trace analysis (requires standard library)zk: Enable BlindFold zero-knowledge modeallocative: Memory profiling via allocative cratemonitor: CPU/memory monitoring during profiling