Skip to main content

Overview

Jolt’s instruction lookup system proves that each instruction in the execution trace produces the correct output. Instead of implementing instruction logic as R1CS constraints (which would be massive), Jolt uses a novel lookup argument that proves instruction correctness by looking up precomputed outputs in committed tables.

Role in the Proving System

Instruction lookups are central to Jolt’s efficiency:
  1. Stage: Stage 5 (Instruction Lookups)
  2. Input: Execution trace with instruction operands (RS1, RS2, immediate)
  3. Method: Lookup outputs in committed tables, prove via sumcheck
  4. Output: Opening claims for instruction flags and lookup results
  5. Benefit: Drastically smaller proof size than constraint-per-instruction approaches

Instruction Decomposition

Subtable Structure

Jolt decomposes each instruction into small subtables: Example: 64-bit XOR
  • Naive approach: 2^128 entry table (operands A, B)
  • Jolt approach: Split operands into 8-byte chunks
    • 8 subtables, each 2^16 entries (8-bit input, 8-bit output)
    • Total size: 8 · 2^16 = 524,288 entries (vs. 2^128)
General pattern:
instruction(A, B) = combine(
    subtable_0(A[0..7], B[0..7]),
    subtable_1(A[8..15], B[8..15]),
    ...
    subtable_k(A[...], B[...])
)

Virtual vs. Physical Instructions

Physical instructions:
  • Direct RISC-V opcodes (ADD, XOR, LOAD, etc.)
  • One lookup per instruction
Virtual instructions:
  • Multi-cycle operations (e.g., 256-bit multiplication)
  • Decomposed into multiple subtable lookups
  • Used by jolt-inlines for optimized cryptographic operations

Lookup Argument Protocol

Jolt uses the Twist/Shout lookup argument (modified Lasso):

High-Level Protocol

  1. Prover commits to subtables (one-time preprocessing)
  2. Trace generation: For each instruction, record:
    • Instruction flags (one-hot encoding of opcode)
    • Operands (RS1, RS2, immediate)
    • Chunk indices (operands split into subtable indices)
  3. Sumcheck: Prove that looked-up values match instruction outputs
  4. Opening proofs: Verify subtable commitments at queried indices

RA (Read-After) Polynomial

The RA polynomial encodes which table entry is accessed:
RA[instruction_id, subtable_id, entry_index] = count of lookups
In one-hot form:
RA[i] = 1 if entry i is accessed, else 0
Sumcheck claim:
Σ_{i} RA[i] · subtable_value[i] = Σ_{instruction} expected_output
This proves that looked-up values sum to expected instruction outputs.

RAF (Read-After-Final) Checking

Similar to RAM checking, instruction lookups verify RAF=0 to ensure:
  • No lookups to invalid table entries
  • All lookups reference valid subtable indices

Instruction Flags

Each cycle has instruction flags (one-hot encoding):
pub struct InstructionFlags {
    pub flag_add: bool,
    pub flag_sub: bool,
    pub flag_xor: bool,
    // ... ~50 instruction flags ...
}
Purpose:
  • Activate correct subtable lookups for each instruction
  • Used in R1CS constraints to conditionally enforce instruction semantics
  • One flag set per cycle (except no-ops)

Subtable Implementations

Lookup Table Trait

pub trait LookupTable {
    fn materialize(K: usize) -> Vec<F>;  // Generate table entries
    fn suffix_materialize() -> Vec<F>;   // Suffix table variant
    fn prefix_materialize() -> Vec<F>;   // Prefix table variant
}

Example: XOR Table

pub struct XorLookupTable;

impl LookupTable for XorLookupTable {
    fn materialize(K: usize) -> Vec<F> {
        (0..2^K).map(|i| {
            let a = (i >> 8) as u8;
            let b = (i & 0xFF) as u8;
            F::from(a ^ b)
        }).collect()
    }
}

Subtable Categories

Arithmetic: ADD, SUB, MUL, DIV Logical: AND, OR, XOR, NOT Shifts: SLL, SRL, SRA, rotate Comparisons: LT, LTE, EQ Memory: Load/store address computation Specialized: Sign extension, byte reversal, bit counting

Instruction Lookups Sumcheck (Stage 5)

RA Virtual Sumcheck

Purpose: Reduce RA polynomial claims to subtable evaluations. Sumcheck claim:
Σ_{x ∈ {0,1}^m} eq(r, x) · RA_instruction(x) = claimed_eval
where RA_instruction(x) is the RA polynomial for instruction lookups at cycle x. Polynomial structure:
  • One RA polynomial per instruction type
  • Shared across multiple subtables (prefix/suffix decomposition)
  • Lazy materialization via SharedRaPolynomials (memory optimization)

Read-RAF Checking

Purpose: Verify no invalid lookups (like RAM RAF checking). Sumcheck claim:
Σ_{subtable, index} RAF[subtable, index] · RA[subtable, index] = 0
If any lookup is invalid (RAF=1), the sumcheck fails.

Performance Characteristics

Prover Time

Subtable commitment (one-time preprocessing):
  • O(table_size · log(table_size)) field operations
  • ~1-5 seconds for all subtables (amortized across all proofs)
Lookup sumcheck (per proof):
  • O(trace_length · num_subtables · log(table_size))
  • ~15-20% of total proving time

Memory Usage

Subtable storage:
  • ~50-100MB for all RISC-V subtables
  • Shared across all proofs (one-time cost)
RA polynomial storage:
  • Uses SharedRaPolynomials to deduplicate equality tables
  • ~10-20MB for 10⁶ cycles

Proof Size

Without lookups (naive R1CS):
  • Would require ~1000 constraints per instruction type
  • Proof size: O(num_instructions · trace_length)
With lookups:
  • O(num_subtables · log(table_size))
  • ~10x smaller proof size for typical workloads

Verifier Time

  • O(num_subtables · log(trace_length))
  • ~1-3ms for typical traces
  • Verifies sumcheck round polynomials + subtable commitment openings

Comparison with Other zkVMs

ApproachJolt (Lookups)Cairo (AIR)RISC Zero (AIR)Polygon Miden (Lookups)
Instruction costO(log table_size)O(degree)O(degree)O(log table_size)
Proof sizeLogarithmicLinearLogarithmicLogarithmic
SetupSubtable commitmentNoneNoneSubtable commitment
FlexibilityHigh (custom tables)MediumMediumHigh

Optimizations

Prefix/Suffix Decomposition

Problem: Some instructions need different lookup tables for different operand chunks. Solution: Split polynomials as:
P(x) = Σ P_i(x_prefix) · Q_i(x_suffix)
Allows efficient sumcheck over decomposed operands.

Shared RA Polynomials

Problem: Multiple instructions share subtable structures (e.g., all shifts use same bit-extraction tables). Solution: SharedRaPolynomials deduplicates equality tables:
pub struct SharedRaPolynomials {
    shared_eq_tables: Vec<Vec<F>>,  // Shared across N polynomials
    ra_indices: Vec<Vec<usize>>,    // Per-polynomial index selection
}
Reduces memory from O(N · table_size) to O(table_size + N · trace_length).

Lazy RA Materialization

Problem: Materializing full RA polynomials for all instructions is expensive. Solution: RaPolynomial uses lazy state machine:
Round1 → Round2 → Round3 → RoundN
Each round binds one challenge bit, progressively filtering table entries.

Key Files

  • jolt-core/src/zkvm/instruction_lookups/ra_virtual.rs: RA polynomial sumcheck
  • jolt-core/src/zkvm/instruction_lookups/read_raf_checking.rs: RAF verification
  • jolt-core/src/zkvm/lookup_table/: All subtable implementations
  • jolt-core/src/poly/ra_polynomial.rs: Lazy RA polynomial materialization
  • jolt-core/src/poly/shared_ra_polynomials.rs: Shared equality table optimization

Next Steps

Build docs developers (and LLMs) love