Preprocessing is a one-time setup phase that prepares a program for efficient proof generation and verification. Jolt uses a structured preprocessing pipeline that separates shared, prover-specific, and verifier-specific data.

Why Preprocessing?

Preprocessing amortizes expensive computations across multiple proofs:
  • Program analysis - Decode bytecode, build lookup tables, compute memory layout
  • Commitment setup - Generate cryptographic commitment parameters
  • Memory initialization - Preprocess initial RAM state from ELF
  • Zero-knowledge setup - Generate Pedersen generators for ZK mode
After preprocessing, the same data can be reused for any number of proofs of the same program.

Preprocessing Phases

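Jolt preprocessing happens in two stages. Shared preprocessing (Phase 1) is built from the program ELF, and the prover-specific (Phase 2A) and verifier-specific (Phase 2B) structures are then derived from it:
Program ELF
    ↓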
┌─────────────────────────┐
│ JoltSharedPreprocessing │  ← Shared by prover and verifier
└─────────────────────────┘
    ↓               ↓
    ↓               └──────────────────────┐
    ↓                                      ↓
┌─────────────────────────┐   ┌─────────────────────────────┐
│ JoltProverPreprocessing │   │ JoltVerifierPreprocessing   │
└─────────────────────────┘   └─────────────────────────────┘

Phase 1: Shared Preprocessing

pub struct JoltSharedPreprocessing {
    pub bytecode: Arc<BytecodePreprocessing>,
    pub ram: RAMPreprocessing,
    pub memory_layout: MemoryLayout,
    pub max_padded_trace_length: usize,
    // ZK-specific (only with `zk` feature)
    #[cfg(feature = "zk")]
    pub zk_generator_g1s: Vec<Bn254G1>,
    #[cfg(feature = "zk")]
    pub zk_generator_h1: Bn254G1,
}
Created by:
let shared = JoltSharedPreprocessing::new(
    bytecode,           // Vec<Instruction>
    memory_layout,      // MemoryLayout
    memory_init,        // Vec<(u64, u8)>
    max_trace_length,   // usize
);
Components:
  • BytecodePreprocessing - Program instructions, PC mappings, lookup tables
  • RAMPreprocessing - Initial memory state, bytecode placement addresses
  • MemoryLayout - Address ranges for stack, heap, I/O, advice regions
  • ZK generators - Pedersen commitment generators (ZK mode only)
Usage:
// Generated by #[jolt::provable]
let mut program = compile_my_function("/tmp/jolt");
let (bytecode, memory_init, program_size) = program.decode();

let memory_layout = MemoryLayout::new(&MemoryConfig {
    max_input_size: 4096,
    max_output_size: 4096,
    // ...
});

let shared = JoltSharedPreprocessing::new(
    bytecode,
    memory_layout,
    memory_init,
    1 << 20,  // max_trace_length (2^20 = 1,048,576 cycles)
);

Phase 2A: Prover Preprocessing

pub struct JoltProverPreprocessing<F: JoltField, PCS: CommitmentScheme> {
    pub generators: PCS::ProverSetup,
    pub shared: JoltSharedPreprocessing,
}
Created by:
let prover_preprocessing = JoltProverPreprocessing::new(shared);
Components:
  • ProverSetup - Cryptographic commitment keys for Dory/HyperKZG
  • Shared preprocessing - The shared data, stored in the shared field
Generates commitment generators based on:
  • Maximum trace length
  • Number of committed polynomials
  • Advice region sizes

Phase 2B: Verifier Preprocessing

pub struct JoltVerifierPreprocessing<F: JoltField, PCS: CommitmentScheme> {
    pub generators: PCS::VerifierSetup,
    pub shared: JoltSharedPreprocessing,
}
Created by:
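// Option 1: From shared preprocessing, with the verifier setup derived from
// the prover's generators (the `generators` field of JoltProverPreprocessing)
let verifier_generators = DoryCommitmentScheme::setup_verifier(&prover_preprocessing.generators);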
let verifier_preprocessing = JoltVerifierPreprocessing::new(
    shared,
    verifier_generators
);

// Option 2: From prover preprocessing
let verifier_preprocessing = JoltVerifierPreprocessing::from(&prover_preprocessing);
Components:
  • VerifierSetup - Verification keys (smaller than prover setup)
  • Shared preprocessing - The shared data, stored in the shared field

Memory Layout

The MemoryLayout computed during preprocessing determines guest memory organization:
pub struct MemoryLayout {
    // Configuration
    pub max_input_size: u64,
    pub max_output_size: u64,
    pub max_untrusted_advice_size: u64,
    pub max_trusted_advice_size: u64,
    pub stack_size: u64,
    pub heap_size: u64,
    pub program_size: u64,
    
    // Computed addresses
    pub ram_witness_offset: u64,
    pub input_start: u64,
    pub output_start: u64,
    pub untrusted_advice_start: u64,
    pub trusted_advice_start: u64,
    pub panic: u64,
    pub termination: u64,
}
Memory map:
0x0000_0000  Program code (.text, .rodata, .data)
             ...
0x????_????  Stack (grows down)
             ...
0x????_????  Heap (grows up)
             ...
0x????_????  Input region
0x????_????  Output region
0x????_????  Untrusted advice region
0x????_????  Trusted advice region
0x????_????  RAM witness offset
0x????_????  Panic flag
0x????_????  Termination flag
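As a rough illustration of how start addresses can be derived from configured region sizes, the sketch below places regions back to back with 8-byte alignment. The Region type, the layout_regions function, the base address, the region order and the alignment are all assumptions made for illustration. The real addresses are computed by MemoryLayout::new and may follow different rules.

```rust
/// Hypothetical helper type: a named region with its computed start address.
#[derive(Debug, PartialEq)]
struct Region {
    name: &'static str,
    start: u64,
    size: u64,
}

/// Round `addr` up to the next multiple of `align` (`align` must be a power of two).
fn align_up(addr: u64, align: u64) -> u64 {
    (addr + align - 1) & !(align - 1)
}

/// Place regions one after another starting at `base`, each 8-byte aligned.
/// The ordering and alignment are assumptions, not Jolt's actual rules.
fn layout_regions(base: u64, sizes: &[(&'static str, u64)]) -> Vec<Region> {
    let mut next = align_up(base, 8);
    let mut regions = Vec::with_capacity(sizes.len());
    for &(name, size) in sizes {
        regions.push(Region { name, start: next, size });
        next = align_up(next + size, 8);
    }
    regions
}
```

With a base of 0x1000 and two 4096-byte regions, the second region starts at 0x2000. Any region whose size is not a multiple of 8 pushes the next start up to the following 8-byte boundary.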

Bytecode Preprocessing

The bytecode preprocessing phase analyzes the RISC-V binary:
pub struct BytecodePreprocessing {
    code_size: usize,               // Number of instructions
    // Efficient PC ↔ index lookups
    // Instruction flag polynomials
    // Opcode/operand decompositions
}
Operations:
  1. Decode ELF - Parse RISC-V instructions from .text section
  2. Build PC mapping - Map program counter values to instruction indices
  3. Compute instruction flags - One-hot encoding for each instruction type
  4. Decompose operands - Split instructions into opcode, rd, rs1, rs2, immediate
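The operations above can be sketched in plain Rust. This is a minimal illustration, not the internals of BytecodePreprocessing: the RType struct and the decompose_rtype, build_pc_map and one_hot functions are hypothetical names. The field positions follow the standard RISC-V R-type encoding. The PC map assumes fixed 4-byte instructions laid out contiguously, which would not hold for compressed instructions.

```rust
use std::collections::HashMap;

/// Fields of a RISC-V R-type instruction (standard 32-bit base encoding).
#[derive(Debug, PartialEq)]
struct RType {
    opcode: u32, // bits 6..=0
    rd: u32,     // bits 11..=7
    funct3: u32, // bits 14..=12
    rs1: u32,    // bits 19..=15
    rs2: u32,    // bits 24..=20
    funct7: u32, // bits 31..=25
}

/// Split a raw 32-bit instruction word into its R-type fields.
fn decompose_rtype(word: u32) -> RType {
    RType {
        opcode: word & 0x7f,
        rd: (word >> 7) & 0x1f,
        funct3: (word >> 12) & 0x7,
        rs1: (word >> 15) & 0x1f,
        rs2: (word >> 20) & 0x1f,
        funct7: (word >> 25) & 0x7f,
    }
}

/// Map each instruction's address (PC) to its index in the decoded program.
/// Assumes fixed 4-byte instructions laid out contiguously from `base_pc`.
fn build_pc_map(base_pc: u64, num_instructions: usize) -> HashMap<u64, usize> {
    (0..num_instructions)
        .map(|i| (base_pc + 4 * i as u64, i))
        .collect()
}

/// One-hot flag vector: position `kind` is 1, all other positions are 0.
fn one_hot(kind: usize, num_kinds: usize) -> Vec<u8> {
    (0..num_kinds).map(|k| u8::from(k == kind)).collect()
}
```

For example, the word 0x002081B3 encodes add x3, x1, x2, so decompose_rtype returns opcode 0x33, rd 3, rs1 1 and rs2 2, with funct3 and funct7 both zero.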

RAM Preprocessing

The RAM preprocessing phase handles initial memory state:
pub struct RAMPreprocessing {
    min_bytecode_address: u64,
    bytecode_words: Vec<u64>,
}
Process:
  1. Extract non-zero memory from ELF sections (.text, .rodata, .data)
  2. Determine program code placement address
  3. Pack initialized data into 64-bit words
  4. Store for efficient polynomial evaluation
Initial RAM includes:
  • Program .text (code)
  • .rodata (read-only data)
  • .data (initialized data)
  • Everything else starts at zero
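The packing step can be sketched as follows. This is a minimal illustration that takes (address, byte) pairs shaped like the memory_init argument shown earlier. The PackedRam type and pack_memory_init function are hypothetical, and the little-endian byte order and 8-byte alignment of the first word are assumptions rather than confirmed details of RAMPreprocessing.

```rust
/// Hypothetical counterpart of the two fields shown in `RAMPreprocessing`.
#[derive(Debug, PartialEq)]
struct PackedRam {
    min_address: u64,
    words: Vec<u64>,
}

/// Pack `(address, byte)` pairs into little-endian 64-bit words.
/// Word 0 starts at the lowest address, rounded down to a multiple of 8.
/// Bytes that are not listed stay zero. Returns `None` for empty input.
fn pack_memory_init(memory_init: &[(u64, u8)]) -> Option<PackedRam> {
    let min = memory_init.iter().map(|&(addr, _)| addr).min()?;
    let max = memory_init.iter().map(|&(addr, _)| addr).max()?;
    let min_address = min & !7; // align down to an 8-byte boundary
    let num_words = ((max - min_address) / 8 + 1) as usize;
    let mut words = vec![0u64; num_words];
    for &(addr, byte) in memory_init {
        let offset = addr - min_address;
        let shift = 8 * (offset % 8); // bit position of this byte inside its word
        words[(offset / 8) as usize] |= (byte as u64) << shift;
    }
    Some(PackedRam { min_address, words })
}
```

Storing only the span between the lowest and highest initialized addresses keeps the word vector small, and every address outside that span is implicitly zero.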

Serialization

Preprocessing data can be saved and loaded:
// Save
prover_preprocessing.save_to_file("preprocessing.bin")?;

// Load
let loaded = JoltProverPreprocessing::from_file("preprocessing.bin")?;

// Or use generated helpers
verifier_preprocessing.save_to_target_dir("/tmp/jolt")?;
let loaded = JoltVerifierPreprocessing::read_from_target_dir("/tmp/jolt")?;
Typical sizes:
  • Shared preprocessing: ~1-10 MB (depends on program size)
  • Prover preprocessing: ~10-100 MB (depends on trace length)
  • Verifier preprocessing: ~1-10 MB (much smaller than prover)

ZK Mode Preprocessing

When compiled with --features zk, preprocessing includes additional setup:
// Both statements are gated so that non-ZK builds skip them entirely
#[cfg(feature = "zk")]
let generators_count = /* computed from trace length */;
#[cfg(feature = "zk")]
let (zk_generator_g1s, zk_generator_h1) = generate_pedersen_generators(generators_count);
ZK generators are used for:
  • Committing to sumcheck round polynomials
  • BlindFold R1CS witness commitments
  • Hiding polynomial evaluations
See the ZK feature flag documentation for details on zero-knowledge mode.
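To show what a Pedersen commitment built from two generators looks like, here is a toy sketch. It uses a tiny multiplicative group modulo a prime instead of the Bn254G1 points held in JoltSharedPreprocessing, so it is insecure and purely illustrative. The constants and the pow_mod and commit functions are hypothetical. In a real setup, the discrete logarithm of H with respect to G must also be unknown, which is not the case here.

```rust
// Toy parameters, far too small to be secure: the subgroup of order Q = 11
// inside the integers modulo P = 23. G and H both generate that subgroup.
const P: u64 = 23;
const Q: u64 = 11;
const G: u64 = 4;
const H: u64 = 9;

/// Square-and-multiply modular exponentiation.
fn pow_mod(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut acc = 1;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

/// Pedersen commitment to `message` with blinding factor `blinding`:
/// C = G^message * H^blinding (mod P). Exponents are reduced modulo Q.
fn commit(message: u64, blinding: u64) -> u64 {
    pow_mod(G, message % Q, P) * pow_mod(H, blinding % Q, P) % P
}
```

Two properties are visible even in the toy version. The same message with different blinding factors gives different commitments. Multiplying two commitments gives a commitment to the sum of the messages under the sum of the blinding factors.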

Generated Preprocessing Functions

The #[jolt::provable] macro generates preprocessing helpers:

preprocess_shared_*

let mut program = compile_my_function("/tmp/jolt");
let shared = preprocess_shared_my_function(&mut program);
Decodes the ELF and creates JoltSharedPreprocessing with the configured memory layout.

preprocess_prover_*

let prover_preprocessing = preprocess_prover_my_function(shared);
Generates prover commitment parameters.

preprocess_verifier_*

let verifier_generators = /* ... */;
let verifier_preprocessing = preprocess_verifier_my_function(
    shared,
    verifier_generators
);
Creates verifier preprocessing from shared preprocessing and generators.

verifier_preprocessing_from_prover_*

let verifier_preprocessing = verifier_preprocessing_from_prover_my_function(
    &prover_preprocessing
);
Convenience function to derive verifier preprocessing from prover preprocessing.

Best Practices

  • Preprocessing is expensive (seconds to minutes). Reuse the same preprocessing for all proofs of a program.
  • Save preprocessing to disk. Loading from disk is much faster than regenerating.
  • The actual execution trace must fit within max_trace_length. Pad trace lengths to powers of 2 for efficiency.
  • Verifiers only need JoltVerifierPreprocessing, which is much smaller than prover preprocessing.
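The trace-length guidance can be checked ahead of time with a small helper. This is a minimal sketch, assuming the padded length (rather than the raw cycle count) is what must fit within the maximum. The padded_trace_length and fits_within_max functions are hypothetical and not part of the Jolt API.

```rust
/// Round a trace length up to the next power of two.
fn padded_trace_length(trace_length: usize) -> usize {
    trace_length.next_power_of_two()
}

/// Check that a trace, once padded, fits within the configured maximum.
fn fits_within_max(trace_length: usize, max_trace_length: usize) -> bool {
    padded_trace_length(trace_length) <= max_trace_length
}
```

A trace of 1,000,000 cycles pads to 1,048,576 and so fits a maximum of 1 << 20, while a trace of 1,048,577 cycles pads to 2,097,152 and does not.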

Complete Example

use guest::*;
use std::path::Path;

fn setup_preprocessing() -> (
    JoltProverPreprocessing<Fr, DoryCommitmentScheme>,
    JoltVerifierPreprocessing<Fr, DoryCommitmentScheme>,
) {
    let target_dir = "/tmp/jolt-guest";
    let preprocessing_path = Path::new(target_dir).join("preprocessing");
    
    // Try loading cached preprocessing. Regenerate unless both files load,
    // so a missing or corrupt verifier file does not cause a panic.
    if let (Ok(prover), Ok(verifier)) = (
        JoltProverPreprocessing::from_file(preprocessing_path.join("prover.bin")),
        JoltVerifierPreprocessing::from_file(preprocessing_path.join("verifier.bin")),
    ) {
        return (prover, verifier);
    }
    
    // Generate new preprocessing
    println!("Preprocessing (one-time setup)...");
    
    let mut program = compile_my_function(target_dir);
    let shared = preprocess_shared_my_function(&mut program);
    let prover = preprocess_prover_my_function(shared); // `shared` is not used again, so no clone is needed
    let verifier = verifier_preprocessing_from_prover_my_function(&prover);
    
    // Cache for next time
    std::fs::create_dir_all(&preprocessing_path).unwrap();
    prover.save_to_file(preprocessing_path.join("prover.bin")).unwrap();
    verifier.save_to_file(preprocessing_path.join("verifier.bin")).unwrap();
    
    (prover, verifier)
}

fn main() {
    let (prover_preprocessing, verifier_preprocessing) = setup_preprocessing();
    
    // Use preprocessing for multiple proofs
    let program = compile_my_function("/tmp/jolt-guest");
    let prove = build_prover_my_function(program.clone(), prover_preprocessing);
    let verify = build_verifier_my_function(verifier_preprocessing);
    
    // Prove multiple times with same preprocessing
    for i in 0..10 {
        let (output, proof, _) = prove(i);
        assert!(verify(i, output, false, proof));
        println!("Proof {} verified!", i);
    }
}

Performance Characteristics

Preprocessing time:
  • Small programs (less than 1K instructions): ~1-5 seconds
  • Medium programs (~10K instructions): ~10-30 seconds
  • Large programs (~100K instructions): ~1-5 minutes
Proving time (after preprocessing):
  • Simple proofs (less than 10K cycles): ~1-10 seconds
  • Complex proofs (~100K cycles): ~10-60 seconds
  • Very large proofs (~1M cycles): ~1-10 minutes
Preprocessing time is amortized across all proofs. The more proofs you generate, the less preprocessing overhead matters.
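The amortization argument reduces to simple arithmetic: the average cost per proof is the proving time plus the preprocessing time divided by the number of proofs. A minimal sketch follows. The function name is hypothetical, and any timings passed in are illustrative inputs rather than measurements.

```rust
/// Average cost per proof when one preprocessing run is shared by `num_proofs` proofs.
fn amortized_seconds_per_proof(preprocess_secs: f64, prove_secs: f64, num_proofs: u32) -> f64 {
    assert!(num_proofs > 0, "need at least one proof");
    prove_secs + preprocess_secs / f64::from(num_proofs)
}
```

With 30 seconds of preprocessing and 10 seconds per proof, a single proof costs 40 seconds in total, while ten proofs average 13 seconds each.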
