
Overview

Bytecode preprocessing transforms the program’s RISC-V instructions into a structured format optimized for zero-knowledge proving. This includes PC (program counter) mapping for virtual instruction sequences, padding for power-of-two polynomial encoding, and commitment to the bytecode as a public input.

Role in the Proving System

Bytecode preprocessing is a one-time setup operation per program:
  1. Stage: Preprocessing (before proving)
  2. Input: RISC-V ELF binary decoded into Vec<Instruction>
  3. Processing: PC mapping, padding, commitment
  4. Output: BytecodePreprocessing struct used during proving/verification
  5. Verification: Bytecode commitment is public input (verifier checks match)

Bytecode Structure

Instruction Format

pub struct Instruction {
    pub address: usize,                       // Physical memory address
    pub opcode: RiscvOpcode,                  // ADD, XOR, LOAD, etc.
    pub rs1: u8, pub rs2: u8, pub rd: u8,    // Register indices
    pub imm: i64,                             // Immediate value
    pub virtual_sequence_remaining: Option<u16>,  // For inlined instructions
}
Key fields:
  • address: Physical address in ELF (must be aligned to ALIGNMENT_FACTOR_BYTECODE)
  • opcode: Determines which subtables/constraints apply
  • virtual_sequence_remaining: Counts down for multi-cycle virtual instructions

Physical vs. Virtual Instructions

Physical instructions:
  • Standard RISC-V opcodes (one instruction = one cycle)
  • Address uniquely identifies instruction
Virtual instructions:
  • Multi-cycle operations (e.g., SHA-256 compression function)
  • Multiple cycles share same physical address
  • Distinguished by virtual_sequence_remaining counter

PC Mapping

The program counter (PC) must map physical instruction addresses to sequential virtual PCs:

Why PC Mapping?

Problem: physical addresses are non-contiguous (gaps due to alignment and data sections).
Solution: map them to a dense virtual PC space [0, num_instructions).

BytecodePCMapper

pub struct BytecodePCMapper {
    // Indexed by: (address - RAM_START_ADDRESS) / ALIGNMENT_FACTOR_BYTECODE
    // Each entry stores (base_pc, max_inline_seq) for the instruction at that slot
    indices: Vec<Option<(usize, u16)>>,
}
Mapping logic (returns None for addresses with no instruction):
pub fn get_pc(&self, address: usize, virtual_sequence_remaining: u16) -> Option<usize> {
    let (base_pc, max_inline_seq) = self.indices[get_index(address)]?;
    Some(base_pc + (max_inline_seq - virtual_sequence_remaining) as usize)
}
Example:
Physical address: 0x80001000
Virtual sequence: [5, 4, 3, 2, 1, 0] (6 cycles)

PC mapping:
  0x80001000, seq=5 → PC=10
  0x80001000, seq=4 → PC=11
  0x80001000, seq=3 → PC=12
  ...
  0x80001000, seq=0 → PC=15
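The arithmetic behind this example can be sketched as a standalone helper (a minimal sketch; the real `BytecodePCMapper::get_pc` also performs the address-to-index lookup):

```rust
// Hypothetical helper isolating the PC-mapping arithmetic: within a virtual
// sequence, the counter counts down from max_inline_seq to 0, so the offset
// from the base PC is (max_inline_seq - virtual_sequence_remaining).
fn virtual_pc(base_pc: usize, max_inline_seq: u16, virtual_sequence_remaining: u16) -> usize {
    base_pc + (max_inline_seq - virtual_sequence_remaining) as usize
}

fn main() {
    // Example from the text: base PC 10, a 6-cycle sequence with seq = 5..=0
    assert_eq!(virtual_pc(10, 5, 5), 10); // first cycle
    assert_eq!(virtual_pc(10, 5, 4), 11);
    assert_eq!(virtual_pc(10, 5, 0), 15); // last cycle
    println!("ok");
}
```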

No-Op Padding

Bytecode is prepended with a single no-op instruction at PC=0:
bytecode.insert(0, Instruction::NoOp);
Purpose: Simplifies constraint checks (avoids special-casing PC=0).

Bytecode Padding

For efficient polynomial encoding, bytecode is padded to a power of two:
let code_size = bytecode.len().next_power_of_two().max(2);
bytecode.resize(code_size, Instruction::NoOp);
Why power of two?
  • Enables efficient multilinear polynomial representation
  • Simplifies sumcheck round structure (each bit of PC becomes one round)
  • Allows binary tree-like evaluation strategies
Memory cost: Minimal (bytecode typically less than 1MB even for large programs).
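The padding rule above can be exercised directly (hypothetical helper name, same expression as the snippet):

```rust
// Sketch of the padding rule: round the code size up to the next power of
// two, with a minimum of 2 entries.
fn padded_code_size(len: usize) -> usize {
    len.next_power_of_two().max(2)
}

fn main() {
    assert_eq!(padded_code_size(1), 2);  // minimum enforced
    assert_eq!(padded_code_size(5), 8);  // rounds up
    assert_eq!(padded_code_size(8), 8);  // already a power of two
    assert_eq!(padded_code_size(9), 16);
    println!("ok");
}
```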

Bytecode Commitment

The preprocessed bytecode is committed via polynomial commitment:

Commitment Process

  1. Convert to polynomial:
    Bytecode[i] = encoding of instruction at PC=i
    
  2. Commit: Use Dory polynomial commitment scheme
    bytecode_commitment = Commit(Bytecode_polynomial)
    
  3. Public input: bytecode_commitment is part of verification key

Encoding Format

Each instruction encoded as field element(s):
  • Opcode (enumeration)
  • Register indices (rs1, rs2, rd)
  • Immediate value
  • Flags (virtual sequence, etc.)
Actual encoding uses multiple polynomials (one per field) for efficiency.
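A minimal sketch of this struct-of-arrays layout, with plain integers standing in for field elements (the `Instr` and `Columns` names here are illustrative, not Jolt's actual types):

```rust
// One column (future polynomial) per instruction field, as the text describes.
struct Instr {
    opcode: u8,
    rs1: u8,
    rs2: u8,
    rd: u8,
    imm: i64,
}

#[derive(Default)]
struct Columns {
    opcode: Vec<u64>,
    rs1: Vec<u64>,
    rs2: Vec<u64>,
    rd: Vec<u64>,
    imm: Vec<i64>,
}

// Transpose the bytecode: entry i of each column describes the instruction at PC=i.
fn encode(bytecode: &[Instr]) -> Columns {
    let mut cols = Columns::default();
    for ins in bytecode {
        cols.opcode.push(ins.opcode as u64);
        cols.rs1.push(ins.rs1 as u64);
        cols.rs2.push(ins.rs2 as u64);
        cols.rd.push(ins.rd as u64);
        cols.imm.push(ins.imm);
    }
    cols
}

fn main() {
    let prog = vec![Instr { opcode: 1, rs1: 2, rs2: 3, rd: 4, imm: -5 }];
    let cols = encode(&prog);
    assert_eq!(cols.opcode, vec![1]);
    assert_eq!(cols.imm, vec![-5]);
    println!("ok");
}
```

Keeping each field in its own column lets the prover commit to narrow values separately instead of packing a whole instruction into one large field element.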

Bytecode Checking (Stage 3)

During proving, Jolt verifies execution trace matches committed bytecode:

Stage 3: Bytecode RAF Checking

Purpose: prove that every executed instruction matches the committed bytecode.
Sumcheck claim:
Σ_{pc ∈ executed_PCs} eq(r, pc) · (traced_instr[pc] - bytecode[pc]) = 0
If any traced instruction doesn’t match the bytecode, this check fails.
RAF polynomial:
BytecodeRAF[pc] = 1 if PC is executed, else 0
Combined with read-RAF checking (as for RAM/registers):
Σ_{pc} BytecodeRAF[pc] · (traced_instr[pc] - bytecode[pc]) = 0
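The identity can be illustrated with a toy model over plain integers (a hedged sketch: the real protocol works over a large field, and the `eq(r, pc)` weights are derived from verifier randomness, which is what prevents distinct mismatches from cancelling):

```rust
// Toy Stage-3 identity: sum of weight * (traced - committed) over executed PCs.
fn raf_check(bytecode: &[i64], trace: &[(usize, i64)], weights: &[i64]) -> i64 {
    trace
        .iter()
        .map(|&(pc, instr)| weights[pc] * (instr - bytecode[pc]))
        .sum()
}

fn main() {
    let bytecode = vec![10, 20, 30, 40];
    let weights = vec![7, 11, 13, 17]; // stand-ins for eq(r, pc)
    // Honest trace: every term (traced - committed) vanishes, so the sum is 0.
    let honest = vec![(0, 10), (1, 20), (3, 40)];
    assert_eq!(raf_check(&bytecode, &honest, &weights), 0);
    // Tampered trace: a mismatch at pc=1 makes the sum nonzero.
    let tampered = vec![(0, 10), (1, 99), (3, 40)];
    assert_ne!(raf_check(&bytecode, &tampered, &weights), 0);
    println!("ok");
}
```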

Bytecode vs. RAM Memory

Bytecode is stored in RAM but treated specially:

RAM Region

┌─────────────────────┐ ← 0
│  Bytecode Words     │ ← remap_address(min_bytecode_address)
├─────────────────────┤
│  ...                │
└─────────────────────┘
Bytecode bytes are packed into 64-bit words (little-endian):
for chunk in memory_init.chunk_by(|a, b| a.0 / 8 == b.0 / 8) {
    let mut word = [0u8; 8];
    for (address, byte) in chunk {
        word[(address % 8) as usize] = *byte;
    }
    bytecode_words.push(u64::from_le_bytes(word));
}

Checking Strategy

RAM checking: verifies bytecode memory is consistent (reads match initial writes).
Bytecode checking: verifies executed instructions match the bytecode polynomial.
Both are necessary:
  • RAM checking: Ensures bytecode wasn’t modified during execution
  • Bytecode checking: Ensures traced instructions match committed program

RAMPreprocessing

pub struct RAMPreprocessing {
    pub min_bytecode_address: u64,    // Lowest bytecode address
    pub bytecode_words: Vec<u64>,     // Packed instruction bytes
}
Generation:
impl RAMPreprocessing {
    pub fn preprocess(memory_init: Vec<(u64, u8)>) -> Self {
        // Find the lowest bytecode address (memory_init is assumed non-empty)
        let min_bytecode_address = memory_init
            .iter()
            .map(|(address, _)| *address)
            .min()
            .unwrap_or(0);

        // Pack bytes into 64-bit little-endian words
        let bytecode_words = pack_bytes_to_words(memory_init);

        Self { min_bytecode_address, bytecode_words }
    }
}

Performance Characteristics

Preprocessing Time

PC mapping:
  • O(num_instructions) time
  • ~1-10ms for typical programs
Bytecode commitment:
  • O(code_size · log(code_size)) field operations
  • ~10-100ms for typical programs (one-time cost)

Proving Time

Bytecode RAF checking (Stage 3):
  • O(trace_length · log(code_size)) field operations
  • ~5-10% of total proving time

Memory Usage

Preprocessing storage:
  • BytecodePreprocessing: ~code_size bytes
  • RAMPreprocessing: ~code_size bytes (code_size / 8 packed 64-bit words, 8 bytes each)
  • Typical: less than 1MB even for large programs

Verifier Time

  • O(log(code_size)) for bytecode commitment check
  • O(log(trace_length)) for RAF sumcheck verification
  • ~0.1-1ms total

Compressed Instructions

Jolt supports RISC-V compressed (16-bit) instructions:

Decompression

if (first_halfword & 0b11) != 0b11 {
    // Compressed instruction (16-bit)
    let inst = uncompress_instruction(first_halfword, xlen);
    instructions.push(Instruction::decode(inst, address, true));
    offset += 2;
} else {
    // Standard instruction (32-bit)
    let word = u32::from_le_bytes([...]);
    instructions.push(Instruction::decode(word, address, false));
    offset += 4;
}
Alignment: Compressed instructions can be unaligned (2-byte boundaries), but Jolt’s ALIGNMENT_FACTOR_BYTECODE typically requires 4-byte alignment after decompression.
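The length test in the branch above follows the standard RISC-V encoding rule and can be isolated (hypothetical helper name):

```rust
// RISC-V length rule: an instruction is compressed (16-bit) iff the lowest
// two bits of its first halfword are NOT 0b11.
fn is_compressed(first_halfword: u16) -> bool {
    first_halfword & 0b11 != 0b11
}

fn main() {
    // 0x4501 (c.li a0, 0): low bits 0b01 → compressed
    assert!(is_compressed(0x4501));
    // 0x0513 (low halfword of the 32-bit `addi a0, zero, 0`): low bits 0b11 → not compressed
    assert!(!is_compressed(0x0513));
    println!("ok");
}
```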

Integration with Prover Pipeline

Bytecode preprocessing integrates with the full pipeline:
  1. Decode ELF → Extract instruction bytes
  2. Preprocess → PC mapping, padding, commitment
  3. Trace → Emulator uses PC mapper to resolve virtual PCs
  4. Witness gen → Bytecode polynomial constructed from preprocessed data
  5. Stage 3 → Bytecode RAF checking proves trace matches bytecode
  6. Verification → Verifier checks bytecode commitment matches public input

Key Files

  • jolt-core/src/zkvm/bytecode/mod.rs: BytecodePreprocessing and BytecodePCMapper
  • jolt-core/src/zkvm/bytecode/read_raf_checking.rs: Bytecode RAF sumcheck
  • jolt-core/src/zkvm/ram/mod.rs: RAMPreprocessing (bytecode memory representation)
  • tracer/src/lib.rs: ELF decoding and instruction extraction (decode function)
