CompactPolynomial is a specialized polynomial type that stores coefficients as small scalars (u8, u16, u32, u64, u128, i64, i128) instead of full field elements. This optimization provides 10-100x memory reduction and significant speedups in the sumcheck inner loop.
Problem: Field Element Overhead
Jolt uses BN254’s scalar field (Fr), where each element is 32 bytes:- Instruction flags: 0 or 1 (boolean)
- Memory addresses: 0 to 2^32 - 1 (u32)
- Register indices: 0 to 31 (u8)
- Lookup indices: small integers
Solution: Compact Representation
CompactPolynomial<T, F> stores coefficients as small type T and lazily converts to field elements:
Memory Savings
| Coefficient Type | Bytes/Element | Memory for 2^20 coeffs | vs. Field Element |
|---|---|---|---|
bool | 1 | 1 MB | 32x reduction |
u8 | 1 | 1 MB | 32x reduction |
u16 | 2 | 2 MB | 16x reduction |
u32 | 4 | 4 MB | 8x reduction |
u64 | 8 | 8 MB | 4x reduction |
u128 / i128 | 16 | 16 MB | 2x reduction |
Field element F | 32 | 32 MB | 1x (baseline) |
Implementation Details
Small Scalar Trait
Location:jolt-core/src/utils/small_scalar.rs
Two-Phase Representation
CompactPolynomial has two states:
1. Initial State (coeffs only):
bind(), subsequent operations use bound_coeffs.
Optimized Binding
The firstbind() converts small scalars to field elements using optimized arithmetic:
Key Optimizations
1. Lazy Conversion
Coefficients stay as small scalars until binding:2. Optimized Arithmetic
3. Special Case Detection
4. Parallel Binding
Evaluation Optimizations
Inside-Out Evaluation
Faster polynomial evaluation based on Fast Polynomial Evaluation:Split EQ Evaluation
Optimized evaluation when combined with EQ polynomial:Usage in Jolt
MultilinearPolynomial Dispatch
Location:jolt-core/src/poly/multilinear_polynomial.rs
Witness Polynomials
Many witness polynomials use compact representation:Performance Impact
Sumcheck Inner Loop
The sumcheck inner loop dominates prover time:- Memory bandwidth: 2-32x reduction in bytes read
- Cache efficiency: More coefficients fit in L1/L2 cache
- Arithmetic: Faster small-scalar multiplication
Concrete Measurements
For a program with 2^20 cycles:| Polynomial | Type | Memory (Dense) | Memory (Compact) | Savings |
|---|---|---|---|---|
| Instruction flags | bool | 32 MB | 1 MB | 32x |
| Memory addresses | u64 | 32 MB | 8 MB | 4x |
| Operand values | u64 | 32 MB | 8 MB | 4x |
| Total (10 polys) | — | 320 MB | 60 MB | 5.3x |
Best Practices
When to Use CompactPolynomial
✅ Use CompactPolynomial when:- Coefficients are small integers (< 2^64)
- Polynomial is large (> 2^16 coefficients)
- Memory is constrained
- Polynomial is bound multiple times in sumcheck
- Coefficients are arbitrary field elements
- Polynomial is very small (< 2^10 coefficients)
- Polynomial is evaluated but never bound
Choosing Scalar Type
Pick the smallest type that fits your values:Performance Checklist
- Use smallest scalar type that fits values
- Pre-allocate vectors for bound_coeffs using
unsafe_allocate_zero_vec - Avoid clones in hot paths (use references)
- Profile to verify memory savings translate to speedup
Related Optimizations
- EQ Optimizations: Fast EQ polynomial evaluation
- Batched Sumcheck: Protocol using polynomial binding
- SharedRaPolynomials: Share EQ tables across polynomials
References
- Implementation:
jolt-core/src/poly/compact_polynomial.rs - Fast Polynomial Evaluation — Inside-out evaluation technique