Arena-based heap allocation
Heap objects are stored in separate arenas by type using slotmap::DenseSlotMap (src/arenas.rs):
- Type-safe keys prevent use-after-free bugs
- Generation counters catch stale references
- Fast allocation and iteration
- Cache-friendly memory layout
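To make the role of generation counters concrete, here is a minimal, std-only sketch of a generational arena. It is not the slotmap API and not the code in src/arenas.rs; the names Key, Slot, and Arena are hypothetical and chosen for illustration only.

```rust
/// Hypothetical key: a slot index plus the generation it was issued for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Key {
    index: u32,
    generation: u32,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

/// Std-only stand-in for a generational arena (illustrative, not slotmap).
pub struct Arena<T> {
    slots: Vec<Slot<T>>,
    free: Vec<u32>, // indices of vacant slots available for reuse
}

impl<T> Arena<T> {
    pub fn new() -> Self {
        Arena { slots: Vec::new(), free: Vec::new() }
    }

    /// Amortized O(1): reuse a vacant slot if one exists, otherwise push.
    pub fn insert(&mut self, value: T) -> Key {
        if let Some(index) = self.free.pop() {
            let slot = &mut self.slots[index as usize];
            slot.value = Some(value);
            Key { index, generation: slot.generation }
        } else {
            let index = self.slots.len() as u32;
            self.slots.push(Slot { generation: 0, value: Some(value) });
            Key { index, generation: 0 }
        }
    }

    /// Returns None for a stale key instead of handing back a reused slot.
    pub fn get(&self, key: Key) -> Option<&T> {
        let slot = self.slots.get(key.index as usize)?;
        if slot.generation == key.generation {
            slot.value.as_ref()
        } else {
            None
        }
    }

    pub fn remove(&mut self, key: Key) -> Option<T> {
        let slot = self.slots.get_mut(key.index as usize)?;
        if slot.generation != key.generation {
            return None;
        }
        let value = slot.value.take()?;
        slot.generation += 1; // invalidate every outstanding key to this slot
        self.free.push(key.index);
        Some(value)
    }
}
```

After a remove, the old key no longer resolves even when the slot has been reused by a later insert, which is the property that catches stale references.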
Value representation
The Value enum stores primitives inline and heap objects as keys.
This keeps Value small (16 bytes) while supporting large heap objects.
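As an illustration of why such an enum stays at 16 bytes, the following sketch pairs inline primitives with 8-byte keys. The variant names and key types are assumptions made for the example, not the actual definition, and the size holds on a typical 64-bit target.

```rust
use std::mem::size_of;

/// Hypothetical 8-byte arena keys (index + generation).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct StringKey {
    index: u32,
    generation: u32,
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct ListKey {
    index: u32,
    generation: u32,
}

/// Illustrative value type: primitives live inline, heap objects are keys.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Value {
    Nil,
    Bool(bool),
    Int(i64),
    Float(f64),
    Str(StringKey),
    List(ListKey),
}

/// Size of the sketch enum: an 8-byte payload plus a tag, padded to 8-byte
/// alignment.
pub fn value_size() -> usize {
    size_of::<Value>()
}
```

Because every variant fits in 8 bytes, a large string or list costs the same to copy around as an integer; only the arena entry grows.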
Mark-and-sweep algorithm
Garbage collection happens in two phases: a mark phase followed by a sweep phase.
Mark phase
Starting from roots, the GC recursively marks all reachable objects. Roots include:
- Values on the operand stack
- All local variables (shared locals vector)
- All global variables
- Constants in all call frames’ instruction sets
The GC uses the GcState struct (src/gc.rs:86) to track marked objects.
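The marking step can be sketched as a reachability traversal. This example is an assumption-laden illustration rather than the contents of src/gc.rs: it models objects as plain indices, uses a hypothetical Object type, and replaces recursion with an explicit worklist so deep object graphs cannot overflow the call stack.

```rust
use std::collections::HashSet;

/// Hypothetical heap object: only its outgoing references matter for marking.
pub struct Object {
    pub refs: Vec<usize>,
}

/// Returns the set of object ids reachable from the roots.
pub fn mark(heap: &[Object], roots: &[usize]) -> HashSet<usize> {
    let mut marked = HashSet::new();
    let mut worklist: Vec<usize> = roots.to_vec();

    while let Some(id) = worklist.pop() {
        // Skip dangling ids and anything already visited (handles cycles).
        if id >= heap.len() || !marked.insert(id) {
            continue;
        }
        worklist.extend(heap[id].refs.iter().copied());
    }
    marked
}
```

Each object is visited at most once, so the work is proportional to the live set plus its outgoing references, regardless of how much garbage the heap holds.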
Sweep phase
After marking, the GC iterates through each arena and removes unmarked objects. The effect is visible through __heap_stats__().
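A sweep over a single arena might look like the sketch below. It substitutes a std HashMap for the real arena type, and HeapString, SweepResult, and the byte estimate (string length only) are hypothetical simplifications.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical heap object stored in one arena.
pub struct HeapString {
    pub data: String,
}

/// What one sweep pass reclaimed.
pub struct SweepResult {
    pub objects_freed: usize,
    pub bytes_freed: usize,
}

/// Removes every entry whose key is not in the marked set.
pub fn sweep(arena: &mut HashMap<u32, HeapString>, marked: &HashSet<u32>) -> SweepResult {
    let mut result = SweepResult { objects_freed: 0, bytes_freed: 0 };
    arena.retain(|key, obj| {
        let live = marked.contains(key);
        if !live {
            result.objects_freed += 1;
            result.bytes_freed += obj.data.len(); // rough size estimate
        }
        live
    });
    result
}
```

The pass touches every entry in the arena, live or dead, which is why sweep cost scales with total heap size rather than with the live set.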
Memory tracking
The GC tracks allocation count and estimated bytes to determine when to collect (src/gc.rs:114):
- Allocation count: 1024 allocations
- Memory threshold: 8 MB
Related source: src/gc.rs:56.
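The trigger decision can be sketched as a pair of counters compared against the two thresholds listed above. Several details here are assumptions: the GcCounters type and its method names are hypothetical, 8 MB is taken as 8 * 1024 * 1024 bytes, and a collection is assumed to fire when either threshold is reached.

```rust
pub const ALLOCATION_THRESHOLD: usize = 1024;
pub const MEMORY_THRESHOLD: usize = 8 * 1024 * 1024; // assumes MB = 2^20 bytes

/// Hypothetical per-cycle counters, reset after each collection.
pub struct GcCounters {
    pub allocation_count: usize,
    pub bytes_allocated: usize,
    pub allocation_threshold: usize,
    pub memory_threshold: usize,
}

impl GcCounters {
    pub fn new() -> Self {
        GcCounters {
            allocation_count: 0,
            bytes_allocated: 0,
            allocation_threshold: ALLOCATION_THRESHOLD,
            memory_threshold: MEMORY_THRESHOLD,
        }
    }

    /// Called on every heap allocation with an estimated object size.
    pub fn record_allocation(&mut self, estimated_bytes: usize) {
        self.allocation_count += 1;
        self.bytes_allocated += estimated_bytes;
    }

    /// Assumption: reaching either threshold is enough to request a cycle.
    pub fn should_collect(&self) -> bool {
        self.allocation_count >= self.allocation_threshold
            || self.bytes_allocated >= self.memory_threshold
    }
}
```

Tracking both quantities means many tiny objects and a few very large ones can each trigger a collection on their own.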
Configurable thresholds
Users can adjust GC behavior at runtime. The __gc_threshold__() function sets both allocation and memory thresholds proportionally (src/native_registry.rs).
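One plausible reading of "proportionally" is that the memory threshold keeps the same bytes-per-allocation ratio as the two thresholds listed earlier (8 MB / 1024 allocations). The sketch below encodes only that reading; the Thresholds type, the scaled_thresholds function, and the ratio itself are assumptions, not the behavior confirmed in src/native_registry.rs.

```rust
const BASE_ALLOCATIONS: usize = 1024;
const BASE_MEMORY_BYTES: usize = 8 * 1024 * 1024; // assumes MB = 2^20 bytes

#[derive(Debug, PartialEq, Eq)]
pub struct Thresholds {
    pub allocations: usize,
    pub memory_bytes: usize,
}

/// Hypothetical helper: derive both thresholds from one allocation count,
/// preserving the base ratio of 8192 bytes per allocation.
pub fn scaled_thresholds(allocations: usize) -> Thresholds {
    let bytes_per_allocation = BASE_MEMORY_BYTES / BASE_ALLOCATIONS;
    Thresholds {
        allocations,
        // saturating_mul avoids overflow for very large inputs
        memory_bytes: allocations.saturating_mul(bytes_per_allocation),
    }
}
```

Under this assumption, quadrupling the allocation threshold to 4096 would raise the memory threshold to 32 MB.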
String interning
Strings use the strena crate for automatic deduplication.
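The idea behind interning can be shown with a small std-only interner. This is a stand-in written for illustration and does not reproduce the strena API; Symbol and Interner are hypothetical names.

```rust
use std::collections::HashMap;

/// Hypothetical handle to an interned string.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

/// Std-only stand-in for a string interner (illustrative, not strena).
#[derive(Default)]
pub struct Interner {
    lookup: HashMap<String, Symbol>,
    strings: Vec<String>,
}

impl Interner {
    /// Returns the existing symbol for text, or stores it and makes a new one.
    pub fn intern(&mut self, text: &str) -> Symbol {
        if let Some(&symbol) = self.lookup.get(text) {
            return symbol;
        }
        let symbol = Symbol(self.strings.len() as u32);
        self.strings.push(text.to_string());
        self.lookup.insert(text.to_string(), symbol);
        symbol
    }

    pub fn resolve(&self, symbol: Symbol) -> Option<&str> {
        self.strings.get(symbol.0 as usize).map(|s| s.as_str())
    }

    /// Number of distinct strings stored.
    pub fn len(&self) -> usize {
        self.strings.len()
    }
}
```

Interning the same text twice yields the same symbol, so equality checks become integer comparisons and duplicate strings are stored once.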
GC statistics
The GC tracks lifetime statistics:
- total_bytes_freed - cumulative bytes freed across all collections
- total_collections - number of GC cycles run
- allocation_count - allocations since last collection
- bytes_allocated - estimated bytes since last collection
These values are exposed through __heap_stats__().
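The four counters could be kept in a struct along these lines. The field names come from the list above; the GcStats type name, the method names, and the choice to reset the per-cycle counters to zero after a collection are assumptions.

```rust
/// Sketch of the statistics record; field names follow the list above.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct GcStats {
    pub total_bytes_freed: usize,
    pub total_collections: usize,
    pub allocation_count: usize,
    pub bytes_allocated: usize,
}

impl GcStats {
    /// Hypothetical hook called for each heap allocation.
    pub fn record_allocation(&mut self, estimated_bytes: usize) {
        self.allocation_count += 1;
        self.bytes_allocated += estimated_bytes;
    }

    /// Hypothetical hook called at the end of a GC cycle.
    pub fn record_collection(&mut self, bytes_freed: usize) {
        // Lifetime counters only ever grow.
        self.total_bytes_freed += bytes_freed;
        self.total_collections += 1;
        // Per-cycle counters restart (assumed: "since last collection").
        self.allocation_count = 0;
        self.bytes_allocated = 0;
    }
}
```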
Performance characteristics
- Mark phase: O(live set size) - traces all reachable objects
- Sweep phase: O(total heap size) - iterates all arenas
- Allocation: O(1) - DenseSlotMap insert
- Trigger overhead: Checked every 256 instructions (via polling counter)
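The polling counter mentioned in the last bullet can be sketched as follows. The Vm type here is a hypothetical stand-in, and gc_checks simply counts how often the threshold test would run; instruction execution is omitted.

```rust
pub const GC_POLL_INTERVAL: u32 = 256;

/// Hypothetical interpreter loop state.
pub struct Vm {
    poll_counter: u32,
    pub gc_checks: u32,
}

impl Vm {
    pub fn new() -> Self {
        Vm { poll_counter: 0, gc_checks: 0 }
    }

    fn step(&mut self) {
        // ... decode and execute one instruction (omitted) ...
        self.poll_counter += 1;
        if self.poll_counter >= GC_POLL_INTERVAL {
            self.poll_counter = 0;
            // A real VM would compare its counters to the thresholds here.
            self.gc_checks += 1;
        }
    }

    pub fn run(&mut self, instructions: u32) {
        for _ in 0..instructions {
            self.step();
        }
    }
}
```

Most instructions pay only for an increment and a compare, so the threshold check itself is amortized across 256 instructions.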
Preventing collection pauses
For latency-sensitive code, increase the thresholds with __gc_threshold__().
Future optimizations
Potential improvements:
- Generational GC: Young generation for short-lived objects
- Incremental marking: Spread mark phase across multiple VM cycles
- Parallel sweep: Use multiple threads for sweeping
- Compaction: Reduce heap fragmentation
- Write barriers: Track inter-generational references
Source references
- GC implementation: src/gc.rs
- Arena definitions: src/arenas.rs
- GC integration in VM: src/vm/mod.rs:279
- Built-in functions: src/native_registry.rs