Garbage Collector Design
CPython uses a combination of reference counting and a cyclic garbage collector to manage memory automatically.Reference Counting
The primary memory management mechanism in CPython is reference counting. Every Python object has a reference count that tracks how many references point to it.Basic Concept
- When a reference to an object is created, its reference count increases
- When a reference is deleted, the count decreases
- When the count reaches zero, the object is immediately deallocated
Example
sys.getrefcount() always returns 1 more than the actual count because the function itself holds a temporary reference.The Cycle Problem
Reference counting cannot handle reference cycles:Two GC Implementations
Starting in Python 3.13, CPython has two GC implementations:- Default build - Uses global interpreter lock (GIL) for thread safety
- Free-threaded build - Pauses threads during collection
Memory Layout
Default Build
Objects tracked by GC have extra header fields:PyGC_Head fields form doubly-linked lists for GC generations.
Free-Threaded Build
Uses anob_gc_bits byte in every object:
ob_gc_bits field tracks GC state without linked lists.
Cycle Detection Algorithm
The GC identifies unreachable cycles using a mark-and-sweep approach:Phase 1: Initial Marking
- Copy each object’s refcount to a temporary
gc_reffield - For each object, decrement
gc_refof every object it references - Objects with
gc_ref > 0are directly reachable from outside
Phase 2: Propagation
- Objects with
gc_ref == 0are marked “tentatively unreachable” - For each object with
gc_ref > 0, traverse all references - Move referenced objects back to “reachable” list
- Mark visited objects to avoid processing twice
Phase 3: Result
Objects still in the “tentatively unreachable” list are truly unreachable and can be collected.Key Insight: This algorithm requires no recursion and uses O(1) extra storage beyond the objects themselves.
Destroying Unreachable Objects
Once unreachable objects are identified, destruction follows these steps:- Handle weak references with callbacks (only for reachable weakrefs)
- Move legacy finalizers (
tp_del) togc.garbagelist - Call finalizers (
tp_finalize) and mark objects as finalized - Handle resurrection - If finalizers revive objects, re-run cycle detection
- Clear weak references - Set
wr_objecttoNone - Call
tp_clear- Break internal references, triggering deallocation
Incremental Collection (Default Build)
The default build uses generational collection with two generations:- Young generation - Newly created objects
- Old generation - Objects that survived previous collections
Why Generations?
Most objects die young (weak generational hypothesis). By collecting young objects frequently and old objects incrementally, pause times stay low.Old Generation Structure
The old generation has two lists:- Pending - Objects not yet scanned this cycle
- Visited - Objects already scanned
Increments
Each collection scans:- The entire young generation
- Least recently scanned portion of old generation
- Transitive closure of references from these objects
Correctness
To ensure complete cycle detection, the algorithm computes the transitive closure of reachable objects from the initial increment. This guarantees no partial cycles are included.The free-threaded build does not use generational collection. Every collection scans the entire heap.
Excluding Reachable Objects
Both implementations optimize by identifying definitely reachable objects before cycle detection.Default Build
Identifies reachable objects from:- Builtin classes
sysandbuiltinsmodules- Stack frames
Free-Threaded Build
The “mark alive” phase sets_PyGC_BITS_ALIVE flag on:
interp->sysdict,interp->builtins,interp->types- All objects reachable from active frames
- Transitive closure of above
Software Prefetch (Free-Threaded Only)
For large object sets, the free-threaded GC uses software prefetch instructions during the mark alive phase:- Prefetch buffer (FIFO) queues objects before access
- Issuing prefetch and accessing object are separated by buffer size
- Parameters: max size 256, low threshold 8, high threshold 16
- Enabled only when long-lived object count > 200,000
Optimization: Reusing Fields
The default build reusesPyGC_Head fields as “fat pointers” (tagged pointers):
_gc_prev Field
- Normally: previous pointer in doubly-linked list
- Bits 0-1: Store
PREV_MASK_COLLECTINGand_PyGC_PREV_MASK_FINALIZEDflags - During collection: Temporarily stores
gc_refcopy
_gc_next Field
- Normally: next pointer in doubly-linked list
- Bit 0:
NEXT_MASK_UNREACHABLEflag during collection
Warning: Fat pointers must be “untagged” (bits masked off) before dereferencing.
Optimization: Delayed Untracking
Containers that cannot participate in cycles are untracked to reduce GC overhead.Tuples
Tuples containing only immutable objects (recursively) don’t need tracking:Implementation Differences
Default Build
- Uses
PyGC_Headdoubly-linked lists - Generational collection (young + old)
- Relies on GIL for thread safety
- Flags stored in
_gc_prevfield
Free-Threaded Build
- Uses mimalloc allocator to find tracked objects
- Non-generational (scans entire heap)
- Stop-the-world pauses for thread safety
- Flags stored in
ob_gc_bitsfield - Unreachable list uses repurposed
ob_tidfield
Configuration
Thresholds
threshold0- Allocations minus deallocations before collectionthreshold1- Controls old generation fraction (default build only)threshold2- Ignored
Manual Control
C API
Key functions in the GC C API:PyObject_GC_New()- Allocate GC-tracked objectPyObject_GC_Track()- Start tracking objectPyObject_GC_UnTrack()- Stop tracking objectPyObject_GC_Del()- Deallocate GC object
- Include
Py_TPFLAGS_HAVE_GCintp_flags - Provide
tp_traverseimplementation - Provide
tp_clearimplementation (usually)
Related Topics
- Object Layout - 3.11+ object structure
- Code Objects - Compiled code representation
- Frames - Frame object lifecycle
