Overview
Go features a concurrent, tri-color, mark-and-sweep garbage collector that runs alongside your program. It is designed to minimize pause times while efficiently reclaiming unused memory. Understanding how the GC works helps you write more efficient Go programs.

The Go GC is non-generational and non-compacting. It uses size-segregated allocation and runs concurrently with minimal stop-the-world pauses.
GC Algorithm
High-Level Overview
The GC is type-accurate (precise), concurrent (it runs alongside mutators), and supports parallel marking with multiple GC threads. The algorithm builds on Dijkstra's on-the-fly garbage collection.

Four Phases
1. Sweep Termination
Stop-the-world phase:
- All Ps reach a GC safe-point
- Sweep any unswept spans (if GC was forced early)
2. Mark Phase
Preparation (stop-the-world):
- Set gcphase to _GCmark
- Enable write barrier on all Ps
- Enable mutator assists
- Enqueue root mark jobs
- Mark workers (scheduled by runtime) scan objects
- Write barrier shades pointers during mutation
- Newly allocated objects are immediately marked black
- All stacks are scanned, shading found pointers
- Grey objects are drained, turning them black
- Uses distributed termination algorithm
- Detects when no root jobs or grey objects remain
- Transitions to mark termination
3. Mark Termination
Stop-the-world phase:
- Set gcphase to _GCmarktermination
- Disable workers and assists
- Flush mcaches and perform housekeeping
4. Sweep Phase
Preparation:
- Set gcphase to _GCoff
- Set up sweep state
- Disable write barrier
- Spans swept lazily when needed
- Background goroutine sweeps proactively
- Sweeping happens before allocation to avoid requesting more OS memory
Concurrent Sweep Details
Sweeping happens in two ways:
- Background sweeper - Goroutine that sweeps spans one-by-one
- Lazy sweeping - When a goroutine needs a span, it sweeps to reclaim memory
Write Barrier
The write barrier is critical for maintaining GC correctness during concurrent marking.

Purpose
When the mutator (your code) writes a pointer while GC is marking:
- Both the overwritten pointer and new pointer are shaded
- Ensures all reachable objects are marked
- Prevents lost objects during concurrent marking
When Active
- Enabled during mark phase
- Disabled during sweep and off phases
- Checked at compile time with //go:nowritebarrier directives
The write barrier adds overhead but is necessary for correctness. The compiler optimizes it out when proven safe.
GC Pacing
Trigger Mechanism
GC starts when the heap reaches a target size. In simplified form, the target is calculated as:

Target heap = live heap × (1 + GOGC / 100)

GOGC=100 means GC triggers when the heap reaches 2× the live data.
Example:
- Live data: 4MB
- GOGC=100: GC triggers at 8MB
- GOGC=200: GC triggers at 12MB
- GOGC=50: GC triggers at 6MB
Pacer
The GC pacer manages when to start GC and how much assist work to require, balancing:
- CPU overhead of GC
- Memory overhead of unused objects
- Responsiveness (pause times)
GC Rate
Keeps GC cost proportional to allocation cost:
- More allocation → more frequent GC
- Less allocation → less frequent GC
- Linear relationship maintains predictable overhead
Mutator Assists
When allocation outpaces marking, the allocating goroutine must help with mark work.

Controlling Assists
- Assist amount proportional to allocation
- GC credit system tracks work done
- Background workers reduce need for assists
Tuning the GC
GOGC Environment Variable
Controls GC frequency:
- Low GOGC: Less memory, more GC CPU overhead, shorter pauses
- High GOGC: More memory, less GC CPU overhead, potentially longer pauses
GOMEMLIMIT
Sets a soft memory limit. As the heap approaches the limit, the runtime responds by:
- Running GC more frequently
- Adjusting the target heap size
- Still respecting GOGC for the minimum frequency
Runtime Control
Memory Statistics
Key MemStats Fields
Monitoring GC
Optimization Strategies
1. Reduce Allocations
Fewer allocations mean less GC work.

2. Use Pointers Judiciously
Large values passed by value trigger copies.

3. Use sync.Pool
Reuse objects instead of reallocating. Note that sync.Pool objects may be cleared at any GC; don't rely on pooled objects persisting across GCs.

4. Preallocate Slices
5. Consider Value Types
Using value types instead of pointers can reduce GC scan time, because memory that contains no pointers does not need to be scanned.

GC Debugging
GODEBUG gctrace
See GC activity by running your program with GODEBUG=gctrace=1. Fields in a trace line:
- gc 1: GC number
- @0.004s: Time since program start
- 0%: Percentage of time in GC since start
- 4->4->2 MB: Heap size at start, end, and live data
- 5 MB goal: Target heap size
- 8 P: Number of processors
Trace GC Events
GC Metrics
Advanced Topics
Finalizers
Run code when an object is garbage collected.

Oblets
For large objects (>128KB), GC breaks scanning into “oblets”:
- Scan object in chunks
- Improves parallelism
- Reduces pause time spikes
- Each oblet is a separate work unit
Heap Profiling
Profile memory allocations with the heap profile and go tool pprof.

Common Issues
Excessive GC
Symptoms: High GCCPUFraction, frequent GC cycles
Solutions:
- Increase GOGC to reduce frequency
- Reduce allocation rate
- Use object pooling
- Preallocate slices/maps
Memory Leaks
Symptoms: Growing memory that GC doesn't reclaim

Causes:
- Global variables holding references
- Goroutine leaks (goroutines stuck with references)
- Unclosed resources with finalizers
- Large slice retaining backing array
GC Pauses
Symptoms: Application freezes/latency spikes

Solutions:
- Lower GOGC for more frequent, shorter GCs
- Reduce pointer-heavy structures
- Reduce heap size
- Use value types where possible
Best Practices
- Profile before optimizing - Use pprof and trace
- Reduce allocations - Reuse objects, preallocate
- Set appropriate GOGC - Balance memory vs CPU
- Use GOMEMLIMIT - In containerized environments
- Avoid finalizers - Use explicit cleanup
- Monitor GC metrics - Track GCCPUFraction in production
- Test with realistic load - GC behavior depends on allocation patterns
The default GC settings work well for most applications. Only tune if profiling shows GC is a bottleneck.
References
- Go GC Guide
- runtime API reference
- GC Handbook by Richard Jones