
Overview

Go features a concurrent, tri-color, mark-and-sweep garbage collector that runs alongside your program. It’s designed to minimize pause times while efficiently reclaiming unused memory. Understanding how the GC works helps you write more efficient Go programs.
The Go GC is non-generational and non-compacting. It uses size-segregated allocation and runs concurrently with minimal stop-the-world pauses.

GC Algorithm

High-Level Overview

The GC is type-accurate (precise), concurrent (runs alongside mutators), and supports parallel marking with multiple GC threads. The algorithm builds on Dijkstra’s on-the-fly garbage collection.

Four Phases

1. Sweep Termination

Stop-the-world phase:
  • All Ps reach a GC safe-point
  • Sweep any unswept spans (if GC was forced early)

2. Mark Phase

Preparation (stop-the-world):
  • Set gcphase to _GCmark
  • Enable write barrier on all Ps
  • Enable mutator assists
  • Enqueue root mark jobs
Concurrent marking (world running):
  • Mark workers (scheduled by runtime) scan objects
  • Write barrier shades pointers during mutation
  • Newly allocated objects are immediately marked black
  • All stacks are scanned, shading found pointers
  • Grey objects are drained, turning them black
Termination detection:
  • Uses distributed termination algorithm
  • Detects when no root jobs or grey objects remain
  • Transitions to mark termination

3. Mark Termination

Stop-the-world phase:
  • Set gcphase to _GCmarktermination
  • Disable workers and assists
  • Flush mcaches and perform housekeeping

4. Sweep Phase

Preparation:
  • Set gcphase to _GCoff
  • Set up sweep state
  • Disable write barrier
Concurrent sweeping (world running):
  • Spans swept lazily when needed
  • Background goroutine sweeps proactively
  • Sweeping happens before allocation to avoid requesting more OS memory

Concurrent Sweep Details

Sweeping happens in two ways:
  1. Background sweeper - A goroutine that sweeps spans one-by-one
  2. Lazy sweeping - When a goroutine needs a span, it sweeps one first to reclaim memory
The sequence: at STW mark termination, every span is marked "needs sweeping". The background sweeper and lazy sweeping during allocation then reclaim spans concurrently. Finalizers run only after all spans have been swept.

Write Barrier

The write barrier is critical for maintaining GC correctness during concurrent marking.

Purpose

When the mutator (your code) writes a pointer while GC is marking:
  • Both the overwritten pointer and new pointer are shaded
  • Ensures all reachable objects are marked
  • Prevents lost objects during concurrent marking

When Active

  • Enabled during mark phase
  • Disabled during sweep and off phases
  • Enforced at compile time in runtime code via //go:nowritebarrier directives
// Compiler-generated pseudocode
func pointerWrite(dst *unsafe.Pointer, src unsafe.Pointer) {
    if gcphase == _GCmark {
        shade(*dst)  // Old value
        shade(src)   // New value
    }
    *dst = src
}
The write barrier adds overhead but is necessary for correctness. The compiler optimizes it out when proven safe.

GC Pacing

Trigger Mechanism

GC starts when the heap reaches a target size. The target is calculated as:
heapGoal = live_data * (1 + GOGC/100)
Default GOGC=100 means GC triggers when heap reaches 2× live data. Example:
  • Live data: 4MB
  • GOGC=100: GC triggers at 8MB
  • GOGC=200: GC triggers at 12MB
  • GOGC=50: GC triggers at 6MB
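The arithmetic above can be sketched directly. This is a simplified model; the real pacer also accounts for stack and global scan work and for GOMEMLIMIT:

```go
package main

import "fmt"

// heapGoal mirrors the pacer's target formula:
// goal = live * (1 + GOGC/100). Simplified sketch only.
func heapGoal(liveBytes, gogc int) int {
	return liveBytes + liveBytes*gogc/100
}

func main() {
	live := 4 << 20                        // 4 MiB of live data
	fmt.Println(heapGoal(live, 100) >> 20) // 8 (MiB)
	fmt.Println(heapGoal(live, 200) >> 20) // 12
	fmt.Println(heapGoal(live, 50) >> 20)  // 6
}
```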

Pacer

The GC pacer manages when to start GC and how much assist work to require:
# Environment variables
GOGC=100          # Target percentage (default)
GOMEMLIMIT=4GiB   # Soft memory limit
The pacer balances:
  • CPU overhead of GC
  • Memory overhead of unused objects
  • Responsiveness (pause times)

GC Rate

Keeps GC cost proportional to allocation cost:
  • More allocation → more frequent GC
  • Less allocation → less frequent GC
  • Linear relationship maintains predictable overhead

Mutator Assists

When allocation outpaces marking, the allocating goroutine must help:
  1. The allocating goroutine checks whether allocation is ahead of marking
  2. If so, it performs marking work (an assist) before proceeding
  3. It then continues with the allocation
This ensures GC completes before the program runs out of memory.

Controlling Assists

  • Assist amount proportional to allocation
  • GC credit system tracks work done
  • Background workers reduce need for assists
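The credit system can be illustrated with a toy model. This is not the runtime's actual accounting, only a sketch of the proportionality: allocation accrues scan debt, mark work accrues credit, and an allocating goroutine owes assist work when debt exceeds credit:

```go
package main

import "fmt"

// assistState is a toy model of GC assist credit (not the runtime's
// real bookkeeping): allocated bytes add scan "debt"; completed mark
// work adds "credit".
type assistState struct {
	debt   int64 // bytes allocated but not yet scanned
	credit int64 // scan work already performed
}

// allocate records an allocation and returns how many bytes of assist
// work the caller owes (zero if background marking is keeping up).
func (a *assistState) allocate(bytes int64) int64 {
	a.debt += bytes
	if owed := a.debt - a.credit; owed > 0 {
		return owed
	}
	return 0
}

// scan records completed mark work (background worker or assist).
func (a *assistState) scan(bytes int64) { a.credit += bytes }

func main() {
	var a assistState
	a.scan(1024)                  // background worker marked 1 KiB
	fmt.Println(a.allocate(512))  // 0: covered by existing credit
	fmt.Println(a.allocate(1024)) // 512: caller must assist
}
```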

Tuning the GC

GOGC Environment Variable

Controls GC frequency:
# More frequent GC (lower memory, more CPU)
GOGC=50 ./myprogram

# Less frequent GC (more memory, less CPU)
GOGC=200 ./myprogram

# Disable GC (not recommended for production)
GOGC=off ./myprogram
Trade-offs:
  • Low GOGC: Less memory, more GC CPU overhead, shorter pauses
  • High GOGC: More memory, less GC CPU overhead, potentially longer pauses

GOMEMLIMIT

Set soft memory limit:
# Limit to 4GiB
GOMEMLIMIT=4GiB ./myprogram

# Limit to 512MiB
GOMEMLIMIT=512MiB ./myprogram
The GC will try to stay under this limit by:
  • Running GC more frequently
  • Adjusting target heap size
  • Still respecting GOGC for minimum frequency
GOMEMLIMIT is a soft limit. The program may temporarily exceed it. It doesn’t prevent OOM if the live data is larger than the limit.

Runtime Control

import (
    "fmt"
    "runtime"
    "runtime/debug"
)

// Force GC to run
runtime.GC()

// Set GOGC percentage programmatically (lives in runtime/debug)
oldGOGC := debug.SetGCPercent(200)

// Read memory stats
var m runtime.MemStats
runtime.ReadMemStats(&m)
fmt.Printf("Alloc = %v MiB\n", m.Alloc / 1024 / 1024)
fmt.Printf("TotalAlloc = %v MiB\n", m.TotalAlloc / 1024 / 1024)
fmt.Printf("Sys = %v MiB\n", m.Sys / 1024 / 1024)
fmt.Printf("NumGC = %v\n", m.NumGC)

Memory Statistics

Key MemStats Fields

type MemStats struct {
    // General
    Alloc      uint64  // Bytes allocated and in use
    TotalAlloc uint64  // Cumulative bytes allocated
    Sys        uint64  // Bytes obtained from OS
    Lookups    uint64  // Number of pointer lookups
    Mallocs    uint64  // Cumulative malloc count
    Frees      uint64  // Cumulative free count
    
    // Heap
    HeapAlloc    uint64  // Bytes allocated in heap
    HeapSys      uint64  // Bytes obtained for heap
    HeapIdle     uint64  // Bytes in idle spans
    HeapInuse    uint64  // Bytes in in-use spans
    HeapReleased uint64  // Bytes released to OS
    HeapObjects  uint64  // Number of allocated (live) heap objects
    
    // GC
    NumGC       uint32    // Number of completed GCs
    PauseNs     [256]uint64  // Recent GC pause durations
    PauseEnd    [256]uint64  // Recent GC pause end times
    LastGC      uint64    // Time of last GC (UnixNano)
    GCCPUFraction float64 // Fraction of CPU time in GC
}

Monitoring GC

func printGCStats() {
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    
    fmt.Printf("GC runs: %d\n", m.NumGC)
    fmt.Printf("GC CPU%%: %.2f\n", m.GCCPUFraction * 100)
    
    if m.NumGC > 0 {
        // Last pause in milliseconds
        lastPause := m.PauseNs[(m.NumGC+255)%256]
        fmt.Printf("Last pause: %.2f ms\n", float64(lastPause)/1e6)
    }
}

Optimization Strategies

1. Reduce Allocations

Fewer allocations mean less GC work:
// Bad: Allocates on each call
func process() {
    buf := make([]byte, 1024)
    // Use buf...
}

// Good: Reuse buffer
var bufPool = sync.Pool{
    New: func() any { return make([]byte, 1024) },
}

func process() {
    buf := bufPool.Get().([]byte)
    defer bufPool.Put(buf)
    // Use buf...
}

2. Use Pointers Judiciously

Large values passed by value trigger copies:
// Bad: Copies entire struct
func process(data BigStruct) { ... }

// Good: Passes pointer
func process(data *BigStruct) { ... }
But avoid pointer-heavy structures that increase scan work:
// Many pointers = more GC scan work
type Heavy struct {
    p1, p2, p3, p4 *int
    p5, p6, p7, p8 *string
}

// Fewer pointers = less GC scan work
type Light struct {
    values [8]int
    name   string
}

3. Use sync.Pool

Reuse objects instead of reallocating:
var requestPool = sync.Pool{
    New: func() any {
        return &Request{}
    },
}

func handleRequest() {
    req := requestPool.Get().(*Request)
    defer requestPool.Put(req)
    
    // Reset and use req...
}
sync.Pool objects may be cleared at any GC. Don’t rely on pooled objects persisting across GCs.

4. Preallocate Slices

// Bad: Multiple allocations as slice grows
items := []Item{}
for i := 0; i < 1000; i++ {
    items = append(items, Item{})
}

// Good: Single allocation
items := make([]Item, 0, 1000)
for i := 0; i < 1000; i++ {
    items = append(items, Item{})
}

5. Consider Value Types

Using value types instead of pointers can reduce GC scan time:
// Pointer-based: GC must scan every element
type NodePtr struct {
    left, right *NodePtr
    value       int
}

// Value-based with indices: GC scans array, not individual nodes
type NodeVal struct {
    left, right int  // Indices into nodes slice
    value       int
}
var nodes []NodeVal

GC Debugging

GODEBUG gctrace

See GC activity:
GODEBUG=gctrace=1 ./myprogram
Output:
gc 1 @0.004s 0%: 0.018+0.46+0.003 ms clock, 0.14+0.25/0.38/0.11+0.027 ms cpu, 4->4->2 MB, 5 MB goal, 8 P
Fields:
  • gc 1: GC number
  • @0.004s: Time since program start
  • 0%: Percentage of time in GC since start
  • 4->4->2 MB: Heap size at start, end, and live data
  • 5 MB goal: Target heap size
  • 8 P: Number of processors

Trace GC Events

import (
    "os"
    "runtime/trace"
)

func main() {
    f, _ := os.Create("trace.out")
    defer f.Close()
    
    trace.Start(f)
    defer trace.Stop()
    
    // Your code...
}
View with:
go tool trace trace.out

GC Metrics

import "runtime/metrics"

func readGCMetrics() {
    samples := []metrics.Sample{
        {Name: "/gc/cycles/total:gc-cycles"},
        {Name: "/gc/heap/goal:bytes"},
        {Name: "/gc/heap/live:bytes"},
    }
    
    metrics.Read(samples)
    
    for _, sample := range samples {
        fmt.Printf("%s: %v\n", sample.Name, sample.Value)
    }
}

Advanced Topics

Finalizers

Run code when object is garbage collected:
type Resource struct {
    handle uintptr
}

func NewResource() *Resource {
    r := &Resource{handle: openHandle()}
    runtime.SetFinalizer(r, func(r *Resource) {
        closeHandle(r.handle)
    })
    return r
}
Finalizers are not guaranteed to run. They add overhead and can delay object reclamation. Use them sparingly, prefer explicit cleanup.

Oblets

For large objects (>128KB), GC breaks scanning into “oblets”:
  • Scan object in chunks
  • Improves parallelism
  • Reduces pause time spikes
  • Each oblet is a separate work unit

Heap Profiling

Profile memory allocations:
# Run with memory profiling
go test -memprofile mem.prof

# Analyze
go tool pprof mem.prof
Or programmatically:
f, _ := os.Create("mem.prof")
pprof.WriteHeapProfile(f)
f.Close()

Common Issues

Excessive GC

Symptoms: High GCCPUFraction, frequent GC cycles
Solutions:
  • Increase GOGC to reduce frequency
  • Reduce allocation rate
  • Use object pooling
  • Preallocate slices/maps

Memory Leaks

Symptoms: Growing memory, GC doesn’t help
Causes:
  • Global variables holding references
  • Goroutine leaks (goroutines stuck with references)
  • Unclosed resources with finalizers
  • Large slice retaining backing array
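The last cause — a small re-slice pinning a large backing array — can be fixed by copying into a right-sized slice:

```go
package main

import "fmt"

// header returns the first n bytes of a large buffer. Re-slicing
// alone (buf[:n]) keeps the entire backing array reachable; copying
// into a right-sized slice lets the GC reclaim the rest.
func header(buf []byte, n int) []byte {
	out := make([]byte, n)
	copy(out, buf)
	return out
}

func main() {
	big := make([]byte, 10<<20) // e.g. 10 MiB read from a file

	leaky := big[:16]        // still pins all 10 MiB
	tight := header(big, 16) // pins only 16 bytes

	fmt.Println(cap(leaky)) // 10485760
	fmt.Println(cap(tight)) // 16
}
```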
Debug:
import (
    "net/http"
    _ "net/http/pprof"
)

// Visit http://localhost:6060/debug/pprof/heap
go http.ListenAndServe(":6060", nil)

GC Pauses

Symptoms: Application freezes/latency spikes
Solutions:
  • Lower GOGC for more frequent, shorter GCs
  • Reduce pointer-heavy structures
  • Reduce heap size
  • Use value types where possible

Best Practices

  1. Profile before optimizing - Use pprof and trace
  2. Reduce allocations - Reuse objects, preallocate
  3. Set appropriate GOGC - Balance memory vs CPU
  4. Use GOMEMLIMIT - In containerized environments
  5. Avoid finalizers - Use explicit cleanup
  6. Monitor GC metrics - Track GCCPUFraction in production
  7. Test with realistic load - GC behavior depends on allocation patterns
The default GC settings work well for most applications. Only tune if profiling shows GC is a bottleneck.
