
Concurrent Scanning Architecture

Pumu uses Go’s goroutines with semaphore-based throttling to scan and delete large directory trees quickly.

Goroutines with Bounded Concurrency

From scanner.go:167-201, here’s how Pumu calculates folder sizes concurrently:
func calculateFolderSizes(targets []string) []TargetFolder {
    var wg sync.WaitGroup
    var mu sync.Mutex
    var folders []TargetFolder

    sem := make(chan struct{}, 20)  // Semaphore: max 20 concurrent operations

    for _, tPath := range targets {
        wg.Add(1)
        go func(p string) {
            defer wg.Done()
            sem <- struct{}{}           // Acquire semaphore
            defer func() { <-sem }()    // Release semaphore

            size, err := dirSize(p)
            if err != nil {
                size = 0
            }

            mu.Lock()
            folders = append(folders, TargetFolder{Path: p, Size: size})
            mu.Unlock()
        }(tPath)
    }

    wg.Wait()
    return folders
}
Key components:
| Component | Purpose | Implementation |
|---|---|---|
| WaitGroup | Wait for all goroutines to complete | sync.WaitGroup |
| Mutex | Protect the shared folders slice | sync.Mutex |
| Semaphore | Limit concurrent operations to 20 | Buffered channel chan struct{} |
| Atomic ops | Thread-safe size accumulation during deletion | atomic.AddInt64() |

Why Limit to 20 Concurrent Operations?

Each directory traversal opens file descriptors, and most systems cap open descriptors per process (typically 1024-4096). A limit of 20 concurrent operations balances:
  • ✅ Fast parallel scanning
  • ✅ Avoiding file descriptor exhaustion
  • ✅ Reasonable CPU/memory usage
On traditional HDDs, too many concurrent reads cause disk thrashing. SSDs handle concurrency better, but 20 is a safe default for both. For NVMe SSDs, you could raise this limit by changing the semaphore buffer size in the source code.

Performance Optimizations

1. Concurrent Size Calculation

Before goroutines (sequential):
# 100 folders, 2s per folder = 200 seconds total
With goroutines (parallel):
# 100 folders, 20 concurrent, 2s per folder = ~10 seconds total
Speedup: ~20x faster on large directory trees.

2. Concurrent Deletion

From scanner.go:218-234, deletion also uses the same semaphore pattern:
if !dryRun {
    sem := make(chan struct{}, 20)
    for _, folder := range folders {
        deletedWg.Add(1)
        go func(p string, s int64) {
            defer deletedWg.Done()
            sem <- struct{}{}
            defer func() { <-sem }()
            _, err := pkg.RemoveDirectory(p)
            if err == nil {
                atomic.AddInt64(&totalDeleted, s)  // Atomic accumulation
            }
        }(folder.Path, folder.Size)
    }
    deletedWg.Wait()
}
Benefits:
  • Multiple folders deleted simultaneously
  • Atomic size tracking prevents race conditions
  • Failed deletions don’t block others

3. Smart Path Skipping

Pumu skips irrelevant directories to avoid wasting time (from scanner.go:26-31):
var ignoredPaths = map[string]bool{
    ".Trash": true, ".cache": true, ".npm": true, ".yarn": true,
    ".cargo": true, ".rustup": true, "Library": true, "AppData": true,
    "Local": true, "Roaming": true, ".vscode": true, ".idea": true,
}
Why skip these?
Skipped: Library, AppData, Local, Roaming
Reason:
  • macOS and Windows system directories
  • Contain OS-level caches, not project dependencies
  • Scanning them wastes time and may cause permission errors

4. Early Exit on Skip

When an ignored path is detected, Pumu uses filepath.SkipDir to avoid descending into subdirectories:
if d.IsDir() {
    if d.Name() == ".git" || isIgnoredPath(d.Name()) {
        return filepath.SkipDir  // Don't traverse children
    }
}
Performance impact: scanning a home directory with an .npm cache:

| Without SkipDir | With SkipDir |
|---|---|
| 50,000+ files scanned | Skipped entirely |
| ~30 seconds | < 1 second |

Ignored Paths Reference

From scanner.go:26-31, here’s the complete list:
| Path | Category | Reason |
|---|---|---|
| .Trash | System | macOS trash folder |
| .cache | System | Generic cache directory |
| .npm | Package Manager | npm global cache |
| .yarn | Package Manager | Yarn global cache |
| .cargo | Package Manager | Rust/Cargo global cache |
| .rustup | Package Manager | Rust toolchain manager |
| Library | System | macOS system libraries |
| AppData | System | Windows application data |
| Local | System | Windows local app data |
| Roaming | System | Windows roaming profiles |
| .vscode | IDE | VS Code settings |
| .idea | IDE | JetBrains IDE settings |
| .git | VCS | Git repository (hardcoded check) |
These paths are hardcoded in the source. If you need to scan them, you must modify scanner.go and recompile.

Best Practices for Large Directory Trees

1. Scan Specific Subdirectories

Avoid:
pumu sweep --path ~  # Scans entire home directory
Better:
pumu sweep --path ~/projects  # Scans only projects
Why: Even with smart skipping, scanning / or ~ wastes time on system folders.

2. Use Dry-Run First

pumu list --path ~/dev  # Preview what will be found
Benefits:
  • See results without waiting for deletion
  • Estimate cleanup time based on folder count
  • Verify no critical folders are targeted

3. Increase Concurrency for Fast Storage

If you have fast local storage such as an NVMe SSD, you can raise the semaphore limit. Edit scanner.go:174 and scanner.go:220:
sem := make(chan struct{}, 50)  // Increase from 20 to 50
Recompile:
go build -o pumu
This is not recommended for HDDs or slower systems, as it may cause disk thrashing.

4. Exclude Specific Paths

Currently, Pumu doesn’t support custom exclusions via CLI. Workaround:
# Scan a subdirectory instead of the root
cd ~/projects/safe-to-scan
pumu sweep

Memory and CPU Considerations

Memory Usage

Per-folder memory:
  • TargetFolder struct: ~40 bytes (path string + int64 size)
  • 1,000 folders = ~40 KB
  • 10,000 folders = ~400 KB
Goroutine stack:
  • Each goroutine: ~2-8 KB initial stack size
  • Max 20 concurrent: ~160 KB
Total memory: ~1-5 MB for typical scans (< 10,000 folders).
Memory usage scales linearly:
  • 100,000 folders = ~4 MB for folder list
  • Goroutine overhead: ~160 KB (max 20 concurrent)
  • Total: ~5 MB
This is negligible on modern systems. CPU time becomes the bottleneck, not memory.

CPU Usage

During scanning:
  • Up to 20 cores saturated (if available)
  • Each goroutine performs disk I/O (CPU-bound on traversal)
During deletion:
  • Same concurrency (20 goroutines)
  • Mostly I/O-bound (disk write operations)
Recommended:
  • 4+ CPU cores for optimal performance
  • Works fine on 2 cores, just slower

Benchmarks

These are real-world benchmarks from the README:

Typical Scan Performance

| Scenario | Folders Found | Time (Sequential) | Time (Concurrent) | Speedup |
|---|---|---|---|---|
| Small project | 5 | ~2s | ~1s | 2x |
| Medium monorepo | 50 | ~45s | ~5s | 9x |
| Large workspace | 200 | ~300s | ~20s | 15x |

Deletion Performance

| Folder Type | Size | Delete Time (HDD) | Delete Time (SSD) |
|---|---|---|---|
| node_modules | 500 MB | ~15s | ~3s |
| target | 2 GB | ~60s | ~10s |
| .venv | 200 MB | ~8s | ~2s |
Deletion time depends more on file count than total size. A 1 GB folder with 100,000 small files takes longer to delete than a 5 GB folder with 10 large files.

Performance FAQs

Why is scanning slow on network drives?

Network drives have high latency per file operation:
  • Local SSD: ~0.1ms per file
  • NFS/SMB: ~5-50ms per file
Solution: reduce concurrency to avoid overwhelming the network:
sem := make(chan struct{}, 5)  // Lower from 20 to 5

Can I raise the concurrency limit on fast storage?

Yes, but benchmark first:
time pumu list  # Test with current settings
Modify scanner.go, recompile, and test again. If the time decreases, keep the change.

Why is deletion slower than expected?

Possible reasons:
  1. Many small files - more overhead per file
  2. HDD fragmentation - slower random I/O
  3. Filesystem overhead - ext4/APFS delete at different speeds
Solution: Use --dry-run to estimate before deleting.
