Kosh implements incremental builds using a BoltDB cache and content-addressed storage. Only changed files are reprocessed, making rebuilds 10-100x faster than full builds.
## Architecture

The cache system consists of three layers: a BoltDB metadata database, an in-memory LRU cache, and a content-addressed file store.

```text
builder/cache/
├── cache.go        # BoltDB manager with LRU cache
├── cache_reads.go  # Generic read operations
├── cache_writes.go # Batch write operations
├── types.go        # Data structures (PostMeta, SearchRecord)
└── store.go        # Content-addressed filesystem
```
## Storage Design

```text
.kosh-cache/
├── meta.db           # BoltDB database
│   ├── posts         # PostMeta indexed by PostID
│   ├── paths         # Path → PostID mapping
│   ├── search        # Pre-computed BM25 data
│   ├── deps          # Template/include dependencies
│   └── ssr_artifacts # D2 diagrams, KaTeX math
└── store/            # Content-addressed files
    ├── 3a/f2...      # Large HTML (by BLAKE3 hash)
    └── d4/1c...      # SSR outputs (diagrams, math)
```
## Cache Invalidation

Kosh uses hash-based invalidation to detect changes.

### Body Hash Tracking (v1.2.1)

Previously, Kosh hashed only the frontmatter, which caused silent stale-cache hits when the body content changed. v1.2.1 introduced a separate body hash.
From builder/cache/types.go:16-25:

```go
type PostMeta struct {
	PostID         string
	ContentHash    string   // Frontmatter hash
	BodyHash       string   // Body content hash (CRITICAL)
	HTMLHash       string   // For large posts
	InlineHTML     []byte   // Posts < 32KB stored inline
	TemplateHash   string
	SSRInputHashes []string // D2/LaTeX input hashes
	// ...
}
```
Invalidation logic:

```go
// Both hashes must match for a cache hit
if cached.ContentHash == newContentHash && cached.BodyHash == newBodyHash {
	return cached // Cache hit
}
// Otherwise, re-render
```
Before v1.2.1, changing only the body content (without touching the frontmatter) would incorrectly reuse the cached HTML; the separate body hash closes that hole.
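The two-hash check can be factored into a small predicate. A minimal sketch, using the `PostMeta` fields shown above (hash computation elided, struct reduced to the relevant fields):

```go
package main

import "fmt"

// PostMeta mirrors the two cached hash fields used by invalidation.
type PostMeta struct {
	ContentHash string // frontmatter hash
	BodyHash    string // body content hash
}

// isFresh reports whether the cached entry can be reused:
// both the frontmatter hash and the body hash must match.
func isFresh(cached *PostMeta, contentHash, bodyHash string) bool {
	return cached != nil &&
		cached.ContentHash == contentHash &&
		cached.BodyHash == bodyHash
}

func main() {
	cached := &PostMeta{ContentHash: "fm1", BodyHash: "body1"}
	fmt.Println(isFresh(cached, "fm1", "body1")) // unchanged: cache hit
	fmt.Println(isFresh(cached, "fm1", "body2")) // body edited: re-render
}
```

Checking the body hash second means a frontmatter-only edit short-circuits just as cheaply as before.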
### SSR Hash Tracking

D2 diagrams and LaTeX math are cached separately and tracked in `SSRInputHashes`:

```go
type SSRArtifact struct {
	Type       string // "d2" or "katex"
	InputHash  string // BLAKE3 of source code
	OutputHash string // BLAKE3 of rendered output
	RefCount   int
	Size       int64
	Compressed bool // Zstd compression
}
```
Example: if you change a D2 diagram from `A -> B` to `A -> C`, only that diagram re-renders; the rest of the post uses cached HTML.
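That per-diagram behavior falls out of diffing the cached and current input-hash lists. A sketch of the idea (hash values are illustrative placeholders, not real BLAKE3 digests):

```go
package main

import "fmt"

// staleSSRInputs returns the input hashes present in the new render
// but absent from the cached list — only those artifacts re-render.
func staleSSRInputs(cached, current []string) []string {
	seen := make(map[string]bool, len(cached))
	for _, h := range cached {
		seen[h] = true
	}
	var stale []string
	for _, h := range current {
		if !seen[h] {
			stale = append(stale, h)
		}
	}
	return stale
}

func main() {
	cached := []string{"hashAB", "hashXY"}  // diagrams A -> B and X -> Y
	current := []string{"hashAC", "hashXY"} // A -> B edited to A -> C
	fmt.Println(staleSSRInputs(cached, current)) // only the edited diagram
}
```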
## In-Memory LRU Cache

To reduce BoltDB reads, v1.2.1 added an LRU (Least Recently Used) cache for hot data.

From builder/cache/cache.go:21-42:

```go
type Manager struct {
	db          *bolt.DB
	memCache    map[string]*memoryCacheEntry
	memCacheMu  sync.RWMutex
	memCacheTTL time.Duration // Default: 5 minutes
}

type memoryCacheEntry struct {
	meta      *PostMeta
	expiresAt time.Time
}
```
### Cache Lookup Flow

From builder/cache/cache_reads.go:72-108:

```go
func (m *Manager) GetPostByPath(path string) (*PostMeta, error) {
	// (path normalization elided)

	// 1. Check the in-memory cache first
	if cached := m.memCacheGet("path:" + normalizedPath); cached != nil {
		return cached, nil // Fast path: ~10ns
	}

	// 2. Fall back to BoltDB
	var result *PostMeta
	err := m.db.View(func(tx *bolt.Tx) error {
		paths := tx.Bucket([]byte(BucketPaths))
		postID := paths.Get([]byte(normalizedPath))
		posts := tx.Bucket([]byte(BucketPosts))
		data := posts.Get(postID)
		return Decode(data, &result)
	})

	// 3. Store in the memory cache for the next lookup
	if result != nil {
		m.memCacheSet("path:"+normalizedPath, result)
	}
	return result, err
}
```
Performance impact: Frequently accessed posts (like index pages) see ~100x faster lookups after the first read.
The 5-minute TTL ensures the cache stays fresh during watch mode. Cache entries are automatically evicted on writes.
## Generic Cache Operations

Kosh uses Go 1.18+ generics for type-safe cache reads (from builder/cache/cache_reads.go:13-33):

```go
func getCachedItem[T any](db *bolt.DB, bucketName string, key []byte) (*T, error) {
	var result *T
	err := db.View(func(tx *bolt.Tx) error {
		bucket := tx.Bucket([]byte(bucketName))
		data := bucket.Get(key)
		if data == nil {
			return nil
		}
		var item T
		if err := Decode(data, &item); err != nil {
			return err
		}
		result = &item
		return nil
	})
	return result, err
}
```
Usage:

```go
// Type-safe, no casting needed
post, err := getCachedItem[PostMeta](db, BucketPosts, postID)
search, err := getCachedItem[SearchRecord](db, BucketSearch, postID)
```
## Batch Writes

To maximize throughput, Kosh uses object pooling and batch commits (from builder/cache/cache.go:15-19):

```go
var encodedPostPool = sync.Pool{
	New: func() interface{} {
		return make([]EncodedPost, 0, 64)
	},
}
```
### Batch Commit Pattern

```go
func (m *Manager) CommitBatch(posts []EncodedPost) error {
	return m.db.Update(func(tx *bolt.Tx) error {
		postsB := tx.Bucket([]byte(BucketPosts))
		pathsB := tx.Bucket([]byte(BucketPaths))
		searchB := tx.Bucket([]byte(BucketSearch))
		for _, p := range posts {
			if err := postsB.Put(p.PostID, p.Data); err != nil {
				return err
			}
			if err := pathsB.Put(p.Path, p.PostID); err != nil {
				return err
			}
			if err := searchB.Put(p.PostID, p.SearchData); err != nil {
				return err
			}
		}
		return nil
	})
}
```
Why batch? A BoltDB fsync is expensive (~5ms per transaction). Committing 100 posts in one transaction replaces 100 fsyncs (~500ms) with a single one (~5ms).
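The `sync.Pool` shown above amortizes slice allocations across batches. A sketch of the borrow/reuse/return pattern (the `EncodedPost` fields are assumed from the commit code, not taken verbatim from Kosh):

```go
package main

import (
	"fmt"
	"sync"
)

// EncodedPost mirrors the batch payload shape used by CommitBatch.
type EncodedPost struct {
	PostID, Path, Data, SearchData []byte
}

var encodedPostPool = sync.Pool{
	New: func() interface{} {
		return make([]EncodedPost, 0, 64)
	},
}

func main() {
	// Borrow a slice and reset its length, keeping the capacity.
	batch := encodedPostPool.Get().([]EncodedPost)[:0]

	batch = append(batch, EncodedPost{PostID: []byte("p1")})
	fmt.Println(len(batch)) // one post staged for commit

	// Return the (emptied) slice so the next batch reuses its
	// backing array instead of reallocating.
	encodedPostPool.Put(batch[:0])
}
```

The `[:0]` reslice on `Get` is what makes reuse safe: old elements become invisible while the allocated capacity survives.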
## Content-Addressed Storage

Large HTML (>32KB) is stored in the filesystem, keyed by BLAKE3 hash:

```go
const InlineHTMLThreshold = 32 * 1024 // 32KB

if len(html) < InlineHTMLThreshold {
	meta.InlineHTML = html // Store in BoltDB
} else {
	hash := cache.HashContent(html)
	store.Write(hash, html) // Write to .kosh-cache/store/
	meta.HTMLHash = hash
}
```
Directory structure:

```text
.kosh-cache/store/
├── 3a/f2d8e1... # First 2 chars = subdir
└── d4/1c9b7a...
```

This prevents BoltDB bloat and enables efficient deduplication.
## Watch Mode

In dev mode, Kosh watches for changes and performs surgical rebuilds.

From builder/run/incremental.go:60-95:

```go
func (b *Builder) BuildChanged(ctx context.Context, changedPath string, op fsnotify.Op) {
	ext := filepath.Ext(changedPath)

	// Handle deletions
	if op&fsnotify.Remove == fsnotify.Remove {
		b.deletePostFromCache(changedPath)
		b.Build(ctx) // Full rebuild to update indexes
		return
	}

	// Markdown changes: rebuild the single post
	if ext == ".md" {
		b.buildSinglePost(ctx, changedPath)
		utils.SyncVFS(b.DestFs, b.cfg.OutputDir, b.renderService.GetRenderedFiles())
		return
	}

	// CSS/JS changes: full rebuild to update asset hashes
	if ext == ".css" || ext == ".js" {
		b.Build(ctx) // Asset hashes affect HTML
		return
	}

	// Template changes: invalidate affected posts
	if ext == ".html" {
		affectedPaths := b.invalidateForTemplate(changedPath)
		if affectedPaths == nil {
			b.Build(ctx) // layout.html changed
		} else {
			for _, path := range affectedPaths {
				b.buildSinglePost(ctx, path)
			}
		}
	}
}
```
### Rebuild Strategies

| Change Type | Strategy | Speed |
|---|---|---|
| Single `.md` file | Re-render only that post | ~50ms |
| CSS/JS file | Full rebuild (asset hashes) | ~500ms |
| Template | Rebuild posts using that template | ~200ms |
| `layout.html` | Full rebuild | ~500ms |
| Config | Full rebuild | ~500ms |
Watch mode automatically debounces rapid changes to prevent rebuild storms. Only one build runs at a time.
## Cache Management Commands

```sh
# Show cache statistics
kosh cache stats

# Verify cache integrity
kosh cache verify

# Run garbage collection (remove orphaned content)
kosh cache gc

# Dry run (show what would be deleted)
kosh cache gc --dry-run

# Force full rebuild (clear cache)
kosh cache rebuild

# Delete all cache data
kosh cache clear

# Inspect a specific file's cache entry
kosh cache inspect content/posts/my-post.md
```
## Benchmarks

100-post documentation site:

| Operation | Cold (no cache) | Warm (cached) | Speedup |
|---|---|---|---|
| Full build | 2.5s | 250ms | 10x |
| Single post | 50ms | 10ms | 5x |
| Watch rebuild | 500ms | 50ms | 10x |

500-post blog:

| Operation | Cold | Warm | Speedup |
|---|---|---|---|
| Full build | 15s | 800ms | 18x |
| Single post | 80ms | 15ms | 5x |
Incremental builds shine during development: the first build after `git clone` is slow, but subsequent builds are nearly instant.
## BoltDB Configuration

Kosh tunes BoltDB for SSG workloads (from builder/cache/cache.go:50-79):

```go
opts := &bolt.Options{
	Timeout:         10 * time.Second,
	FreelistType:    bolt.FreelistArrayType, // Faster than map
	PageSize:        16384,                  // 16KB pages
	InitialMmapSize: calculatedSize,         // Pre-allocate based on existing DB
}

if isDev {
	opts.NoGrowSync = true // Faster, less durable (okay for dev)
} else {
	opts.NoGrowSync = false // Production: full durability
}
```
Key optimizations:

- Array freelist: faster for the SSG's write-heavy workload
- 16KB pages: suit Kosh's large values better than the 4KB default
- Dynamic mmap: grows to 2x the current DB size (max 100MB)
- Dev mode: skips fsync on grow operations
## Cache ID Verification

Kosh stores a cache ID to detect configuration changes:

```go
func (m *Manager) VerifyCacheID(expectedID string) (needsRebuild bool, err error) {
	var storedID []byte
	err = m.db.View(func(tx *bolt.Tx) error {
		meta := tx.Bucket([]byte(BucketMeta))
		storedID = meta.Get([]byte(KeyCacheID))
		return nil
	})
	if err != nil {
		return false, err
	}
	if storedID == nil || string(storedID) != expectedID {
		return true, nil // Rebuild needed
	}
	return false, nil
}
```
What triggers cache invalidation:
- Theme change
- Output directory change
- Base URL change (if affecting rendered HTML)
- Major version upgrade
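One way such an ID could be derived is by hashing the invalidation-relevant config fields together. A sketch with hypothetical field names (Kosh's actual config structure and derivation may differ; SHA-256 stands in for whatever hash Kosh uses):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// siteConfig holds the fields whose changes must invalidate the cache.
// Field names here are illustrative, not Kosh's actual config.
type siteConfig struct {
	Theme     string
	OutputDir string
	BaseURL   string
	Version   string // major version component
}

// cacheID hashes the fields with a separator byte so that
// ("ab", "c") and ("a", "bc") cannot collide.
func cacheID(c siteConfig) string {
	h := sha256.New()
	for _, f := range []string{c.Theme, c.OutputDir, c.BaseURL, c.Version} {
		h.Write([]byte(f))
		h.Write([]byte{0}) // field separator
	}
	return hex.EncodeToString(h.Sum(nil))[:16]
}

func main() {
	a := siteConfig{Theme: "dark", OutputDir: "public", BaseURL: "https://example.com", Version: "1"}
	b := a
	b.Theme = "light" // theme change → different ID → full rebuild
	fmt.Println(cacheID(a) != cacheID(b))
}
```

Any change to a listed field yields a different ID, so `VerifyCacheID` reports a mismatch and the whole cache is rebuilt.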