Build Pipeline

Kosh’s build pipeline runs seven stages in a fixed order, designed for performance, correctness, and incremental builds.

Pipeline Overview

Stage 1: Setup & Validation

Build Lock Acquisition

Guards against concurrent builds that could corrupt the output. The lock is advisory: if it cannot be acquired, the build logs a warning and continues:

func (b *Builder) Build(ctx context.Context) error {
    // Acquire build lock to prevent concurrent builds
    buildLock, lockErr := utils.AcquireBuildLock(b.cfg.OutputDir)
    if lockErr != nil {
        b.logger.Warn("Could not acquire build lock - another build may be in progress", "error", lockErr)
        // Continue anyway - warn but don't block
    } else {
        defer func() { _ = buildLock.Release() }()
    }
    
    // ...
}

WASM Update Check

Runs in a background goroutine so it does not block the critical path:

var setupWg sync.WaitGroup
setupWg.Add(1)
go func() {
    defer setupWg.Done()
    select {
    case <-ctx.Done():
        return
    default:
        b.checkWasmUpdate()  // Compare hash of search engine source
    }
}()

Cache Invalidation

Detects changes to global dependencies:

globalDependencies := []string{
    filepath.Join(cfg.TemplateDir, "layout.html"),
    filepath.Join(cfg.TemplateDir, "index.html"),
    filepath.Join(cfg.StaticDir, "css/layout.css"),
    "kosh.yaml",
}

var affectedPosts []string
for _, dep := range globalDependencies {
    if info, err := os.Stat(dep); err == nil && info.ModTime().After(lastBuildTime) {
        affected := b.invalidateForTemplate(dep)
        if affected != nil {
            affectedPosts = append(affectedPosts, affected...)
        } else {
            shouldForce = true  // Force rebuild all
        }
    }
}

If a template changes, only posts using that template are rebuilt. If kosh.yaml changes, all posts are rebuilt.

Stage 2: Build Static Assets

Critical Order: Assets MUST complete before post rendering.

fmt.Println("📦 Building assets...")
b.copyStaticAndBuildAssets(ctx)

Why Assets Come First

Templates reference assets using the Assets map:

<!-- Template uses hashed filenames -->
<link rel="stylesheet" href="{{ index .Assets "/static/css/layout.css" }}">
<!-- Rendered as: /static/css/layout-a1b2c3d4.css -->

The Assets map is populated during asset processing:

func (s *assetServiceImpl) Build(ctx context.Context) error {
    var wg sync.WaitGroup
    wg.Add(2)
    
    // 1. Static Copy (images, fonts, etc.)
    go func() {
        defer wg.Done()
        utils.CopyDirVFS(s.sourceFs, s.destFs, s.cfg.StaticDir, destStaticDir, ...)
    }()
    
    // 2. CSS/JS Bundling with esbuild
    go func() {
        defer wg.Done()
        assets := make(map[string]string)
        
        // Bundle CSS
        cssHash := bundleCSS()
        assets["/static/css/layout.css"] = "/static/css/layout-" + cssHash + ".css"
        
        // Bundle JS
        jsHash := bundleJS()
        assets["/static/js/app.js"] = "/static/js/app-" + jsHash + ".js"
        
        // Inject into render service
        s.renderer.SetAssets(assets)
    }()
    
    wg.Wait()  // MUST wait before posts can render
    return nil
}

In v1.2.1, the asset stage was changed from running in parallel with post rendering to completing synchronously before it. This fixed a race condition that caused CSS 404 errors on post pages.

Stage 3: Process Markdown Posts

This is the most complex stage: it handles parsing, rendering, and search indexing.

Fast Path: Cache Rehydration

If only templates changed (not content), rehydrate from cache:

isTemplateOnly := false  // Set to true when only templates changed (detection logic elided)
if isTemplateOnly && cachedCount > 0 {
    fmt.Println("📝 Rehydrating from cache...")
    b.renderCachedPosts()
    
    // Hydrate data for global pages from cache
    ids, _ := b.cacheService.ListAllPosts()
    cachedPosts, _ := b.cacheService.GetPostsByIDs(ids)  // Batch fetch
    searchRecords, _ := b.cacheService.GetSearchRecords(ids)
    
    for _, cached := range cachedPosts {
        post := models.PostMetadata{
            Title: cached.Title,
            Link:  cached.Link,
            // ... reconstruct from cache ...
        }
        allPosts = append(allPosts, post)
    }
}

Fast Path Performance:

No markdown parsing
No HTML rendering
No SSR (D2/KaTeX)
Only template application

Slow Path: Full Processing

If content changed, parse and render:

fmt.Println("📝 Processing content...")
result, err := b.postService.Process(ctx, shouldForce, forceSocialRebuild, outputMissing)
allPosts = result.AllPosts
indexedPosts = result.IndexedPosts

Post Processing Pipeline

Worker Pool Parallelization

numWorkers := utils.GetDefaultWorkerCount()  // CPU count

parsePool := utils.NewWorkerPool(ctx, numWorkers, func(task PostTask) {
    // 1. Check cache
    cachedMeta, _ := s.cache.GetPostByPath(task.path)
    
    // 2. Validate cache with body hash
    source, _ := afero.ReadFile(s.sourceFs, task.path)
    bodyHash := utils.GetBodyHash(source)
    
    if cachedMeta != nil && cachedMeta.BodyHash == bodyHash {
        // Use cache
        useCache = true
    } else {
        // Parse markdown
        docNode := s.md.Parser().Parse(text.NewReader(source))
        
        // Render HTML
        buf := utils.SharedBufferPool.Get()
        defer utils.SharedBufferPool.Put(buf)
        s.md.Renderer().Render(buf, source, docNode)
        htmlContent := buf.String()
        
        // SSR processing
        htmlContent = mdParser.ReplaceD2BlocksWithThemeSupport(htmlContent, d2Pairs)
        htmlContent, mathHashes = mdParser.RenderMathForHTML(htmlContent, s.nativeRenderer, ...)
        
        // Extract metadata
        metaData := meta.Get(ctx)
        toc := mdParser.GetTOC(ctx)
        
        // Tokenize for search
        words := search.DefaultAnalyzer.Analyze(plainText)
        wordFreqs := make(map[string]int)
        for _, w := range words {
            wordFreqs[w]++
        }
    }
    
    // Store in cache
    newMeta := &cache.PostMeta{...}
    s.cache.StoreHTMLForPost(newMeta, []byte(htmlContent))
})

parsePool.Start()
for _, file := range files {
    parsePool.Submit(PostTask{path: file})
}
parsePool.Stop()  // Wait for all workers

Social Card Generation

Runs in parallel with post processing:

cardPool := utils.NewWorkerPool(ctx, numWorkers, func(task socialCardTask) {
    s.generateSocialCard(task)
})
cardPool.Start()

// Submit cards as posts are processed
for _, post := range posts {
    if needsSocialCard(post) {
        cardPool.Submit(socialCardTask{...})
    }
}

cardPool.Stop()  // Wait for all cards

Social card generation uses headless Chrome via Playwright to render Open Graph preview images.

Stage 4: Render Global Pages

Generate site-wide pages using aggregated post metadata.

Pagination

func (b *Builder) renderPagination(allPosts, pinnedPosts []models.PostMetadata, shouldForce bool) {
    perPage := 10
    totalPages := (len(allPosts) + perPage - 1) / perPage
    
    for page := 1; page <= totalPages; page++ {
        start := (page - 1) * perPage
        end := min(start+perPage, len(allPosts))
        
        data := models.PageData{
            Title:       b.cfg.Title,
            Posts:       allPosts[start:end],
            PinnedPosts: pinnedPosts,  // Show on first page only
            CurrentPage: page,
            TotalPages:  totalPages,
            // ...
        }
        
        if page == 1 {
            b.renderService.RenderIndex(filepath.Join(b.cfg.OutputDir, "index.html"), data)
        } else {
            path := filepath.Join(b.cfg.OutputDir, fmt.Sprintf("page-%d.html", page))
            b.renderService.RenderIndex(path, data)
        }
    }
}
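The page arithmetic above is easy to check in isolation. This small worked example extracts it into two helpers, pageBounds and totalPages (names introduced here for illustration), and prints the slice range for each page of a 23-post site.

```go
package main

import "fmt"

// totalPages is the ceiling of total/perPage using integer arithmetic.
func totalPages(total, perPage int) int {
	return (total + perPage - 1) / perPage
}

// pageBounds returns the half-open slice range [start, end) for a 1-based page.
func pageBounds(page, perPage, total int) (start, end int) {
	start = (page - 1) * perPage
	if start > total {
		start = total
	}
	end = start + perPage
	if end > total {
		end = total
	}
	return start, end
}

func main() {
	total, perPage := 23, 10
	for page := 1; page <= totalPages(total, perPage); page++ {
		s, e := pageBounds(page, perPage, total)
		fmt.Printf("page %d: posts[%d:%d]\n", page, s, e)
	}
	// Prints:
	// page 1: posts[0:10]
	// page 2: posts[10:20]
	// page 3: posts[20:23]
}
```

Note that the formula yields zero pages for zero posts, so the loop as shown would not write index.html for an empty site. Whether Kosh handles that case elsewhere is not shown here.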

Tag Pages

func (b *Builder) renderTags(tagMap map[string][]models.PostMetadata, forceSocialRebuild bool) {
    for tag, posts := range tagMap {
        utils.SortPosts(posts)  // Sort by date
        
        data := models.PageData{
            Title: "Posts tagged " + tag,
            Posts: posts,
            // ...
        }
        
        tagPath := filepath.Join(b.cfg.OutputDir, "tags", tag+".html")
        b.renderService.RenderPage(tagPath, data)
    }
}

Metadata Generation

Each enabled generator runs in its own goroutine:

func (b *Builder) generateMetadata(allContent []models.PostMetadata, tagMap map[string][]models.PostMetadata, indexedPosts []models.IndexedPost, shouldForce bool) {
    var genWg sync.WaitGroup
    
    if b.cfg.Features.Generators.Sitemap {
        genWg.Add(1)
        go func() {
            defer genWg.Done()
            generators.GenerateSitemap(b.DestFs, b.cfg.BaseURL, allContent, tagMap, ...)
        }()
    }
    
    if b.cfg.Features.Generators.RSS {
        genWg.Add(1)
        go func() {
            defer genWg.Done()
            generators.GenerateRSS(b.DestFs, b.cfg.BaseURL, allContent, ...)
        }()
    }
    
    if b.cfg.Features.Generators.Search {
        genWg.Add(1)
        go func() {
            defer genWg.Done()
            generators.GenerateSearchIndex(b.DestFs, b.cfg.OutputDir, indexedPosts)
        }()
    }
    
    if b.cfg.Features.Generators.Graph {
        genWg.Add(1)
        go func() {
            defer genWg.Done()
            generators.GenerateGraph(b.DestFs, b.cfg.BaseURL, allContent, ...)
        }()
    }
    
    genWg.Wait()  // Wait for all metadata generation
}

Metadata generators run in parallel because they’re independent and I/O-bound.

Stage 5: PWA Generation

Generates Progressive Web App assets (only in production builds):

if cfg.Features.Generators.PWA {
    setupWg.Add(1)
    go func() {
        defer setupWg.Done()
        select {
        case <-ctx.Done():
            return
        default:
            fmt.Println("📱 Generating PWA...")
            b.generatePWA(shouldForce)
        }
    }()
}

PWA Components:

manifest.json: App metadata
service-worker.js: Offline caching
Icon set: 192x192, 512x512 (generated from favicon)

Dev mode (kosh serve --dev) skips PWA generation for faster rebuilds.

Stage 6: Sync VFS to Disk

Flush the in-memory file system to disk using parallel workers:

fmt.Println("💾 Syncing to disk...")
if err := utils.SyncVFS(b.DestFs, b.cfg.OutputDir, b.renderService.GetRenderedFiles()); err != nil {
    b.logger.Error("Failed to sync VFS to disk", "error", err)
}
b.renderService.ClearRenderedFiles()

Parallel Sync Implementation

func SyncVFS(vfs afero.Fs, destDir string, renderedFiles map[string]bool) error {
    numWorkers := runtime.NumCPU()
    
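    // Collect every file in the VFS
    var files []string
    walkErr := afero.Walk(vfs, "/", func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err  // info may be nil when err is set
        }
        if !info.IsDir() {
            files = append(files, path)
        }
        return nil
    })
    if walkErr != nil {
        return walkErr
    }
    
    pool := NewWorkerPool(context.Background(), numWorkers, func(vfsPath string) {
        diskPath := filepath.Join(destDir, vfsPath)
        
        // Read from VFS; skip the file rather than write empty data on error
        data, err := afero.ReadFile(vfs, vfsPath)
        if err != nil {
            return
        }
        
        // Write to disk
        if err := os.MkdirAll(filepath.Dir(diskPath), 0755); err != nil {
            return
        }
        _ = os.WriteFile(diskPath, data, 0644)
    })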
    
    pool.Start()
    for _, file := range files {
        pool.Submit(file)
    }
    pool.Stop()
    
    return nil
}

Why VFS?

Atomic builds (all-or-nothing)
Fast file operations (memory speed)
Incremental sync (only changed files written)
Crash-safe (disk never in partial state)

Stage 7: Cleanup & Metrics

Save Caches

defer b.SaveCaches()

func (b *Builder) SaveCaches() {
    // Flush diagram adapter to BoltDB
    if b.diagramAdapter != nil {
        _ = b.diagramAdapter.Close()
    }
    
    // Increment build count
    if b.cacheService != nil {
        _ = b.cacheService.IncrementBuildCount()
    }
    
    // Record end time and print metrics
    b.metrics.RecordEnd()
    if !b.cfg.IsDev {
        b.metrics.Print()  // "📊 Built N posts in Xs (cache: H/M hits, P%)"
    }
}

Close Resources

defer b.Close()

func (b *Builder) Close() {
    if b.cacheService != nil {
        _ = b.cacheService.Close()  // Close BoltDB
    }
}

Context Cancellation

All stages respect context cancellation for graceful shutdown:

select {
case <-ctx.Done():
    b.logger.Info("Build cancelled", "reason", ctx.Err())
    return ctx.Err()
default:
    // Continue building
}

Signal Handling:

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)

go func() {
    <-sigChan
    fmt.Println("\n🛑 Shutdown signal received, cleaning up...")
    cancel()  // Trigger context cancellation
    
    time.Sleep(5 * time.Second)  // Grace period
    os.Exit(1)
}()

Performance Optimizations

Critical Path Ordering

Stage	Blocks	Reason
Assets	Posts	Templates need `Assets` map
Posts	Global	Global pages need post metadata
Global	PWA	PWA needs all pages for offline caching
PWA	Sync	Sync needs all generated files

Parallel Opportunities

Operation	Parallelization
Post parsing	Worker pool (CPU cores)
Social cards	Worker pool (I/O bound)
Asset copying	Worker pool (I/O bound)
Metadata generation	Goroutines (independent)
VFS sync	Worker pool (I/O bound)

Cache Hit Optimization

Optimize for the common case (no changes):

Fast path detection: Check if only templates changed
Batch reads: GetPostsByIDs() instead of N × GetPost()
In-memory LRU: Cache hot PostMeta for pagination
Pre-computed fields: Store normalized strings in cache
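To illustrate the in-memory LRU item from the list above, here is a minimal least-recently-used cache built on container/list. It is a sketch, not Kosh's cache: the LRU type, its string values (a real cache would hold PostMeta), and the capacity handling are assumptions.

```go
package main

import (
	"container/list"
	"fmt"
)

// lruEntry pairs a key with its cached value inside the list.
type lruEntry struct {
	key   string
	value string
}

// LRU is a fixed-capacity cache that evicts the least recently used key.
// It is not safe for concurrent use; guard it with a sync.Mutex if shared.
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> node in order
}

// NewLRU creates a cache holding at most capacity entries (minimum 1).
func NewLRU(capacity int) *LRU {
	if capacity < 1 {
		capacity = 1
	}
	return &LRU{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the cached value and marks the key as recently used.
func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*lruEntry).value, true
}

// Put inserts or updates a key, evicting the oldest entry when over capacity.
func (c *LRU) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*lruEntry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&lruEntry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*lruEntry).key)
	}
}

func main() {
	cache := NewLRU(2)
	cache.Put("post-1", "First")
	cache.Put("post-2", "Second")
	cache.Get("post-1")          // touch post-1 so post-2 becomes the oldest
	cache.Put("post-3", "Third") // evicts post-2
	_, ok := cache.Get("post-2")
	fmt.Println("post-2 cached:", ok) // prints "post-2 cached: false"
}
```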

Next Steps

Deep dive into Cache System
Understand Service Layer responsibilities
Review Architecture Overview
