Kosh's build pipeline is a carefully orchestrated sequence of stages designed for performance, correctness, and incremental builds.
Pipeline Overview
Stage 1: Setup & Validation
Build Lock Acquisition
Prevents concurrent builds that could corrupt the output:
```go
func (b *Builder) Build(ctx context.Context) error {
    // Acquire build lock to prevent concurrent builds
    buildLock, lockErr := utils.AcquireBuildLock(b.cfg.OutputDir)
    if lockErr != nil {
        b.logger.Warn("Could not acquire build lock - another build may be in progress", "error", lockErr)
        // Continue anyway - warn but don't block
    } else {
        defer func() { _ = buildLock.Release() }()
    }
    // ...
}
```
WASM Update Check
Runs in parallel to avoid blocking the critical path:
```go
var setupWg sync.WaitGroup
setupWg.Add(1)
go func() {
    defer setupWg.Done()
    select {
    case <-ctx.Done():
        return
    default:
        b.checkWasmUpdate() // Compare hash of search engine source
    }
}()
```
Cache Invalidation
Detects changes to global dependencies:
```go
globalDependencies := []string{
    filepath.Join(cfg.TemplateDir, "layout.html"),
    filepath.Join(cfg.TemplateDir, "index.html"),
    filepath.Join(cfg.StaticDir, "css/layout.css"),
    "kosh.yaml",
}
var affectedPosts []string
for _, dep := range globalDependencies {
    if info, err := os.Stat(dep); err == nil && info.ModTime().After(lastBuildTime) {
        affected := b.invalidateForTemplate(dep)
        if affected != nil {
            affectedPosts = append(affectedPosts, affected...)
        } else {
            shouldForce = true // Force rebuild all
        }
    }
}
```
If a template changes, only posts using that template are rebuilt. If kosh.yaml changes, all posts are rebuilt.
Stage 2: Build Static Assets
Critical Order: Assets MUST complete before post rendering.
```go
fmt.Println("📦 Building assets...")
b.copyStaticAndBuildAssets(ctx)
```
Why Assets Come First
Templates reference assets using the Assets map:
```html
<!-- Template uses hashed filenames -->
<link rel="stylesheet" href="{{ index .Assets "/static/css/layout.css" }}">
<!-- Rendered as: /static/css/layout-a1b2c3d4.css -->
```
The Assets map is populated during asset processing:
```go
func (s *assetServiceImpl) Build(ctx context.Context) error {
    var wg sync.WaitGroup
    wg.Add(2)
    // 1. Static copy (images, fonts, etc.)
    go func() {
        defer wg.Done()
        utils.CopyDirVFS(s.sourceFs, s.destFs, s.cfg.StaticDir, destStaticDir, ...)
    }()
    // 2. CSS/JS bundling with esbuild
    go func() {
        defer wg.Done()
        assets := make(map[string]string)
        // Bundle CSS
        cssHash := bundleCSS()
        assets["/static/css/layout.css"] = "/static/css/layout-" + cssHash + ".css"
        // Bundle JS
        jsHash := bundleJS()
        assets["/static/js/app.js"] = "/static/js/app-" + jsHash + ".js"
        // Inject into render service
        s.renderer.SetAssets(assets)
    }()
    wg.Wait() // MUST wait before posts can render
    return nil
}
```
In v1.2.1, this stage was changed from running concurrently with post rendering to completing synchronously before it, fixing a race condition that caused CSS 404 errors on post pages.
Stage 3: Process Markdown Posts
The most complex stage, handling parsing, rendering, and indexing.
Fast Path: Cache Rehydration
If only templates changed (not content), rehydrate from cache:
```go
isTemplateOnly := false // Detect template-only changes
if isTemplateOnly && cachedCount > 0 {
    fmt.Println("🔄 Rehydrating from cache...")
    b.renderCachedPosts()
    // Hydrate data for global pages from cache
    ids, _ := b.cacheService.ListAllPosts()
    cachedPosts, _ := b.cacheService.GetPostsByIDs(ids) // Batch fetch
    searchRecords, _ := b.cacheService.GetSearchRecords(ids)
    for _, cached := range cachedPosts {
        post := models.PostMetadata{
            Title: cached.Title,
            Link:  cached.Link,
            // ... reconstruct from cache ...
        }
        allPosts = append(allPosts, post)
    }
}
```
Fast Path Performance:
- No markdown parsing
- No HTML rendering
- No SSR (D2/KaTeX)
- Only template application
Slow Path: Full Processing
If content changed, parse and render:
```go
fmt.Println("📝 Processing content...")
result, err := b.postService.Process(ctx, shouldForce, forceSocialRebuild, outputMissing)
allPosts = result.AllPosts
indexedPosts = result.IndexedPosts
```
Post Processing Pipeline
Worker Pool Parallelization
```go
numWorkers := utils.GetDefaultWorkerCount() // CPU count
parsePool := utils.NewWorkerPool(ctx, numWorkers, func(task PostTask) {
    // 1. Check cache
    cachedMeta, _ := s.cache.GetPostByPath(task.path)
    // 2. Validate cache with body hash
    source, _ := afero.ReadFile(s.sourceFs, task.path)
    bodyHash := utils.GetBodyHash(source)
    if cachedMeta != nil && cachedMeta.BodyHash == bodyHash {
        // Use cache
        useCache = true
    } else {
        // Parse markdown
        docNode := s.md.Parser().Parse(text.NewReader(source))
        // Render HTML
        buf := utils.SharedBufferPool.Get()
        defer utils.SharedBufferPool.Put(buf)
        s.md.Renderer().Render(buf, source, docNode)
        htmlContent := buf.String()
        // SSR processing
        htmlContent = mdParser.ReplaceD2BlocksWithThemeSupport(htmlContent, d2Pairs)
        htmlContent, mathHashes = mdParser.RenderMathForHTML(htmlContent, s.nativeRenderer, ...)
        // Extract metadata
        metaData := meta.Get(ctx)
        toc := mdParser.GetTOC(ctx)
        // Tokenize for search
        words := search.DefaultAnalyzer.Analyze(plainText)
        wordFreqs := make(map[string]int)
        for _, w := range words {
            wordFreqs[w]++
        }
    }
    // Store in cache
    newMeta := &cache.PostMeta{...}
    s.cache.StoreHTMLForPost(newMeta, []byte(htmlContent))
})
parsePool.Start()
for _, file := range files {
    parsePool.Submit(PostTask{path: file})
}
parsePool.Stop() // Wait for all workers
```
Social Card Generation
Runs in parallel with post processing:
```go
cardPool := utils.NewWorkerPool(ctx, numWorkers, func(task socialCardTask) {
    s.generateSocialCard(task)
})
cardPool.Start()
// Submit cards as posts are processed
for _, post := range posts {
    if needsSocialCard(post) {
        cardPool.Submit(socialCardTask{...})
    }
}
cardPool.Stop() // Wait for all cards
```
Social card generation uses headless Chrome via Playwright to render Open Graph preview images.
Stage 4: Render Global Pages
Generate site-wide pages using aggregated post metadata.
```go
func (b *Builder) renderPagination(allPosts, pinnedPosts []models.PostMetadata, shouldForce bool) {
    perPage := 10
    totalPages := (len(allPosts) + perPage - 1) / perPage
    for page := 1; page <= totalPages; page++ {
        start := (page - 1) * perPage
        end := min(start+perPage, len(allPosts))
        data := models.PageData{
            Title:       b.cfg.Title,
            Posts:       allPosts[start:end],
            PinnedPosts: pinnedPosts, // Shown on first page only
            CurrentPage: page,
            TotalPages:  totalPages,
            // ...
        }
        if page == 1 {
            b.renderService.RenderIndex(filepath.Join(b.cfg.OutputDir, "index.html"), data)
        } else {
            path := filepath.Join(b.cfg.OutputDir, fmt.Sprintf("page-%d.html", page))
            b.renderService.RenderIndex(path, data)
        }
    }
}
```
Tag Pages
```go
func (b *Builder) renderTags(tagMap map[string][]models.PostMetadata, forceSocialRebuild bool) {
    for tag, posts := range tagMap {
        utils.SortPosts(posts) // Sort by date
        data := models.PageData{
            Title: "Posts tagged " + tag,
            Posts: posts,
            // ...
        }
        tagPath := filepath.Join(b.cfg.OutputDir, "tags", tag+".html")
        b.renderService.RenderPage(tagPath, data)
    }
}
```
Metadata Generation
Sitemap, RSS, search index, and graph generators run in parallel:

```go
func (b *Builder) generateMetadata(allContent []models.PostMetadata, tagMap map[string][]models.PostMetadata, indexedPosts []models.IndexedPost, shouldForce bool) {
    var genWg sync.WaitGroup
    if b.cfg.Features.Generators.Sitemap {
        genWg.Add(1)
        go func() {
            defer genWg.Done()
            generators.GenerateSitemap(b.DestFs, b.cfg.BaseURL, allContent, tagMap, ...)
        }()
    }
    if b.cfg.Features.Generators.RSS {
        genWg.Add(1)
        go func() {
            defer genWg.Done()
            generators.GenerateRSS(b.DestFs, b.cfg.BaseURL, allContent, ...)
        }()
    }
    if b.cfg.Features.Generators.Search {
        genWg.Add(1)
        go func() {
            defer genWg.Done()
            generators.GenerateSearchIndex(b.DestFs, b.cfg.OutputDir, indexedPosts)
        }()
    }
    if b.cfg.Features.Generators.Graph {
        genWg.Add(1)
        go func() {
            defer genWg.Done()
            generators.GenerateGraph(b.DestFs, b.cfg.BaseURL, allContent, ...)
        }()
    }
    genWg.Wait() // Wait for all metadata generation
}
```
Metadata generators run in parallel because they're independent and I/O-bound.
Stage 5: PWA Generation
Generates Progressive Web App assets (only in production builds):
```go
if cfg.Features.Generators.PWA {
    setupWg.Add(1)
    go func() {
        defer setupWg.Done()
        select {
        case <-ctx.Done():
            return
        default:
            fmt.Println("📱 Generating PWA...")
            b.generatePWA(shouldForce)
        }
    }()
}
```
PWA Components:
- manifest.json: App metadata
- service-worker.js: Offline caching
- Icon set: 192x192, 512x512 (generated from favicon)
Dev mode (kosh serve --dev) skips PWA generation for faster rebuilds.
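As an illustration, a minimal generated manifest.json might look like the following (the field values are hypothetical and would come from kosh.yaml; the exact shape Kosh emits may differ):

```json
{
  "name": "My Kosh Site",
  "short_name": "Kosh",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#ffffff",
  "theme_color": "#1a1a2e",
  "icons": [
    { "src": "/static/icons/icon-192.png", "sizes": "192x192", "type": "image/png" },
    { "src": "/static/icons/icon-512.png", "sizes": "512x512", "type": "image/png" }
  ]
}
```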
Stage 6: Sync VFS to Disk
Flush the in-memory file system to disk using parallel workers:
```go
fmt.Println("💾 Syncing to disk...")
if err := utils.SyncVFS(b.DestFs, b.cfg.OutputDir, b.renderService.GetRenderedFiles()); err != nil {
    b.logger.Error("Failed to sync VFS to disk", "error", err)
}
b.renderService.ClearRenderedFiles()
```
Parallel Sync Implementation
```go
func SyncVFS(vfs afero.Fs, destDir string, renderedFiles map[string]bool) error {
    numWorkers := runtime.NumCPU()
    var files []string
    afero.Walk(vfs, "/", func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err // propagate walk errors instead of dereferencing a nil info
        }
        if !info.IsDir() {
            files = append(files, path)
        }
        return nil
    })
    pool := NewWorkerPool(context.Background(), numWorkers, func(vfsPath string) {
        diskPath := filepath.Join(destDir, vfsPath)
        // Read from VFS
        data, _ := afero.ReadFile(vfs, vfsPath)
        // Write to disk
        os.MkdirAll(filepath.Dir(diskPath), 0755)
        os.WriteFile(diskPath, data, 0644)
    })
    pool.Start()
    for _, file := range files {
        pool.Submit(file)
    }
    pool.Stop()
    return nil
}
```
Why VFS?
- Atomic builds (all-or-nothing)
- Fast file operations (memory speed)
- Incremental sync (only changed files written)
- Crash-safe (disk never in partial state)
Stage 7: Cleanup & Metrics
Save Caches
```go
defer b.SaveCaches()

func (b *Builder) SaveCaches() {
    // Flush diagram adapter to BoltDB
    if b.diagramAdapter != nil {
        _ = b.diagramAdapter.Close()
    }
    // Increment build count
    if b.cacheService != nil {
        _ = b.cacheService.IncrementBuildCount()
    }
    // Record end time and print metrics
    b.metrics.RecordEnd()
    if !b.cfg.IsDev {
        b.metrics.Print() // "📊 Built N posts in Xs (cache: H/M hits, P%)"
    }
}
```
Close Resources
```go
defer b.Close()

func (b *Builder) Close() {
    if b.cacheService != nil {
        _ = b.cacheService.Close() // Close BoltDB
    }
}
```
Context Cancellation
All stages respect context cancellation for graceful shutdown:
```go
select {
case <-ctx.Done():
    b.logger.Info("Build cancelled", "reason", ctx.Err())
    return ctx.Err()
default:
    // Continue building
}
```
Signal Handling:
```go
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
go func() {
    <-sigChan
    fmt.Println("\n🛑 Shutdown signal received, cleaning up...")
    cancel()                    // Trigger context cancellation
    time.Sleep(5 * time.Second) // Grace period
    os.Exit(1)
}()
```
Critical Path Ordering
| Stage | Blocks | Reason |
|---|---|---|
| Assets | Posts | Templates need Assets map |
| Posts | Global | Global pages need post metadata |
| Global | PWA | PWA needs all pages for offline caching |
| PWA | Sync | Sync needs all generated files |
Parallel Opportunities
| Operation | Parallelization |
|---|---|
| Post parsing | Worker pool (CPU cores) |
| Social cards | Worker pool (I/O bound) |
| Asset copying | Worker pool (I/O bound) |
| Metadata generation | Goroutines (independent) |
| VFS sync | Worker pool (I/O bound) |
Cache Hit Optimization
Optimize for the common case (no changes):
- Fast path detection: Check if only templates changed
- Batch reads: GetPostsByIDs() instead of N × GetPost()
- In-memory LRU: Cache hot PostMeta for pagination
- Pre-computed fields: Store normalized strings in cache
Next Steps