Kosh is built for high performance. Follow these guidelines to maintain and improve it.
Memory Management
Kosh uses object pooling extensively to reduce garbage collection pressure during high-throughput builds.
Buffer Pooling
Use SharedBufferPool for temporary byte buffers:
// BufferPool manages reusable bytes.Buffer objects
type BufferPool struct {
pool sync.Pool
}
func NewBufferPool() *BufferPool {
return &BufferPool{
pool: sync.Pool{
New: func() interface{} {
return new(bytes.Buffer)
},
},
}
}
func (p *BufferPool) Get() *bytes.Buffer {
return p.pool.Get().(*bytes.Buffer)
}
func (p *BufferPool) Put(buf *bytes.Buffer) {
if buf.Cap() > MaxBufferSize {
return // Discard oversized buffers
}
buf.Reset()
p.pool.Put(buf)
}
Usage pattern:
// Get buffer from pool
buf := utils.SharedBufferPool.Get()
defer utils.SharedBufferPool.Put(buf)
// Use buffer
buf.WriteString("content")
result := buf.String()
Always use defer to return buffers to the pool, even if an error occurs.
String Building
Use strings.Builder for string concatenation instead of the + operator:
// Good - efficient
var sb strings.Builder
for _, word := range words {
sb.WriteString(word)
sb.WriteString(" ")
}
result := sb.String()
// Bad - creates intermediate strings
var result string
for _, word := range words {
result += word + " "
}
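When the total output length can be computed up front, strings.Builder.Grow reserves the buffer once, so no regrowth happens during the writes. A minimal sketch (joinWords is a hypothetical helper, not part of Kosh):

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords concatenates words with a single allocation by pre-sizing
// the builder: Grow reserves capacity up front, so the WriteString
// calls never reallocate the underlying buffer.
func joinWords(words []string) string {
	total := 0
	for _, w := range words {
		total += len(w) + 1 // word plus trailing space
	}
	var sb strings.Builder
	sb.Grow(total)
	for _, w := range words {
		sb.WriteString(w)
		sb.WriteString(" ")
	}
	return sb.String()
}

func main() {
	fmt.Println(joinWords([]string{"fast", "builds"})) // "fast builds "
}
```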
Slice Pre-allocation
Pre-allocate slices when you know the size:
// Good - pre-allocate
posts := make([]models.PostMetadata, 0, expectedCount)
for _, file := range files {
posts = append(posts, processFile(file))
}
// Bad - grows dynamically
var posts []models.PostMetadata
for _, file := range files {
posts = append(posts, processFile(file))
}
Pool for Encoded Posts
// Pool for batch BoltDB operations
var encodedPostPool = sync.Pool{
New: func() interface{} {
return make([]EncodedPost, 0, 64)
},
}
// Usage
func (m *Manager) batchWrite(posts []*PostMeta) error {
encoded := encodedPostPool.Get().([]EncodedPost)
defer func() {
encoded = encoded[:0]
encodedPostPool.Put(encoded)
}()
// Use encoded slice...
return nil
}
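One caveat when pooling slices directly: sync.Pool stores interface{} values, so each Put of a slice header allocates (this is what go vet's SA6002 check flags). Storing a pointer to the slice avoids that allocation. The sketch below is an assumed alternative, not Kosh's actual code (withEncoded is a hypothetical wrapper):

```go
package main

import (
	"fmt"
	"sync"
)

type EncodedPost struct {
	Key, Value []byte
}

// Pooling *[]EncodedPost (a pointer) keeps Put allocation-free; a bare
// slice header stored as interface{} would escape to the heap on every Put.
var encodedPostPool = sync.Pool{
	New: func() interface{} {
		s := make([]EncodedPost, 0, 64)
		return &s
	},
}

// withEncoded hands the caller a pooled slice and guarantees it is
// reset and returned to the pool afterwards.
func withEncoded(fn func(*[]EncodedPost)) {
	encoded := encodedPostPool.Get().(*[]EncodedPost)
	defer func() {
		*encoded = (*encoded)[:0] // reset length, keep capacity
		encodedPostPool.Put(encoded)
	}()
	fn(encoded)
}

func main() {
	withEncoded(func(enc *[]EncodedPost) {
		*enc = append(*enc, EncodedPost{Key: []byte("id"), Value: []byte("data")})
		fmt.Println(len(*enc)) // 1
	})
}
```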
Concurrency Patterns
Worker Pools
Use the generic WorkerPool[T] for concurrent operations:
builder/utils/worker_pool.go
type WorkerPool[T any] struct {
workers int
ctx context.Context
wg sync.WaitGroup
taskQueue chan T
handler func(T)
}
func NewWorkerPool[T any](ctx context.Context, workers int, handler func(T)) *WorkerPool[T] {
if workers <= 0 {
workers = runtime.NumCPU()
}
if workers > MaxWorkers {
workers = MaxWorkers
}
return &WorkerPool[T]{
workers: workers,
ctx: ctx,
taskQueue: make(chan T, workers*WorkerBufferSize),
handler: handler,
}
}
Usage example:
ctx := context.Background()
// Create pool with 4 workers
pool := utils.NewWorkerPool(ctx, 4, func(path string) {
processMarkdownFile(path)
})
pool.Start()
// Submit tasks
for _, path := range markdownFiles {
pool.Submit(path)
}
// Wait for completion
pool.Stop()
Atomic Operations
Use atomic operations for counters in concurrent code:
var (
processedCount int32
anyChanged atomic.Bool
)
// In worker goroutine
atomic.AddInt32(&processedCount, 1)
if changed {
anyChanged.Store(true)
}
// Read final values
total := atomic.LoadInt32(&processedCount)
hasChanges := anyChanged.Load()
Mutex vs RWMutex
Use sync.RWMutex when reads are more frequent than writes:
type Cache struct {
data map[string]*Entry
mu sync.RWMutex
}
// Many readers can access concurrently
func (c *Cache) Get(key string) *Entry {
c.mu.RLock()
defer c.mu.RUnlock()
return c.data[key]
}
// Writers get exclusive access
func (c *Cache) Set(key string, entry *Entry) {
c.mu.Lock()
defer c.mu.Unlock()
c.data[key] = entry
}
Cache Optimization
In-Memory LRU Cache
Kosh uses an in-memory LRU cache for hot PostMeta data:
type memoryCacheEntry struct {
meta *PostMeta
expiresAt time.Time
}
type Manager struct {
db *bolt.DB
memCache map[string]*memoryCacheEntry
memCacheMu sync.RWMutex
memCacheTTL time.Duration
}
const defaultMemCacheTTL = 5 * time.Minute
Benefits:
- Reduces BoltDB reads for frequently accessed posts
- 5-minute TTL ensures fresh data
- Thread-safe with RWMutex
Content-Addressed Storage
// Small content stored inline (< 32KB)
if len(html) < 32*1024 {
post.InlineHTML = html
} else {
// Large content stored by hash
hash := hashContent(html)
storeContent(hash, html)
post.ContentHash = hash
}
Benefits:
- Avoids duplicate storage of identical content
- Single I/O for small posts
- Deduplication for large content
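The hashContent call above could be backed by any collision-resistant digest; a sketch assuming SHA-256:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashContent returns a hex digest usable as a content address.
// SHA-256 is an assumption here; the key property is that identical
// content always yields the same key, which is what makes
// deduplication free: storing the same bytes twice writes one blob.
func hashContent(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

func main() {
	a := hashContent([]byte("<p>hello</p>"))
	b := hashContent([]byte("<p>hello</p>"))
	fmt.Println(a == b) // true: same content, same address
}
```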
Batch Operations
Group database writes for better throughput:
// Bad - one transaction per post
for _, post := range posts {
	db.Update(func(tx *bolt.Tx) error {
		return tx.Bucket([]byte("posts")).Put([]byte(post.ID), post.Data)
	})
}
// Good - single transaction
db.Update(func(tx *bolt.Tx) error {
	b := tx.Bucket([]byte("posts"))
	for _, post := range posts {
		if err := b.Put([]byte(post.ID), post.Data); err != nil {
			return err
		}
	}
	return nil
})
Body Hash Caching
Kosh caches body content hash separately from frontmatter (v1.2.1):
type PostMeta struct {
ID string
Title string
BodyHash string // Hash of body content only
SSRInputHashes map[string]string // D2/LaTeX hashes
}
Benefits:
- Accurate cache invalidation on body-only changes
- Prevents silent cache misses
- Tracks server-side rendering dependencies
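A body-only hash makes change detection straightforward. The sketch below (needsRender and hashBody are hypothetical helpers, SHA-256 an assumed digest) shows why frontmatter edits no longer invalidate rendered HTML:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

type PostMeta struct {
	ID       string
	BodyHash string // Hash of body content only
}

func hashBody(body []byte) string {
	sum := sha256.Sum256(body)
	return hex.EncodeToString(sum[:])
}

// needsRender reports whether the body changed since the cached build.
// Because the hash covers the body only, editing frontmatter (tags,
// title) leaves BodyHash untouched and the cached HTML stays valid.
func needsRender(cached *PostMeta, body []byte) bool {
	return cached == nil || cached.BodyHash != hashBody(body)
}

func main() {
	body := []byte("# Post\n\ncontent")
	cached := &PostMeta{ID: "a", BodyHash: hashBody(body)}
	fmt.Println(needsRender(cached, body))                       // false: unchanged
	fmt.Println(needsRender(cached, []byte("# Post\n\nedited"))) // true
}
```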
Build Pipeline Optimization
Build Order (Critical)
Static assets MUST complete before post rendering because templates use the Assets map. The build pipeline enforces this order:
1. Static assets build - populates the Assets map via SetAssets():
assets := make(map[string]string)
assets["/static/css/layout.css"] = "/static/css/layout.abc123.css"
renderService.SetAssets(assets)
2. Posts render - templates resolve fingerprinted paths through the asset map:
<link rel="stylesheet" href="{{ index .Assets "/static/css/layout.css" }}">
3. Global pages render - same asset references as posts.
4. PWA generation - uses GetAssets() for the manifest and service worker.
Pre-computed Fields
Store normalized data to avoid runtime computation:
// Bad - normalize at query time
type SearchRecord struct {
Title string
Body string
}
func search(query string) {
normalizedQuery := strings.ToLower(query)
for _, record := range records {
if strings.Contains(strings.ToLower(record.Title), normalizedQuery) {
// Match
}
}
}
// Good - pre-compute normalized strings
type SearchRecord struct {
Title string
TitleNormalized string // Pre-computed at index time
Body string
BodyNormalized string // Pre-computed at index time
}
func search(query string) {
normalizedQuery := strings.ToLower(query)
for _, record := range records {
if strings.Contains(record.TitleNormalized, normalizedQuery) {
// Match - no runtime normalization
}
}
}
Stemming Cache
Kosh caches stemmed words for ~76x speedup on repeated words:
var (
stemCache sync.Map // Thread-safe cache
)
func StemCached(word string) string {
if cached, ok := stemCache.Load(word); ok {
return cached.(string)
}
stemmed := Stem(word)
stemCache.Store(word, stemmed)
return stemmed
}
I/O Optimization
Efficient File Walking
Use filepath.WalkDir instead of filepath.Walk:
// Good - DirEntry avoids an extra stat call per entry
err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
	if err != nil {
		return err
	}
	if d.IsDir() {
		return nil
	}
	// Process file
	return nil
})
// Bad - Walk stats every file to build an os.FileInfo
err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
	if err != nil {
		return err
	}
	if info.IsDir() {
		return nil
	}
	// Process file
	return nil
})
Buffered I/O
// Use a buffered writer for large outputs
file, err := os.Create("output.html")
if err != nil {
	return err
}
defer file.Close()
writer := bufio.NewWriterSize(file, 64*1024)
defer writer.Flush() // runs before file.Close (LIFO defer order)
writer.WriteString(content)
Avoid Double ReadFile
Kosh optimizes image encoding (v1.2.1):
// Bad - reads file twice
data, _ := os.ReadFile(imagePath)
encoded := encodeImage(data)
writeToCache(encoded)
data2, _ := os.ReadFile(imagePath) // Duplicate read!
writeToDestination(data2)
// Good - read once, write twice
data, _ := os.ReadFile(imagePath)
encoded := encodeImage(data)
writeToCache(encoded)
writeToDestination(encoded)
Bytes vs String
Use bytes.Contains to avoid string allocation:
// Good - no allocation
if bytes.Contains(content, []byte("---")) {
// Has frontmatter
}
// Bad - converts to string
if strings.Contains(string(content), "---") {
// Has frontmatter
}
Profiling
CPU Profiling
# During build
kosh build --cpuprofile=cpu.prof
# Analyze
go tool pprof cpu.prof
Common pprof commands:
(pprof) top10 # Top 10 CPU consumers
(pprof) list Render # Show CPU time in Render function
(pprof) web # Visual graph (requires graphviz)
(pprof) pdf > cpu.pdf # Export to PDF
Memory Profiling
# During build
kosh build --memprofile=mem.prof
# Analyze
go tool pprof mem.prof
Finding memory leaks:
(pprof) top10 -cum                 # Top allocations (cumulative)
(pprof) list Cache                 # Show allocations in Cache
(pprof) sample_index=inuse_space   # Switch to memory currently in use
(pprof) sample_index=alloc_space   # Switch to total allocated memory
Live Profiling
import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof handlers on DefaultServeMux
)

func main() {
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", nil))
	}()
	// Rest of application
}
Access profiles at:
http://localhost:6060/debug/pprof/
http://localhost:6060/debug/pprof/heap
http://localhost:6060/debug/pprof/goroutine
Build Metrics
Kosh tracks build performance metrics:
builder/metrics/metrics.go
type BuildMetrics struct {
StartTime time.Time
CacheHits int
CacheMisses int
PostsBuilt int
}
// Output format
// 📊 Built 150 posts in 2.3s (cache: 120/30 hits, 80%)
Metrics collected:
- Build duration
- Cache hit/miss ratio
- Posts processed
- Average post processing time
Metrics are suppressed in serve --dev mode to reduce noise during watch mode.
Optimization Checklist
Before optimizing, profile to confirm the hotspot (see Profiling above), then apply these techniques:
Use Efficient Data Structures
// Use map for O(1) lookup
visited := make(map[string]bool)
if visited[id] {
return // Already processed
}
// Use sync.Map for concurrent access
var cache sync.Map
cache.Store(key, value)
value, ok := cache.Load(key)
Avoid Unnecessary Allocations
// Bad - allocates on every call
func getConfig() *Config {
return &Config{
Timeout: 30 * time.Second,
}
}
// Good - reuse singleton
var defaultConfig = &Config{
Timeout: 30 * time.Second,
}
func getConfig() *Config {
return defaultConfig
}
Minimize Interface Conversions
// Bad - repeated type assertions
func process(items []interface{}) {
for _, item := range items {
str := item.(string) // Type assertion in loop
processString(str)
}
}
// Good - use generics
func process[T any](items []T) {
for _, item := range items {
processItem(item) // No type assertion
}
}
Lazy Initialization
type Service struct {
expensiveResource *Resource
once sync.Once
}
func (s *Service) getResource() *Resource {
s.once.Do(func() {
s.expensiveResource = initializeResource()
})
return s.expensiveResource
}
Performance Targets
- Build time: < 5s for 1000 posts (cold cache)
- Incremental build: < 500ms for single post
- Memory usage: < 500MB for 10,000 posts
- Cache hit rate: > 80% on incremental builds
- Search index size: < 30% of total content size
- WASM load time: < 2s on 3G connection