
Performance Optimization

Zero is designed for high performance with its incremental view maintenance (IVM) engine and intelligent query planning. This guide covers optimization strategies to get the most out of your Zero applications.

Query Optimization

Understanding the Query Planner

Zero uses a cost-based query planner that optimizes WHERE EXISTS (correlated subquery) statements by choosing optimal join execution strategies. The planner evaluates two join strategies:
  1. Semi-join (default): Parent is outer loop, child is inner
    // For each parent row:
    //   if EXISTS (matching child row):
    //     emit parent row
    
  2. Flipped join: Child is outer loop, parent is inner
    // For each child row:
    //   fetch matching parent row(s)
    //   emit results
    
The planner chooses the strategy with the lowest estimated cost based on table statistics. See packages/zql/src/planner/README.md for detailed planner documentation.
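Outside Zero, the two strategies can be sketched over plain arrays. This is an illustrative model only, not Zero's implementation; `semiJoin` and `flippedJoin` are hypothetical names:

```typescript
type Row = Record<string, unknown>;

// Semi-join: parent is the outer loop; stop at the first matching child.
function semiJoin(parents: Row[], children: Row[], key: string): Row[] {
  return parents.filter(p => children.some(c => c[key] === p.id));
}

// Flipped join: child is the outer loop; look up parents by primary key.
function flippedJoin(parents: Row[], children: Row[], key: string): Row[] {
  const byId = new Map(parents.map(p => [p.id, p] as [unknown, Row]));
  const out = new Map<unknown, Row>();
  for (const c of children) {
    const p = byId.get(c[key]);
    if (p) out.set(p.id, p); // dedupe: many children can share a parent
  }
  return [...out.values()];
}
```

Which version is cheaper depends on the relative cardinalities and available indexes, which is exactly what the planner estimates from table statistics.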

Index Optimization

Create indexes on join columns to enable efficient lookups:
// Schema definition
const post = table('post')
  .columns({
    id: string(),
    userId: string(),
    title: string(),
    published: boolean(),
  })
  .primaryKey('id');

// In your database migration (the table is named 'post' above;
// camelCase columns need quoting in Postgres):
CREATE INDEX idx_post_user_id ON post("userId");
Why indexes matter:
  • Enable fast constraint lookups during joins
  • Improve query planner cost estimates
  • Critical for flipped join performance

Running ANALYZE

Zero’s planner uses SQLite statistics to estimate query costs. Run ANALYZE periodically to keep statistics fresh:
-- Run in your database
ANALYZE;
What ANALYZE provides:
  • sqlite_stat1: Basic table and index statistics
  • sqlite_stat4: Histogram data for accurate selectivity estimates
  • Fan-out factors for join operations
When to run ANALYZE:
  • After initial data load
  • After bulk imports or updates
  • When query performance degrades
  • Periodically (e.g., daily or weekly)

Avoid NULL in Unique Constraints

NULL values in OR queries can cause SQLite to abandon index optimization and perform full table scans:
-- Problem: if email is NULL, this becomes a full table scan
SELECT * FROM users WHERE id = ? OR email = ?;

// Solution: Filter out NULL values before building the query
const validKeys = keys.filter(key => 
  key.every(column => row[column] !== null && row[column] !== undefined)
);
Performance impact:
  • NULL + OR can cause 320x slowdowns
  • A query that should take less than 1ms can take 320ms
  • SQLite won’t use MULTI-INDEX OR optimization
See the Known Gotchas section for more details.
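A concrete version of this guard splits lookup keys so that NULL-containing keys can be handled separately instead of poisoning the OR query. The `splitKeys` helper below is hypothetical, shown only to make the pattern runnable:

```typescript
// Partition lookup keys: keys with any NULL/undefined component go to
// `invalid` and should be looked up individually (or skipped), keeping
// the OR query eligible for SQLite's multi-index OR optimization.
function splitKeys<T extends Record<string, unknown>>(
  keys: T[],
): {valid: T[]; invalid: T[]} {
  const valid: T[] = [];
  const invalid: T[] = [];
  for (const key of keys) {
    const hasNull = Object.values(key).some(v => v === null || v === undefined);
    (hasNull ? invalid : valid).push(key);
  }
  return {valid, invalid};
}
```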

Query Structure Best Practices

Use Limits Wisely

Limits propagate through the query pipeline and can significantly reduce work:
// Good: Limit early to reduce downstream work
z.query.users
  .whereExists('posts', p => p.where('published', true))
  .limit(10);

// The planner can optimize: Only scan enough users to find 10 matches
Limit behavior:
  • Semi-joins: Limit propagates to parent scan
  • Flipped joins: Limits are removed (must scan all children)
  • The planner considers this when choosing join strategy
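The reason limits are cheap in the semi-join case is that a limit stage can stop pulling from upstream once it has enough rows. A generator makes this concrete (a standalone sketch, not Zero's Take operator):

```typescript
// A limit stage that stops pulling from its upstream source once
// `limit` rows have been emitted; upstream work past the limit never runs.
function* take<T>(source: Iterable<T>, limit: number): Generator<T> {
  if (limit <= 0) return;
  let n = 0;
  for (const row of source) {
    yield row;
    if (++n >= limit) return; // short-circuit the upstream scan
  }
}
```

In the flipped-join case no such short-circuit exists, because any child row might contribute a new parent; that is why limits are removed there.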

Avoid Deep Nesting

Deeply nested queries can be expensive. Consider restructuring:
// Less efficient: Multiple levels of EXISTS
z.query.users
  .whereExists('posts', p => 
    p.whereExists('comments', c => 
      c.whereExists('likes')
    )
  );

// Better: Flatten when possible
z.query.posts
  .whereExists('comments', c => c.whereExists('likes'))
  .related('user');

Filter Early and Often

Apply filters as early as possible in the query:
// Good: Filter in the subquery
z.query.users.whereExists('posts', p => 
  p.where('published', true)
   .where('createdAt', '>', lastWeek)
);

// Filter selectivity affects cost estimates
// Highly selective filters → better plans

Memory and Resource Management

Understanding IVM Memory Usage

The IVM engine maintains materialized views in memory. Each query creates an operator pipeline that:
  1. Fetches initial results from the source
  2. Maintains incremental state for push updates
  3. Stores relationship streams for hierarchical data
Memory considerations:
  • Queries with large result sets consume more memory
  • Related data creates additional relationship storage
  • Multiple queries multiply memory usage

Cleanup and Lifecycle

Destroy queries when no longer needed:
const {data, cleanup} = z.query.users.run();

// When component unmounts or query is no longer needed:
cleanup();
What cleanup does:
  • Calls destroy() on all operators in the pipeline
  • Removes subscriptions to source changes
  • Frees memory used by materialized views

Operator Pipeline Efficiency

Each operator in the pipeline processes changes incrementally:
// Pipeline: Source → Filter → Join → Take
// Push updates flow through incrementally
Operators:
  • Filter: Evaluates conditions, may split edit changes
  • Join: Manages relationship streams, fetches child data
  • Take: Limits output, short-circuits when limit reached
  • UnionFanIn/Out: Manages OR branches
See packages/zql/src/ivm/operator.ts for operator interfaces.
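The "may split edit changes" behavior of Filter can be shown in isolation: an edit whose row crosses the predicate boundary becomes an add or a remove downstream. A simplified sketch (this `Change` type is illustrative, not Zero's actual change format):

```typescript
type Change<T> =
  | {type: 'add'; row: T}
  | {type: 'remove'; row: T}
  | {type: 'edit'; oldRow: T; row: T};

// Forward changes that satisfy the predicate; split edits that cross it.
function filterChange<T>(
  pred: (row: T) => boolean,
  change: Change<T>,
): Change<T>[] {
  switch (change.type) {
    case 'add':
    case 'remove':
      return pred(change.row) ? [change] : [];
    case 'edit': {
      const was = pred(change.oldRow);
      const is = pred(change.row);
      if (was && is) return [change]; // still in the view: plain edit
      if (was) return [{type: 'remove', row: change.oldRow}]; // fell out
      if (is) return [{type: 'add', row: change.row}]; // came in
      return []; // never visible to this view
    }
  }
}
```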

Client-Side Optimization

Debounce Rapid Updates

If your app generates rapid mutations, consider debouncing:
import {debounce} from 'lodash';

const debouncedUpdate = debounce((value) => {
  z.mutate.updateSetting({id, value});
}, 300);

Batch Mutations

Group related mutations when possible:
// Less efficient: Multiple roundtrips
for (const item of items) {
  await z.mutate.createItem(item);
}

// Better: Single mutation with batch logic
await z.mutate.createItems(items);
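When a bulk mutator like `createItems` isn't available, a small client-side batcher can still coalesce work into a single call. A sketch (`MutationBatcher` is a hypothetical helper, not a Zero API):

```typescript
// Queues items and sends them as one batch on flush, so N logical
// writes become a single call to the provided `send` function.
class MutationBatcher<T> {
  private queue: T[] = [];

  constructor(private send: (items: T[]) => void) {}

  add(item: T): void {
    this.queue.push(item);
  }

  flush(): void {
    if (this.queue.length === 0) return; // nothing to send
    const items = this.queue;
    this.queue = [];
    this.send(items);
  }
}
```

You would construct it with your bulk mutator, e.g. `new MutationBatcher(items => z.mutate.createItems(items))`, and flush at a natural boundary such as form submit.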

Use Selective Subscriptions

Only subscribe to data you need:
// Subscribe only to specific fields
z.query.users
  .select(['id', 'name', 'email'])
  .run();

// Avoid fetching large fields if not needed
z.query.posts
  .where('published', true)
  .select(['id', 'title']) // Don't fetch 'content'
  .run();

Server-Side Optimization

Connection Pooling

Zero Cache uses connection pooling for PostgreSQL. Configure appropriately:
// In your zero-cache configuration
const pool = {
  min: 2,
  max: 10,
  idleTimeoutMillis: 30000,
};

Monitor Query Performance

Use OpenTelemetry integration to track query metrics:
// Metrics tracked automatically:
// - query-materialization-server
// - query-update-server
// - query-materialization-client
// - query-update-client
See packages/zql/src/query/metrics-delegate.ts for metric types.

Schema Design

Optimize your database schema:
  1. Use appropriate data types (e.g., INTEGER vs TEXT for IDs)
  2. Add foreign key constraints to help the query optimizer
  3. Denormalize selectively for read-heavy workloads
  4. Partition large tables if supported by your database

Benchmarking and Profiling

Enable Debug Logging

Use the planner debugger to analyze query plans:
import {AccumulatorDebugger} from 'zql/planner-debug';

// Note: `debugger` is a reserved word in JavaScript, so use another name.
const planDebugger = new AccumulatorDebugger();
planQuery(ast, costModel, planDebugger);

console.log(planDebugger.format());
Debug events tracked:
  • attempt-start: New flip pattern being evaluated
  • node-constraint: Constraint propagated to a node
  • node-cost: Cost computed for a node
  • best-plan-selected: Best plan chosen

Measure Query Materialization Time

Track how long queries take to materialize:
const start = performance.now();
const {data} = z.query.users.run();
const duration = performance.now() - start;

console.log(`Query materialized in ${duration}ms`);

Test with Realistic Data

Performance characteristics change with data scale:
  • Test with production-like data volumes
  • Include realistic data distributions (skewed data, NULLs, etc.)
  • Measure both initial fetch and incremental updates
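For quick local experiments, skew can be approximated with a power-law transform. These helpers are hypothetical, a rough stand-in for a true Zipf distribution:

```typescript
// Skewed user ID in [1, nUsers]: low IDs appear far more often than
// high ones, approximating a "hot users" production distribution.
function skewedUserId(nUsers: number, rand: () => number = Math.random): number {
  return Math.floor(nUsers * Math.pow(rand(), 3)) + 1;
}

// Generate a post row with skewed authorship and occasional NULLs,
// exercising the NULL handling discussed earlier in this guide.
function makePost(i: number, nUsers: number) {
  return {
    id: `post-${i}`,
    userId: `user-${skewedUserId(nUsers)}`,
    title: Math.random() < 0.02 ? null : `Post ${i}`, // ~2% NULL titles
  };
}
```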

Common Performance Pitfalls

1. Missing Indexes

Problem: Joins without indexes cause full table scans.
Solution: Create indexes on all foreign key columns.

2. Stale Statistics

Problem: The planner makes poor decisions with outdated statistics.
Solution: Run ANALYZE regularly, especially after data changes.

3. Too Many Active Queries

Problem: Each query maintains state, consuming memory.
Solution: Destroy queries when components unmount; consolidate queries where possible.

4. Inefficient Filters

Problem: Low-selectivity filters don't reduce work much.
Solution: Add more selective filters, or combine conditions.

5. Large Result Sets

Problem: Fetching thousands of rows is slow and memory-intensive.
Solution: Use pagination, virtual scrolling, or limit result sizes.

Performance Checklist

  • Indexes created on all join columns
  • ANALYZE run on database
  • Query limits used where appropriate
  • Queries cleaned up when no longer needed
  • Filters applied early in query pipeline
  • OpenTelemetry metrics monitored
  • Tested with production-like data volumes
  • No NULL values in OR conditions
  • Connection pool sized appropriately
  • Mutations batched when possible
