Maximize the performance of your Orama search engine with these optimization strategies for indexing, searching, and memory management.

Indexing Performance

Batch Insertions

Always use insertMultiple instead of repeated individual insert calls:
import { create, insert, insertMultiple } from '@orama/orama'

const db = create({ schema: { title: 'string', body: 'string' } })

// ❌ Slow: Individual inserts
for (const doc of documents) {
  await insert(db, doc)
}

// ✅ Fast: Batch insert
await insertMultiple(db, documents, 1000)
Performance gain: Batch insertions are 10-100x faster depending on dataset size because they optimize tree rebalancing operations.
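If your documents arrive incrementally rather than as one big array (for example from a paginated API), you can still batch them yourself before calling insertMultiple. A minimal chunking helper in plain JavaScript (chunkArray is an illustrative name, not an Orama API):

```javascript
// Split an array into fixed-size chunks so each chunk can be passed
// to insertMultiple() as one batch.
function chunkArray(items, size) {
  const chunks = []
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}
```

Usage: `for (const batch of chunkArray(incomingDocs, 1000)) await insertMultiple(db, batch)`.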

AVL Tree Rebalancing

The insertMultiple method automatically optimizes AVL tree rebalancing:
packages/orama/src/methods/insert.ts
// Internal optimization: Defer rebalancing during batch operations
const options = { avlRebalanceThreshold: batch.length }
const id = await insert(orama, doc, language, skipHooks, options)
This defers expensive rebalancing operations until the batch is complete, significantly improving insertion speed.

Adjust Batch Size

Optimize batch size based on your document characteristics:
// Small documents (< 1KB)
await insertMultiple(db, docs, 5000)

// Medium documents (1-10KB) - default
await insertMultiple(db, docs, 1000)

// Large documents (> 10KB) or with vectors
await insertMultiple(db, docs, 100)
The default batch size is 1000. Adjust based on available memory and document complexity.
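The tiers above can be encoded in a small helper that samples average document size. The name suggestBatchSize and the byte thresholds below are my assumptions mirroring the guidance above, not part of Orama:

```javascript
// Estimate average serialized document size from a sample and map it
// to one of the batch-size tiers described above.
function suggestBatchSize(docs) {
  const sample = docs.slice(0, 100)
  const avgBytes =
    sample.reduce((sum, doc) => sum + JSON.stringify(doc).length, 0) /
    sample.length
  if (avgBytes < 1024) return 5000    // small documents (< 1KB)
  if (avgBytes < 10240) return 1000   // medium documents (1-10KB)
  return 100                          // large documents (> 10KB) or vectors
}
```

Usage: `await insertMultiple(db, docs, suggestBatchSize(docs))`.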

Search Performance

Limit Result Size

Always specify reasonable limits:
import { search } from '@orama/orama'

// ❌ Slow: Retrieving too many results
const results = await search(db, {
  term: 'javascript',
  limit: 10000
})

// ✅ Fast: Reasonable limit
const results = await search(db, {
  term: 'javascript',
  limit: 20
})

Use Offset Pagination Wisely

// Efficient pagination
const page1 = await search(db, { term: 'query', limit: 20, offset: 0 })
const page2 = await search(db, { term: 'query', limit: 20, offset: 20 })
const page3 = await search(db, { term: 'query', limit: 20, offset: 40 })
Deep pagination (large offsets) can hurt performance because Orama still needs to process the skipped results.
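A tiny helper keeps the limit/offset arithmetic in one place (pageParams is a hypothetical name, not an Orama API):

```javascript
// Convert a 1-based page number and page size into search() pagination params.
function pageParams(page, pageSize = 20) {
  return { limit: pageSize, offset: (page - 1) * pageSize }
}
```

Usage: `await search(db, { term: 'query', ...pageParams(3) })` fetches the third page.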

Property Selection

Return only the fields you need:
const results = await search(db, {
  term: 'javascript',
  properties: ['title', 'id'],  // Only return specific fields
  limit: 20
})
This reduces the amount of data transferred and processed, especially important for large documents.

Optimize Filters

// ✅ Efficient: Use indexed properties in filters
const results = await search(db, {
  term: 'laptop',
  where: {
    price: { lte: 1000 },
    inStock: true
  }
})

// Performance tip: Filters on indexed properties are fastest

Schema Design Optimization

Index Only Searchable Fields

Don’t make every field searchable if you don’t need to search it:
const db = create({
  schema: {
    title: 'string',        // Searchable
    description: 'string',  // Searchable
    metadata: {             // Not searchable - faster indexing
      created: 'string',
      updated: 'string'
    }
  }
})
Only properties defined in the top-level schema are indexed for search. Nested objects are stored but not searchable by default.

Use Appropriate Types

const db = create({
  schema: {
    price: 'number',        // ✅ Correct type for numeric filtering/sorting
    category: 'enum',       // ✅ Enum is faster than string for fixed values
    tags: 'string[]',       // ✅ Array type for multiple values
    inStock: 'boolean'      // ✅ Boolean for true/false values
  }
})

Vector Dimensionality

Keep vector dimensions reasonable:
// ❌ Overkill for most use cases
embedding: 'vector[1536]'  // OpenAI embeddings

// ✅ Often sufficient
embedding: 'vector[384]'   // MiniLM embeddings - 4x smaller, faster

// ✅ Very fast
embedding: 'vector[128]'   // Custom small embeddings

Smaller Vectors

Lower dimensions mean faster vector search and less memory usage.

Accuracy Trade-off

Test your use case to find the optimal dimension/accuracy balance.
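To make the memory cost concrete, assume each vector component is stored as a 32-bit float (4 bytes); this back-of-the-envelope helper reflects that assumption, not Orama's exact internal layout:

```javascript
// Rough memory footprint of stored embeddings, assuming float32 components.
function vectorMemoryMB(numDocs, dimensions, bytesPerComponent = 4) {
  return (numDocs * dimensions * bytesPerComponent) / (1024 * 1024)
}
```

Under this assumption, 100k documents at 1536 dimensions need roughly 586 MB for the raw vectors alone, while 384 dimensions need about 146 MB.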

Memory Optimization

Minimize Document Storage

Store only what you need:
const db = create({
  schema: {
    id: 'string',
    title: 'string',
    searchableContent: 'string',  // Full content for search
    // Don't store large binary data or unnecessary metadata
  }
})

// Store references instead of full content
await insert(db, {
  id: 'doc-1',
  title: 'Article Title',
  searchableContent: content.substring(0, 5000), // Truncate if needed
  // url: 'https://...' // Store link to full content
})
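The pattern above can be wrapped in a small mapping function run before indexing. Here toIndexDoc, the field names, and the 5000-character cutoff follow the example above and are illustrative:

```javascript
// Shrink a raw record down to the fields worth indexing, truncating
// long content and keeping only a URL reference to the full document.
function toIndexDoc(record, maxChars = 5000) {
  return {
    id: record.id,
    title: record.title,
    searchableContent: record.content.slice(0, maxChars),
    url: record.url
  }
}
```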

Persistence Strategy

Use Orama's save and load functions (or the data persistence plugin) to serialize and restore databases:
import fs from 'node:fs/promises'
import { create, insertMultiple, save, load } from '@orama/orama'

// After initial indexing
const db = create({ schema })
await insertMultiple(db, documents)

// Save to disk/localStorage
const serialized = await save(db)
await fs.writeFile('db.json', JSON.stringify(serialized))

// Later: Load instead of re-indexing
const db2 = create({ schema })
const data = JSON.parse(await fs.readFile('db.json'))
load(db2, data)
Saving and loading a pre-built index is much faster than re-indexing, especially for large datasets.

Search Configuration Tuning

Typo Tolerance

Balance accuracy and performance:
// ❌ Slower: Maximum tolerance
const results = await search(db, {
  term: 'javascrpt',
  tolerance: 2
})

// ✅ Faster: Minimal tolerance
const results = await search(db, {
  term: 'javascrpt',
  tolerance: 1  // Usually sufficient
})

// ✅ Fastest: Exact match only
const results = await search(db, {
  term: 'javascript',
  exact: true
})

BM25 Relevance

Fine-tune BM25 parameters for your content:
const results = await search(db, {
  term: 'search query',
  relevance: {
    k: 1.2,  // Term frequency saturation (default: 1.2)
    b: 0.75, // Length normalization (default: 0.75)
    d: 0.5   // Document frequency scaling (default: 0.5)
  }
})
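To build intuition for what k and b actually control, here is the textbook BM25 per-term score in plain JavaScript (a sketch of the standard formula, not Orama's internal implementation):

```javascript
// BM25 contribution of one term to one document's score.
// tf: term frequency in the document, docLen/avgDocLen: document length
// vs. the collection average, idf: inverse document frequency of the term.
function bm25Term(tf, docLen, avgDocLen, idf, k = 1.2, b = 0.75) {
  const lengthNorm = 1 - b + b * (docLen / avgDocLen)
  return (idf * tf * (k + 1)) / (tf + k * lengthNorm)
}
```

Raising k lets repeated terms keep contributing score for longer before saturating; raising b penalizes long documents more aggressively.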

Boost Important Fields

const results = await search(db, {
  term: 'laptop',
  boost: {
    title: 2.0,        // Title matches are 2x more important
    description: 1.0,  // Normal weight
    tags: 1.5         // Tags are 1.5x more important
  }
})

Vector Search Optimization

Similarity Threshold

Set appropriate similarity thresholds:
const results = await search(db, {
  mode: 'vector',
  vector: {
    value: queryEmbedding,
    property: 'embedding'
  },
  similarity: 0.8,  // Only return results with 80%+ similarity
  limit: 10
})
Higher similarity thresholds (0.85-0.95) = fewer, more relevant results. Lower thresholds (0.6-0.75) = more results, potentially less relevant.
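If you want to calibrate a threshold offline against your own embeddings, you can compute cosine similarity (the usual metric behind scores in this range) directly; a plain JavaScript reference implementation:

```javascript
// Cosine similarity between two equal-length vectors, in [-1, 1].
function cosineSimilarity(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```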

Don’t Return Vectors

const results = await search(db, {
  mode: 'vector',
  vector: { value: embedding, property: 'embedding' },
  includeVectors: false,  // ✅ Much smaller response size
  limit: 20
})
Excluding vectors from results dramatically reduces response size, since each embedding can contain hundreds or thousands of numbers.

Hybrid Search Strategy

const results = await search(db, {
  mode: 'hybrid',
  term: 'noise cancelling headphones',
  vector: {
    value: embedQuery('noise cancelling headphones'),
    property: 'embedding'
  },
  hybrid: {
    // Tune the fusion weight
    alpha: 0.5  // 0.5 = balanced, 0.7 = favor full-text, 0.3 = favor vector
  }
})
1. Start with alpha = 0.5: a balanced approach works for most use cases.
2. Measure query performance: test your actual queries and track response times.
3. Adjust based on content: more structured data favors a higher alpha (full-text); more semantic data favors a lower alpha (vector).
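The alpha parameter can be read as a linear blend of the two scores. A sketch of that fusion idea, assuming both scores are normalized to [0, 1] (an illustration, not Orama's exact fusion code):

```javascript
// Linear score fusion: alpha weights the full-text score and
// (1 - alpha) weights the vector score.
function fuseScores(textScore, vectorScore, alpha = 0.5) {
  return alpha * textScore + (1 - alpha) * vectorScore
}
```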

Runtime Performance Monitoring

Measure Search Time

const results = await search(db, { term: 'query' })

console.log('Search time:', results.elapsed.formatted)  // e.g., "21μs"
console.log('Results found:', results.count)

Profile Different Strategies

const strategies = [
  { tolerance: 0, boost: {} },
  { tolerance: 1, boost: { title: 2 } },
  { tolerance: 2, boost: { title: 3, body: 1 } }
]

for (const strategy of strategies) {
  const start = performance.now()
  const results = await search(db, { term: 'test', ...strategy })
  const elapsed = performance.now() - start
  console.log(`Strategy ${strategy.tolerance}: ${elapsed}ms, ${results.count} results`)
}

Environment-Specific Optimizations

Browser

// Use smaller indexes for browser deployment
const db = create({
  schema: {
    id: 'string',
    title: 'string',
    // Minimal schema for client-side search
  }
})

// Load pre-built index from CDN
const response = await fetch('/search-index.json')
const data = await response.json()
load(db, data)
Node.js

// Leverage more memory and parallel processing
import { Worker } from 'worker_threads'

// Create multiple database instances for parallel searches
const workers = Array.from({ length: 4 }, () => new Worker('./search-worker.js'))

// Distribute search queries across workers
function parallelSearch(queries) {
  return Promise.all(
    queries.map((query, i) => 
      sendToWorker(workers[i % workers.length], query)
    )
  )
}
Edge Functions

// Optimize for edge function constraints
const db = create({ schema })

// Use cached pre-built index
const cached = await cache.get('search-index')
if (cached) {
  load(db, cached)
} else {
  // Build and cache
  await insertMultiple(db, documents)
  const serialized = await save(db)
  await cache.set('search-index', serialized, { ttl: 3600 })
}

return search(db, params)

Performance Checklist

✅ Use insertMultiple: batch insertions are 10-100x faster than individual inserts
✅ Limit results: keep limit under 100 for fast responses
✅ Optimize schema: index only searchable fields
✅ Set reasonable tolerance: a tolerance of 1 is usually sufficient
✅ Use property selection: return only the fields you need
✅ Exclude vectors: set includeVectors: false in vector search responses
✅ Cache serialized DB: load pre-built indexes instead of re-indexing
✅ Monitor performance: track elapsed times and optimize

Benchmarking Your Setup

import { create, insertMultiple, search } from '@orama/orama'

async function benchmark() {
  const db = create({ schema: { title: 'string', body: 'string' } })
  
  // Indexing benchmark
  const docs = Array.from({ length: 10000 }, (_, i) => ({
    title: `Document ${i}`,
    body: `This is the body content of document ${i}`
  }))
  
  const indexStart = performance.now()
  await insertMultiple(db, docs, 1000)
  const indexTime = performance.now() - indexStart
  console.log(`Indexed 10k docs in ${indexTime.toFixed(2)}ms`)
  
  // Search benchmark
  const searchStart = performance.now()
  const results = await search(db, { term: 'document', limit: 20 })
  const searchTime = performance.now() - searchStart
  console.log(`Search completed in ${searchTime.toFixed(2)}ms`)
  console.log(`Found ${results.count} results`)
}

benchmark()

Next Steps

Deployment Guide: deploy your optimized Orama instance

Plugin System: extend Orama with performance plugins

Vector Search: optimize vector search performance

Data Persistence: implement efficient caching strategies
