
Vector Search

Vector search allows you to perform semantic similarity searches using embeddings. This enables finding documents based on meaning rather than exact keyword matches. To perform a vector search, set the mode to 'vector' (or use the MODE_VECTOR_SEARCH constant) and provide the query vector together with the schema property to compare against:
import { create, insertMultiple, search, MODE_VECTOR_SEARCH } from '@orama/orama'

const db = create({
  schema: {
    title: 'string',
    embedding: 'vector[5]' // 5-dimensional vector
  }
})

insertMultiple(db, [
  { title: 'The Prestige', embedding: [0.938293, 0.284951, 0.348264, 0.948276, 0.56472] },
  { title: 'Barbie', embedding: [0.192839, 0.028471, 0.284738, 0.937463, 0.092827] },
  { title: 'Oppenheimer', embedding: [0.827391, 0.927381, 0.001982, 0.983821, 0.294841] }
])

const results = search(db, {
  mode: MODE_VECTOR_SEARCH,
  vector: {
    value: [0.938292, 0.284961, 0.248264, 0.748276, 0.26472],
    property: 'embedding'
  },
  similarity: 0.85
})

Vector Schema Definition

Define vector properties in your schema with a specific dimension size:
const db = create({
  schema: {
    description: 'string',
    // Vector size must be declared during schema initialization
    embedding: 'vector[1536]' // OpenAI ada-002 embeddings
  }
})
The vector dimension must match the size of embeddings you provide. Common sizes:
  • OpenAI ada-002: 1536 dimensions
  • OpenAI text-embedding-3-small: 1536 dimensions by default (reducible, e.g. to 512, via the API's dimensions parameter)
  • Sentence transformers: 384-768 dimensions
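Because a mismatched length will make the insert fail, it can help to validate embeddings before they reach the database. A minimal sketch (the assertDimensions helper is hypothetical, not part of Orama):

```javascript
// Hypothetical guard (not part of Orama): verify an embedding's length
// matches the dimension declared in the schema before inserting it.
const VECTOR_SIZE = 1536 // must match 'vector[1536]' in the schema

function assertDimensions(embedding, expected = VECTOR_SIZE) {
  if (embedding.length !== expected) {
    throw new Error(`Expected ${expected} dimensions, got ${embedding.length}`)
  }
  return embedding
}
```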

Search Parameters

vector

The vector configuration specifies which embedding to search with:
const results = search(db, {
  mode: 'vector',
  vector: {
    value: new Float32Array([0.1, 0.2, 0.3]),
    property: 'embedding'
  }
})
vector.value — number[] | Float32Array (required)
The embedding vector to search with. Must match the dimension defined in the schema.

vector.property — string (required)
The schema property containing the embeddings to compare against.

similarity

Set the minimum similarity threshold for results:
const results = search(db, {
  mode: 'vector',
  vector: {
    value: embeddings,
    property: 'embedding'
  },
  similarity: 0.8 // Default: 0.8 (range: 0-1)
})
Similarity is calculated using cosine similarity. A value of 1.0 means the vectors point in the same direction, 0.0 means they are orthogonal (unrelated), and negative values mean they point in opposite directions.
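The calculation can be sketched in a few lines (a plain-JavaScript illustration, not Orama's internal implementation):

```javascript
// Cosine similarity: the dot product of two vectors divided by the
// product of their magnitudes.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

cosineSimilarity([1, 0], [1, 0]) // identical direction → 1
cosineSimilarity([1, 0], [0, 1]) // orthogonal → 0
```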

limit and offset

Control pagination of vector search results:
const results = search(db, {
  mode: 'vector',
  vector: {
    value: embeddings,
    property: 'embedding'
  },
  limit: 10,  // Return 10 results (default: 10)
  offset: 0   // Skip first 0 results (default: 0)
})
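If you need to walk through all matches, you can page with limit and offset in a loop. A sketch, assuming the standard result shape with hits and count (runSearch is a hypothetical wrapper around a call to search with the given limit and offset):

```javascript
// Offset-based pagination: keep fetching pages until all `count`
// matches have been collected. `runSearch(limit, offset)` stands in
// for any function returning { hits, count }.
async function collectAllHits(runSearch, pageSize = 10) {
  const all = []
  let offset = 0
  while (true) {
    const { hits, count } = await runSearch(pageSize, offset)
    all.push(...hits)
    offset += pageSize
    if (all.length >= count || hits.length === 0) break
  }
  return all
}
```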

includeVectors

Control whether to include vectors in the response:
const results = search(db, {
  mode: 'vector',
  vector: {
    value: embeddings,
    property: 'embedding'
  },
  includeVectors: true // Default: false
})
Vectors can be very large. By default, Orama sets vectors to null in responses. Only set includeVectors: true if you need the actual embedding values.

Using with Secure Proxy Plugin

The Secure Proxy plugin can automatically convert search terms to vectors:
import { create } from '@orama/orama'
import { pluginSecureProxy } from '@orama/plugin-secure-proxy'

const db = create({
  schema: {
    title: 'string',
    description: 'string',
    embedding: 'vector[1536]'
  },
  plugins: [
    await pluginSecureProxy({ 
      apiKey: 'your-api-key',
      defaultProperty: 'embedding' 
    })
  ]
})

// The plugin will automatically convert the term to a vector
const result = search(db, {
  mode: 'vector',
  term: 'Noise cancelling headphones'
})

Generating Embeddings

With Plugin Embeddings

Use the embeddings plugin to generate vectors automatically:
import { create, insert, search } from '@orama/orama'
import { pluginEmbeddings } from '@orama/plugin-embeddings'
import '@tensorflow/tfjs-node'

const plugin = await pluginEmbeddings({
  embeddings: {
    defaultProperty: 'embeddings',
    onInsert: {
      generate: true,
      properties: ['description'],
      verbose: true
    }
  }
})

const db = create({
  schema: {
    description: 'string',
    embeddings: 'vector[512]' // Plugin generates 512-dimension vectors
  },
  plugins: [plugin]
})

// Embeddings generated automatically at insert time
await insert(db, { 
  description: 'Noise cancelling headphones' 
})

// Embeddings generated automatically at search time
const results = await search(db, {
  term: 'Headphones for students',
  mode: 'vector'
})
The @orama/plugin-embeddings plugin uses TensorFlow.js models and generates 512-dimensional vectors.

Manual Embedding Generation

You can generate embeddings using any embedding model:
import { create, insert, search } from '@orama/orama'
import OpenAI from 'openai'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

async function generateEmbedding(text) {
  const response = await openai.embeddings.create({
    model: 'text-embedding-ada-002',
    input: text
  })
  return response.data[0].embedding
}

const db = create({
  schema: {
    title: 'string',
    embedding: 'vector[1536]'
  }
})

const embedding = await generateEmbedding('Noise cancelling headphones')
insert(db, {
  title: 'Premium Headphones',
  embedding
})

const queryEmbedding = await generateEmbedding('best headphones')
const results = search(db, {
  mode: 'vector',
  vector: {
    value: queryEmbedding,
    property: 'embedding'
  }
})

Performance Considerations

1. Choose Appropriate Vector Dimensions
Smaller dimensions (384-512) are faster but less precise. Larger dimensions (1536) are more accurate but slower.

2. Use Float32Array
For better performance, use Float32Array instead of regular arrays:
const vector = new Float32Array([0.1, 0.2, 0.3, ...])

3. Set an Appropriate Similarity Threshold
Higher thresholds (0.9+) return fewer, more relevant results and are faster.

4. Limit Result Count
Use smaller limit values to improve performance:
search(db, { mode: 'vector', vector: {...}, limit: 5 })
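If your embeddings arrive as plain arrays (e.g. parsed from a JSON API response), a one-off conversion lets you reuse the typed array across searches instead of converting on every call:

```javascript
// Convert a plain number[] into a Float32Array once, up front.
const raw = [0.938293, 0.284951, 0.348264]
const vector = Float32Array.from(raw)

// Note: Float32Array stores 32-bit floats, so values lose a little
// precision compared to JavaScript's 64-bit numbers.
```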

Combining with Filters

You can combine vector search with filters:
const results = search(db, {
  mode: 'vector',
  vector: {
    value: embeddings,
    property: 'embedding'
  },
  where: {
    price: {
      lt: 100
    }
  }
})

Related pages:
  • Hybrid Search — Combine vector and full-text search
  • Filters — Filter vector search results
  • Facets — Generate facets from vector search results
  • Plugin Embeddings — Auto-generate embeddings with TensorFlow.js
