Overview

The Orama Embeddings Plugin lets you generate fast text embeddings offline, at both insert and search time, directly on your machine - no external API or OpenAI account needed! The plugin uses TensorFlow.js and the Universal Sentence Encoder model to create 512-dimensional vector embeddings from text.
For cloud-based embeddings with multiple model options, see the Secure Proxy Plugin.

Installation

Install the embeddings plugin:
npm install @orama/plugin-embeddings
You must also install a TensorFlow.js backend appropriate for your environment.

Browser Environment

For browser environments, install the WebGL backend:
npm install @tensorflow/tfjs-backend-webgl

Node.js Environment

For Node.js, install the Node.js backend:
npm install @tensorflow/tfjs-node

Available TensorFlow.js Backends

  • @tensorflow/tfjs - Default (includes CPU and WebGL backends)
  • @tensorflow/tfjs-node - Node.js with native TensorFlow bindings (recommended for Node.js)
  • @tensorflow/tfjs-backend-webgl - WebGL acceleration (recommended for browsers)
  • @tensorflow/tfjs-backend-cpu - CPU-only backend
  • @tensorflow/tfjs-node-gpu - GPU-accelerated for Node.js
  • @tensorflow/tfjs-backend-wasm - WebAssembly backend

Basic Usage

import { create, insert, search } from '@orama/orama'
import { pluginEmbeddings } from '@orama/plugin-embeddings'
import '@tensorflow/tfjs-node' // Or appropriate backend

// Initialize the plugin
const plugin = await pluginEmbeddings({
  embeddings: {
    defaultProperty: 'embeddings',
    onInsert: {
      generate: true,
      properties: ['description'],
      verbose: true
    }
  }
})

// Create database with the plugin
const db = await create({
  schema: {
    description: 'string',
    embeddings: 'vector[512]' // Plugin generates 512-dimension vectors
  },
  plugins: [plugin]
})
The Universal Sentence Encoder model generates 512-dimensional vectors, so your vector field must be defined as vector[512].

Configuration

Plugin Options

  • embeddings (object, required) - Configuration for embeddings generation

Insert-Time Embeddings

When configured with onInsert.generate: true, the plugin automatically generates embeddings when you insert documents:
await insert(db, {
  description: 'Classroom Headphones Bulk 5 Pack, Student On Ear Color Varieties'
})

await insert(db, {
  description: 'Kids Wired Headphones for School Students K-12'
})

await insert(db, {
  description: 'Kids Headphones Bulk 5-Pack for K-12 School'
})

await insert(db, {
  description: 'Bose QuietComfort Bluetooth Headphones'
})
The plugin will automatically:
  1. Combine the specified properties into a single text string
  2. Generate embeddings using the Universal Sentence Encoder
  3. Normalize the embedding vectors
  4. Store them in the configured property
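The normalization step above can be sketched in plain TypeScript. This is a simplified stand-in for the plugin's internal normalization, shown only to illustrate L2 normalization; the function name and exact behavior are assumptions, not the plugin's exported API:

```typescript
// L2-normalize a vector so it has unit length, which makes cosine
// similarity equivalent to a simple dot product at search time.
function normalizeVector(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0))
  if (norm === 0) return v
  return v.map((x) => x / norm)
}

const normalized = normalizeVector([3, 4])
// → [0.6, 0.8] (a unit-length vector)
```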

Search-Time Embeddings

Perform vector search using natural language queries:
const results = await search(db, {
  term: 'Headphones for 12th grade students',
  mode: 'vector'
})
The plugin automatically:
  1. Generates embeddings for the search term
  2. Performs vector similarity search
  3. Returns the most relevant results

Hybrid Search

Combine full-text and vector search for the best results:
const results = await search(db, {
  term: 'noise cancelling headphones',
  mode: 'hybrid'
})
Hybrid search combines traditional keyword matching with semantic vector search for improved relevance.

Multiple Properties

Generate embeddings from multiple document properties:
const plugin = await pluginEmbeddings({
  embeddings: {
    defaultProperty: 'embeddings',
    onInsert: {
      generate: true,
      properties: ['title', 'description', 'category'],
      verbose: true
    }
  }
})

const db = await create({
  schema: {
    title: 'string',
    description: 'string',
    category: 'string',
    embeddings: 'vector[512]'
  },
  plugins: [plugin]
})

await insert(db, {
  title: 'Wireless Headphones',
  description: 'Premium noise-cancelling headphones',
  category: 'Electronics'
})
// Embeddings generated from: "Wireless Headphones. Premium noise-cancelling headphones. Electronics"
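The property-combining behavior can be sketched as follows. This is a hypothetical helper mirroring what the plugin's getPropertiesValues does; the ". " separator is inferred from the example above:

```typescript
// Merge the configured string properties of a document into a single
// text passage, in the order the properties were listed.
function combineProperties(
  doc: Record<string, unknown>,
  properties: string[]
): string {
  return properties
    .map((p) => doc[p])
    .filter((v) => typeof v === 'string' && v.length > 0)
    .join('. ')
}

const text = combineProperties(
  {
    title: 'Wireless Headphones',
    description: 'Premium noise-cancelling headphones',
    category: 'Electronics'
  },
  ['title', 'description', 'category']
)
// → "Wireless Headphones. Premium noise-cancelling headphones. Electronics"
```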

How It Works

The plugin implements two main hooks:

beforeInsert Hook

From packages/plugin-embeddings/src/index.ts:41-60:
async beforeInsert<T extends TypedDocument<any>>(_db: AnyOrama, _id: string, params: PartialSchemaDeep<T>) {
  if (!pluginParams.embeddings?.onInsert?.generate) {
    return
  }

  const properties = pluginParams.embeddings.onInsert.properties
  const values = getPropertiesValues(params, properties)

  if (pluginParams.embeddings.onInsert.verbose) {
    console.log(`Generating embeddings for properties "${properties.join(', ')}": "${values}"`)
  }

  const embeddings = Array.from(await (await model.embed(values)).data())
  params[pluginParams.embeddings.defaultProperty] = normalizeVector(embeddings)
}

beforeSearch Hook

From packages/plugin-embeddings/src/index.ts:62-89:
async beforeSearch<T extends AnyOrama>(_db: AnyOrama, params: SearchParams<T, TypedDocument<any>>) {
  if (params.mode !== 'vector' && params.mode !== 'hybrid') {
    return
  }

  if (params?.vector?.value) {
    return
  }

  if (!params.term) {
    throw new Error('No "term" or "vector" parameters were provided')
  }

  const embeddings = Array.from(await (await model.embed(params.term)).data())
  
  if (!params.vector) {
    params.vector = {
      property: pluginParams.embeddings.defaultProperty,
      value: normalizeVector(embeddings)
    }
  }

  params.vector.value = normalizeVector(embeddings)
}
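Because the hook returns early when params.vector.value is already set, you can skip on-the-fly query embedding by supplying a precomputed vector yourself. The decision logic can be distilled into a small sketch (a hypothetical function, not part of the plugin's API, that mirrors the guards in the hook above):

```typescript
// Mirrors the beforeSearch guards: when does the plugin generate
// embeddings for a query?
interface QueryParams {
  mode?: string
  term?: string
  vector?: { property: string; value?: number[] }
}

function shouldGenerateEmbeddings(params: QueryParams): boolean {
  // Full-text searches never need query embeddings
  if (params.mode !== 'vector' && params.mode !== 'hybrid') return false
  // The caller supplied a precomputed vector, so nothing to generate
  if (params.vector?.value) return false
  // Nothing to embed: same error the hook throws
  if (!params.term) {
    throw new Error('No "term" or "vector" parameters were provided')
  }
  return true
}
```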

Performance Considerations

The Universal Sentence Encoder model needs to be downloaded on first use (approximately 50MB). Subsequent uses will be faster as the model is cached.

Model Loading

// Model is loaded once when the plugin is initialized
const plugin = await pluginEmbeddings({
  // configuration
})

// Reuse the same plugin instance for multiple databases
const db1 = await create({ schema, plugins: [plugin] })
const db2 = await create({ schema, plugins: [plugin] })

Backend Selection

  • WebGL: Best for browsers, GPU-accelerated
  • tfjs-node: Best for Node.js, native TensorFlow bindings
  • tfjs-node-gpu: Best for Node.js with CUDA-capable GPU
  • WASM: Good balance for browsers without WebGL

Comparison with Secure Proxy

| Feature | Embeddings Plugin | Secure Proxy Plugin |
| --- | --- | --- |
| Offline | ✅ Yes | ❌ No (requires API) |
| Model | Universal Sentence Encoder | Multiple models available |
| Dimensions | 512 | 384, 768, 1024, 1536, 3072 |
| Cost | Free | API usage fees |
| Setup | TensorFlow.js required | API key required |
| Performance | Depends on hardware | Consistent cloud performance |

Troubleshooting

Model Loading Errors

If you encounter model loading errors, ensure you have the correct TensorFlow.js backend installed:
# For Node.js
npm install @tensorflow/tfjs-node

# For browsers
npm install @tensorflow/tfjs-backend-webgl

Memory Issues

The model requires significant memory. If you encounter memory errors:
  1. Use a smaller batch size for bulk inserts
  2. Consider using the WASM backend for lower memory usage
  3. Increase Node.js memory limit: node --max-old-space-size=4096
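Point 1 above can be implemented with a simple chunking helper so large datasets are inserted batch by batch instead of all at once. The chunk helper below is not part of Orama; it is a minimal sketch:

```typescript
// Split an array into fixed-size chunks, so bulk inserts can be
// processed in small batches that keep peak memory bounded.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}

// Usage sketch (db and docs assumed to exist):
// for (const batch of chunk(docs, 50)) {
//   for (const doc of batch) await insert(db, doc)
// }
```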

Next Steps

Vector Search

Learn more about vector search in Orama

Secure Proxy Plugin

Explore cloud-based embeddings with multiple models

Build docs developers (and LLMs) love