
Overview

The upsert() method inserts new documents or updates existing ones in a collection. Documents are identified by their _id field. If a document with the given _id already exists, it will be completely replaced with the new document.
Every document must have an _id field. This is the unique identifier for the document in the collection.

Method Signature

upsert(docs: Array<Record<string, any>>): Promise<string>

Parameters

docs (Array<Record<string, any>>, required)
An array of document objects to upsert. Each document must include an _id field.
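
Because every document needs an _id, it can help to check a batch locally before sending it. The helper below is a minimal sketch and not part of topk-js: the name assertHasIds is hypothetical, and the check for a non-empty string is an assumption based on the string IDs used in the examples on this page.

```javascript
// Hypothetical helper (not part of topk-js): fail fast when a document
// has no usable _id. Requiring a non-empty string is an assumption.
function assertHasIds(docs) {
  docs.forEach((doc, index) => {
    if (typeof doc._id !== "string" || doc._id.length === 0) {
      throw new Error(`Document at index ${index} is missing a string _id`);
    }
  });
  return docs; // returned unchanged so the call can be chained
}

const docs = assertHasIds([
  { _id: "book_1", title: "The Great Gatsby" },
  { _id: "book_2", title: "To Kill a Mockingbird" }
]);
// `docs` could now be passed to client.collection("books").upsert(docs).
```

Throwing before the request is made keeps a malformed batch from being sent at all.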

Returns

lsn (string)
The Log Sequence Number (LSN) at which the upsert was applied. This can be used for consistency guarantees in subsequent operations.

Examples

Basic Upsert

import { Client } from "topk-js";
import { f32Vector } from "topk-js/data";

const client = new Client({
  apiKey: "your-api-key",
  region: "us-east-1"
});

const lsn = await client.collection("books").upsert([
  {
    _id: "book_1",
    title: "The Great Gatsby",
    author: "F. Scott Fitzgerald",
    published_year: 1925,
    genres: ["Fiction", "Classic"],
    embedding: f32Vector([0.1, 0.2, 0.3, 0.4])
  },
  {
    _id: "book_2",
    title: "To Kill a Mockingbird",
    author: "Harper Lee",
    published_year: 1960,
    genres: ["Fiction", "Classic"],
    embedding: f32Vector([0.5, 0.6, 0.7, 0.8])
  }
]);

console.log(`Upserted at LSN: ${lsn}`);

Upserting with Sparse Vectors

import { Client } from "topk-js";
import { f32SparseVector } from "topk-js/data";

const client = new Client({
  apiKey: "your-api-key",
  region: "us-east-1"
});

const lsn = await client.collection("documents").upsert([
  {
    _id: "doc_1",
    content: "Machine learning and artificial intelligence",
    sparse_embedding: f32SparseVector({
      0: 0.42,
      15: 0.89,
      127: 0.31,
      543: 0.67
    })
  }
]);

Batch Upsert

import { Client } from "topk-js";

const client = new Client({
  apiKey: "your-api-key",
  region: "us-east-1"
});

// Prepare a large batch of documents
const documents = [];
for (let i = 0; i < 1000; i++) {
  documents.push({
    _id: `product_${i}`,
    name: `Product ${i}`,
    price: Math.random() * 100,
    category: ["electronics", "books", "clothing"][i % 3],
    in_stock: Math.random() > 0.5
  });
}

const lsn = await client.collection("products").upsert(documents);
console.log(`Batch upserted ${documents.length} documents at LSN: ${lsn}`);
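
If a batch is too large to send comfortably in one request (this page does not state any size limit), one option is to split it into smaller slices and upsert them one after another. The sketch below uses two hypothetical helpers, chunk and upsertInBatches, and takes the upsert call as a parameter so it does not depend on a live client; the batch size is an arbitrary choice.

```javascript
// Hypothetical helper: split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Hypothetical helper: `upsertFn` stands in for
// (docs) => client.collection("products").upsert(docs).
async function upsertInBatches(upsertFn, documents, batchSize) {
  let lastLsn;
  for (const batch of chunk(documents, batchSize)) {
    lastLsn = await upsertFn(batch); // sequential: one request at a time
  }
  return lastLsn; // LSN returned by the final call; undefined for an empty input
}

// Usage (assumes a configured `client` and a `documents` array as above):
// const lsn = await upsertInBatches(
//   (docs) => client.collection("products").upsert(docs),
//   documents,
//   200
// );
```

Running the slices sequentially keeps the order of writes predictable; whether parallel requests are worthwhile depends on limits not covered here.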

Replacing Existing Documents

When you upsert a document with an existing _id, the entire document is replaced. Fields from the old document that are not included in the new document will be removed.

import { Client } from "topk-js";

const client = new Client({
  apiKey: "your-api-key",
  region: "us-east-1"
});

// First upsert
await client.collection("books").upsert([{
  _id: "book_1",
  title: "Original Title",
  author: "Original Author",
  year: 2020
}]);

// Second upsert - replaces the entire document
await client.collection("books").upsert([{
  _id: "book_1",
  title: "Updated Title",
  author: "Updated Author"
  // Note: 'year' field will be removed
}]);

To merge fields rather than replace the entire document, use the update() method.
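
The difference between replacing and merging can be illustrated with plain objects. The snippet below is an in-memory model only: it does not call topk-js, the names store, replaceDoc and mergeDoc are hypothetical, and how update() treats an _id that does not exist yet is not covered on this page.

```javascript
// In-memory model of the two behaviors; it does not call topk-js.
const store = new Map();

// Replace semantics, as described for upsert(): the stored document becomes exactly `doc`.
function replaceDoc(doc) {
  store.set(doc._id, { ...doc });
}

// Merge semantics, as described for update(): new fields overwrite, other fields are kept.
// Creating the document when the _id is missing is an assumption of this sketch.
function mergeDoc(doc) {
  store.set(doc._id, { ...(store.get(doc._id) ?? {}), ...doc });
}

const original = { _id: "book_1", title: "Original Title", author: "Original Author", year: 2020 };

replaceDoc(original);
replaceDoc({ _id: "book_1", title: "Updated Title", author: "Updated Author" });
const replaced = store.get("book_1");
console.log("year" in replaced); // false: the omitted field is gone

replaceDoc(original);
mergeDoc({ _id: "book_1", title: "Updated Title" });
const merged = store.get("book_1");
console.log(merged.year); // 2020: the untouched field survives
```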

Best Practices

  1. Batch Operations: For better performance, upsert multiple documents in a single call rather than making individual calls for each document.
  2. Unique IDs: Ensure each _id is unique within your collection. Use meaningful identifiers that help you reference documents later.
  3. Vector Data: Use the appropriate data constructor functions (f32Vector, f32SparseVector, etc.) when including vector embeddings in your documents.
  4. Schema Compliance: Make sure your documents conform to the collection’s schema, especially for required fields.
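
The unique-ID practice above can be checked before a batch is sent. This page does not say how repeated _id values within a single upsert() call are handled, so catching them locally avoids relying on unspecified behavior. The helper name findDuplicateIds is hypothetical and not part of topk-js.

```javascript
// Hypothetical helper: return every _id that appears more than once in a batch.
function findDuplicateIds(docs) {
  const seen = new Set();
  const duplicates = new Set();
  for (const doc of docs) {
    if (seen.has(doc._id)) {
      duplicates.add(doc._id);
    }
    seen.add(doc._id);
  }
  return [...duplicates]; // each duplicated _id is listed once
}

const duplicateIds = findDuplicateIds([
  { _id: "product_1", name: "Product 1" },
  { _id: "product_2", name: "Product 2" },
  { _id: "product_1", name: "Product 1 (again)" }
]);
console.log(duplicateIds); // [ 'product_1' ]
```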

Related Methods

  • update() - Merge updates into existing documents without replacing them
  • get() - Retrieve documents by their IDs
  • delete() - Remove documents from the collection
  • query() - Search and filter documents
