Overview
The upsert() method inserts new documents or updates existing ones in a collection. Documents are identified by their _id field. If a document with the given _id already exists, it will be completely replaced with the new document.
Every document must have an _id field. This is the unique identifier for the document in the collection.
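Because every document needs an _id, it can help to check a batch locally before sending it. The following is a minimal sketch, not part of topk-js: the helper name findInvalidIds is hypothetical, and it assumes _id values are non-empty strings, as in the examples on this page.

```typescript
// Hypothetical helper, not part of topk-js: checks documents before calling upsert().
type CandidateDoc = Record<string, unknown>;

// Returns the indexes of documents whose _id is missing, empty, or not a string.
// Assumption: _id values are non-empty strings, as in the examples on this page.
function findInvalidIds(docs: CandidateDoc[]): number[] {
  const invalid: number[] = [];
  docs.forEach((doc, index) => {
    const id = doc["_id"];
    if (typeof id !== "string" || id.length === 0) {
      invalid.push(index);
    }
  });
  return invalid;
}

const candidates: CandidateDoc[] = [
  { _id: "book_1", title: "The Great Gatsby" },
  { title: "Missing identifier" },
  { _id: "", title: "Empty identifier" },
];

// Indexes 1 and 2 are reported; index 0 is valid.
const invalidIndexes = findInvalidIds(candidates);
console.log(invalidIndexes);
```

Reporting indexes rather than throwing lets the caller decide whether to drop the offending documents or abort the whole batch.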
Method Signature
upsert(docs: Array<Record<string, any>>): Promise<string>
def upsert(documents: Sequence[Mapping[str, Any]]) -> str:
    """Insert or update documents in the collection."""
Parameters
docs (Array<Record<string, any>>, required): An array of document objects to upsert. Each document must include an _id field.
Returns
The Log Sequence Number (LSN) at which the upsert was applied, as a string. Subsequent operations can use it to guarantee that they observe this write.
Examples
Basic Upsert
import { Client } from "topk-js";
import { f32Vector } from "topk-js/data";

const client = new Client({
  apiKey: "your-api-key",
  region: "us-east-1"
});

const lsn = await client.collection("books").upsert([
  {
    _id: "book_1",
    title: "The Great Gatsby",
    author: "F. Scott Fitzgerald",
    published_year: 1925,
    genres: ["Fiction", "Classic"],
    embedding: f32Vector([0.1, 0.2, 0.3, 0.4])
  },
  {
    _id: "book_2",
    title: "To Kill a Mockingbird",
    author: "Harper Lee",
    published_year: 1960,
    genres: ["Fiction", "Classic"],
    embedding: f32Vector([0.5, 0.6, 0.7, 0.8])
  }
]);

console.log(`Upserted at LSN: ${lsn}`);
Upserting with Sparse Vectors
import { Client } from "topk-js";
import { f32SparseVector } from "topk-js/data";

const client = new Client({
  apiKey: "your-api-key",
  region: "us-east-1"
});

const lsn = await client.collection("documents").upsert([
  {
    _id: "doc_1",
    content: "Machine learning and artificial intelligence",
    sparse_embedding: f32SparseVector({
      0: 0.42,
      15: 0.89,
      127: 0.31,
      543: 0.67
    })
  }
]);
Batch Upsert
import { Client } from "topk-js";

const client = new Client({
  apiKey: "your-api-key",
  region: "us-east-1"
});

// Prepare a large batch of documents
const documents = [];
for (let i = 0; i < 1000; i++) {
  documents.push({
    _id: `product_${i}`,
    name: `Product ${i}`,
    price: Math.random() * 100,
    category: ["electronics", "books", "clothing"][i % 3],
    in_stock: Math.random() > 0.5
  });
}

const lsn = await client.collection("products").upsert(documents);
console.log(`Batch upserted ${documents.length} documents at LSN: ${lsn}`);
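When a payload grows large, one option is to split it into several smaller upsert calls. The sketch below shows one way to do that with a hypothetical chunk helper that is not part of topk-js. The batch size of 300 is an arbitrary illustrative value; this page does not state a maximum batch size.

```typescript
// Hypothetical helper, not part of topk-js: splits an array into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Build the same kind of payload as the example above.
const products = Array.from({ length: 1000 }, (_, i) => ({
  _id: `product_${i}`,
  name: `Product ${i}`,
}));

// 300 is an arbitrary illustrative batch size, not a documented limit.
const productBatches = chunk(products, 300);

// Three full batches of 300 followed by one batch of 100.
console.log(productBatches.map((batch) => batch.length));

// Each batch could then be sent in its own call, for example:
// for (const batch of productBatches) {
//   await client.collection("products").upsert(batch);
// }
```

Each call returns its own LSN, so a caller that needs the position of the full load would keep the value returned by the final call.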
Replacing Existing Documents
When you upsert a document with an existing _id, the entire document is replaced. Fields from the old document that are not included in the new document will be removed.
import { Client } from "topk-js";

const client = new Client({
  apiKey: "your-api-key",
  region: "us-east-1"
});

// First upsert
await client.collection("books").upsert([{
  _id: "book_1",
  title: "Original Title",
  author: "Original Author",
  year: 2020
}]);

// Second upsert - replaces the entire document
await client.collection("books").upsert([{
  _id: "book_1",
  title: "Updated Title",
  author: "Updated Author"
  // Note: 'year' field will be removed
}]);
To merge fields rather than replace the entire document, use the update() method.
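The difference between replacing and merging can be illustrated locally with plain objects. This is an in-memory sketch only: the functions applyReplace and applyMerge are hypothetical, the Map stands in for a collection, and the shallow merge shown here is an assumption about what "merge" means, not a description of how update() is implemented.

```typescript
// In-memory illustration only; this is not how the service stores documents.
interface StoredDoc {
  _id: string;
  [field: string]: unknown;
}

// Replace semantics, as described for upsert(): the stored document becomes exactly the new one.
function applyReplace(store: Map<string, StoredDoc>, doc: StoredDoc): void {
  store.set(doc._id, { ...doc });
}

// Shallow merge, assumed here for contrast: new fields overwrite, other fields are kept.
function applyMerge(store: Map<string, StoredDoc>, doc: StoredDoc): void {
  const existing = store.get(doc._id);
  store.set(doc._id, { ...existing, ...doc });
}

const replacedStore = new Map<string, StoredDoc>();
applyReplace(replacedStore, { _id: "book_1", title: "Original Title", year: 2020 });
applyReplace(replacedStore, { _id: "book_1", title: "Updated Title" });

const mergedStore = new Map<string, StoredDoc>();
applyReplace(mergedStore, { _id: "book_1", title: "Original Title", year: 2020 });
applyMerge(mergedStore, { _id: "book_1", title: "Updated Title" });

const afterReplace = replacedStore.get("book_1");
const afterMerge = mergedStore.get("book_1");

console.log(afterReplace?.["year"]); // undefined: the field was dropped by the replacement
console.log(afterMerge?.["year"]); // 2020: the field was kept by the merge
```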
Best Practices
- Batch Operations: For better performance, upsert multiple documents in a single call rather than making a separate call for each document.
- Unique IDs: Ensure each _id is unique within your collection. Use meaningful identifiers that help you reference documents later.
- Vector Data: Use the appropriate data constructor functions (f32Vector, f32SparseVector, etc.) when including vector embeddings in your documents.
- Schema Compliance: Make sure your documents conform to the collection’s schema, especially for required fields.
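For the unique-ID practice above, a batch can be checked for repeated _id values before it is sent. This page does not say how a single call handles two documents with the same _id, so the sketch below simply reports them. The helper findDuplicateIds is hypothetical and not part of topk-js.

```typescript
// Hypothetical helper, not part of topk-js: reports _id values that occur more than once.
function findDuplicateIds(docs: Array<{ _id: string }>): string[] {
  const seen = new Set<string>();
  const duplicates = new Set<string>();
  for (const doc of docs) {
    if (seen.has(doc._id)) {
      duplicates.add(doc._id);
    } else {
      seen.add(doc._id);
    }
  }
  return Array.from(duplicates);
}

const pending = [
  { _id: "doc_1", content: "first version" },
  { _id: "doc_2", content: "unrelated" },
  { _id: "doc_1", content: "second version" },
];

// Only "doc_1" appears more than once in this batch.
const repeatedIds = findDuplicateIds(pending);
console.log(repeatedIds);
```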
update() - Merge updates into existing documents without replacing them
get() - Retrieve documents by their IDs
delete() - Remove documents from the collection
query() - Search and filter documents