Quickstart

This quickstart guide walks you through creating a semantic search application with TopK. You’ll learn how to create a collection, add documents, and perform searches.

Before starting, make sure you have:

Installed the TopK SDK (Installation guide)
Obtained an API key from console.topk.io

What you’ll build

By the end of this guide, you’ll have a working search application that can:

Create a collection - Set up a collection with semantic search enabled
Insert documents - Add sample book data to your collection
Search semantically - Query your collection using natural language
Get ranked results - Retrieve and optionally rerank the most relevant results

Step 1: Initialize the client

First, create a TopK client with your API key and chosen region.

from topk_sdk import Client

client = Client(
    api_key="YOUR_TOPK_API_KEY",
    region="aws-us-east-1-elastica"
)

Replace YOUR_TOPK_API_KEY with your actual API key from the console.
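
To keep a real key out of source control, the key can be read from the environment instead of being hardcoded. The following is a minimal sketch: the variable name TOPK_API_KEY and the helper load_api_key are assumptions made for illustration, not something the TopK SDK defines.

```python
import os


def load_api_key(env_var: str = "TOPK_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical convention chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your TopK API key")
    return key


# Usage with the client from this step (commented out so the sketch runs offline):
# client = Client(api_key=load_api_key(), region="aws-us-east-1-elastica")
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the first request.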

Step 2: Create a collection

Create a collection called books with a semantic index on the title field. This enables both semantic and keyword search.

from topk_sdk.schema import text, semantic_index

client.collections().create(
    "books",
    schema={
        "title": text().required().index(semantic_index()),
    },
)

What’s happening here?

text() - Defines a text field
.required() - Makes the field mandatory for all documents
.index(semantic_index()) - Creates a semantic index with automatic embeddings

Fields not defined in the schema can still be added to documents. The schema only enforces types and indexes for specified fields.
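
The rule described above (required fields must be present, extra fields pass through) can be mirrored in a small client-side check. This is a plain-Python sketch, not SDK behavior: check_document and REQUIRED_TEXT_FIELDS are hypothetical names, and the server-side validation may differ in its details.

```python
from typing import Any, Dict, List

# Hypothetical local mirror of the schema above: one required text field.
REQUIRED_TEXT_FIELDS = ("title",)


def check_document(doc: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the document looks valid."""
    problems = []
    for name in REQUIRED_TEXT_FIELDS:
        if name not in doc:
            problems.append(f"missing required field: {name}")
        elif not isinstance(doc[name], str):
            problems.append(f"field {name} must be text")
    # Fields outside the schema (such as "year" below) are deliberately not checked.
    return problems


print(check_document({"_id": "gatsby", "title": "The Great Gatsby", "year": 1925}))
print(check_document({"_id": "untitled"}))
```

The first document passes even though year is not in the schema. The second is flagged because title is missing.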

Step 3: Add documents

Insert some sample book documents into your collection. Each document must have an _id field.

client.collection("books").upsert([
    {"_id": "gatsby", "title": "The Great Gatsby"},
    {"_id": "1984", "title": "1984"},
    {"_id": "catcher", "title": "The Catcher in the Rye"},
])

The upsert() method creates new documents or updates existing ones if a document with the same _id already exists.
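
The insert-or-update behavior can be pictured with an in-memory dictionary keyed by _id. This is only a conceptual sketch: the local upsert function below is hypothetical, and it assumes that a document with an existing _id replaces the stored one wholesale.

```python
from typing import Any, Dict, Iterable


def upsert(store: Dict[str, Dict[str, Any]], docs: Iterable[Dict[str, Any]]) -> None:
    """Insert each document, or overwrite the stored one with the same _id."""
    for doc in docs:
        if "_id" not in doc:
            raise ValueError("every document needs an _id field")
        # Assumption: the new document replaces the old one wholesale.
        store[doc["_id"]] = dict(doc)


store: Dict[str, Dict[str, Any]] = {}
upsert(store, [{"_id": "gatsby", "title": "The Great Gatsby"}])
upsert(store, [{"_id": "gatsby", "title": "The Great Gatsby (1925)"}])
print(len(store), store["gatsby"]["title"])
```

After the second call the store still holds a single document, now carrying the newer title.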

Step 4: Search your collection

Now perform a semantic search to find books related to “classic American novel”.

from topk_sdk.query import select, fn, field

results = client.collection("books").query(
    select(
        "title",
        # Calculate semantic similarity between title and query
        title_similarity=fn.semantic_similarity(
            "title",
            "classic American novel"
        ),
    )
    # Sort by similarity and return top 10 results
    .topk(field("title_similarity"), 10)
)

for doc in results:
    print(f"{doc['title']}: {doc['title_similarity']:.4f}")

Understanding the query

Let’s break down the query:

select() - Defines which fields to return and computed expressions
fn.semantic_similarity() - Computes semantic similarity score between the field and query text
.topk() - Sorts by the similarity score in descending order and keeps at most the top 10 results
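
As a rough mental model of the score-then-rank pattern, the sketch below scores each title against a query and keeps the k best. Everything in it is an assumption for illustration: the three-dimensional vectors are made up rather than real embeddings, and cosine similarity is used as a common choice, since this guide does not say which measure or model TopK applies.

```python
import heapq
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float], doc_vecs: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Score every document, then return the k highest-scoring (title, score) pairs."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in doc_vecs.items()]
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


# Made-up vectors standing in for real embeddings.
toy_vectors = {
    "The Great Gatsby": [0.9, 0.1, 0.0],
    "The Catcher in the Rye": [0.7, 0.3, 0.1],
    "1984": [0.2, 0.9, 0.1],
}
toy_query = [1.0, 0.2, 0.0]

for title, score in top_k(toy_query, toy_vectors, k=2):
    print(f"{title}: {score:.4f}")
```

With k=2 only the two closest titles are printed, which is the same limiting role the second argument of .topk() plays in the query.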

Expected output

The Great Gatsby: 0.8532
The Catcher in the Rye: 0.7891
1984: 0.6234

Scores are similarity values: higher means more relevant. The numbers shown here are illustrative, and the exact values may vary based on the embedding model.
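
Because the exact scores can change, code that checks search results is usually more robust when it asserts on ordering rather than on specific values. The helper below is a hypothetical sketch that works on plain dictionaries shaped like the printed documents. The sample scores are the illustrative numbers from the expected output.

```python
from typing import Any, Dict, List


def is_sorted_by_score(rows: List[Dict[str, Any]], key: str = "title_similarity") -> bool:
    """Return True if the scores never increase from one row to the next."""
    scores = [row[key] for row in rows]
    return all(a >= b for a, b in zip(scores, scores[1:]))


# Plain dicts standing in for query results; values copied from the illustrative output.
sample = [
    {"title": "The Great Gatsby", "title_similarity": 0.8532},
    {"title": "The Catcher in the Rye", "title_similarity": 0.7891},
    {"title": "1984", "title_similarity": 0.6234},
]
print(is_sorted_by_score(sample))
```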

Step 5: Add reranking (optional)

Improve result relevance by adding reranking to your query:

from topk_sdk.query import select, fn, field

results = client.collection("books").query(
    select(
        "title",
        title_similarity=fn.semantic_similarity(
            "title",
            "classic American novel"
        ),
    )
    .topk(field("title_similarity"), 10)
    # Add reranking to improve relevance
    .rerank()
)

The .rerank() method applies TopK’s built-in reranking model to the candidates returned by .topk(), which can improve precision at the top of the list.
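
Conceptually, reranking is a second pass: a first stage retrieves candidates, and a second scorer reorders just those candidates. The sketch below illustrates the pattern only. The word_overlap scorer is a deliberately simple, hypothetical stand-in and has nothing to do with the model TopK actually uses.

```python
from typing import Any, Callable, Dict, List


def word_overlap(query: str, doc: Dict[str, Any]) -> float:
    """Fraction of query words that also appear in the title (toy second-stage scorer)."""
    query_words = set(query.lower().split())
    title_words = set(doc["title"].lower().split())
    return len(query_words & title_words) / len(query_words) if query_words else 0.0


def rerank(
    candidates: List[Dict[str, Any]],
    query: str,
    scorer: Callable[[str, Dict[str, Any]], float],
) -> List[Dict[str, Any]]:
    """Reorder first-stage candidates by the second-stage score, best first."""
    # sorted() is stable, so ties keep their first-stage order.
    return sorted(candidates, key=lambda doc: scorer(query, doc), reverse=True)


# Candidates in the order a first stage might have returned them.
first_stage = [
    {"title": "The Great Gatsby"},
    {"title": "The Catcher in the Rye"},
    {"title": "1984"},
]
for doc in rerank(first_stage, "catcher rye", word_overlap):
    print(doc["title"])
```

Only the candidate list is rescored, which is why a more expensive second-stage model stays affordable: it sees at most the k documents the first stage kept.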

Learn more about reranking options in the Reranking guide.

Step 6: Clean up (optional)

When you’re done experimenting, you can delete the collection:

client.collections().delete("books")

Deleting a collection is irreversible. All documents and data in the collection will be permanently deleted.
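
Since deletion cannot be undone, scripts sometimes wrap it in an explicit confirmation step. The following is a hypothetical sketch: guarded_delete is not part of the SDK, and it takes the delete callable as a parameter so the example runs without a network connection.

```python
from typing import Callable, List


def guarded_delete(delete_fn: Callable[[str], object], name: str, confirm: str) -> None:
    """Call delete_fn(name) only if the caller repeated the collection name exactly."""
    if confirm != name:
        raise ValueError(f"confirmation {confirm!r} does not match collection {name!r}")
    delete_fn(name)


# Stand-in for client.collections().delete so the sketch runs offline.
deleted: List[str] = []
guarded_delete(deleted.append, "books", confirm="books")
print(deleted)

# With a real client, the call would look like:
# guarded_delete(client.collections().delete, "books", confirm="books")
```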

Complete example

Here’s the full code in one place:

from topk_sdk import Client
from topk_sdk.schema import text, semantic_index
from topk_sdk.query import select, fn, field

# 1. Initialize client
client = Client(
    api_key="YOUR_TOPK_API_KEY",
    region="aws-us-east-1-elastica"
)

# 2. Create collection
client.collections().create(
    "books",
    schema={
        "title": text().required().index(semantic_index()),
    },
)

# 3. Add documents
client.collection("books").upsert([
    {"_id": "gatsby", "title": "The Great Gatsby"},
    {"_id": "1984", "title": "1984"},
    {"_id": "catcher", "title": "The Catcher in the Rye"},
])

# 4. Search with semantic similarity
results = client.collection("books").query(
    select(
        "title",
        title_similarity=fn.semantic_similarity(
            "title",
            "classic American novel"
        ),
    )
    .topk(field("title_similarity"), 10)
    .rerank()
)

# 5. Display results
for doc in results:
    print(f"{doc['title']}: {doc['title_similarity']:.4f}")

# 6. Clean up (optional)
client.collections().delete("books")

Next steps

Now that you’ve built your first search application, explore more advanced features:

Schema design - Learn about field types, indexes, and schema validation
Advanced queries - Combine filters, keyword search, and semantic search
Vector search - Use custom embeddings for vector similarity search
Keyword search - Perform traditional text matching with BM25 scoring
