Keyword search enables full-text search on text fields using BM25 scoring and boolean term matching. Unlike semantic search, which finds conceptually similar content, keyword search finds exact term matches and ranks them by relevance.
Overview
With keyword search, you can:
- Search for exact terms and phrases in text fields
- Use BM25 scoring for relevance ranking
- Combine multiple search terms with AND/OR logic
- Weight terms differently for custom relevance
- Filter results using match operations
Schema Setup
To enable keyword search on a field, add a keywordIndex() to a text field:
import { Client } from "topk-js";
import { text, keywordIndex } from "topk-js/schema";
const client = new Client({
  apiKey: "YOUR_API_KEY",
  region: "aws-us-east-1-elastica"
});
await client.collections().create("books", {
  title: text().index(keywordIndex()),
  summary: text().required().index(keywordIndex())
});
Inserting Documents
Insert documents with text content:
await client.collection("books").upsert([
  {
    _id: "gatsby",
    title: "The Great Gatsby",
    summary: "A story about love and the American Dream in the Jazz Age"
  },
  {
    _id: "mockingbird",
    title: "To Kill a Mockingbird",
    summary: "A novel about racial injustice and moral growth in the American South"
  },
  {
    _id: "1984",
    title: "1984",
    summary: "A dystopian story about totalitarian surveillance and control"
  }
]);
Searching with match()
Use match() to filter documents by keyword:
import { filter, match, field } from "topk-js/query";
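// Match a single term in the summary field
const results = await client.collection("books").query(
  filter(match("love", { field: "summary" }))
    // Third argument `true` sorts ascending; assumes documents carry a published_year field
    .topk(field("published_year"), 10, true)
);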
Match Options
Customize matching behavior:
import { filter, match } from "topk-js/query";
// Match with options
const results = await client.collection("books").query(
  filter(
    match("american dream", {
      field: "summary",
      weight: 2.0, // Boost this term's importance
      all: false   // Match ANY word (OR logic)
    })
  )
);
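To make the any/all distinction concrete, here is a minimal local sketch of the semantics, independent of the client. The helper names `tokenize` and `matchesTerms` are hypothetical and not part of topk-js, the tokenizer is a simplification, and the assumption that `all: true` requires every word is inferred from the comment on `all: false` above.

```typescript
// Illustrative model of any/all term matching; not the topk-js implementation.
function tokenize(text: string): string[] {
  // Lowercase and split on anything that is not a letter or digit.
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function matchesTerms(text: string, query: string, all: boolean): boolean {
  const docTokens = tokenize(text);
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return false; // nothing to match on
  const hit = (t: string) => docTokens.indexOf(t) !== -1;
  // all = true: every query word must appear (AND); all = false: any word (OR).
  return all ? queryTokens.every(hit) : queryTokens.some(hit);
}

const gatsbySummary = "A story about love and the American Dream in the Jazz Age";
const mockingbirdSummary =
  "A novel about racial injustice and moral growth in the American South";

// "american dream" with OR logic matches both summaries; with AND logic only the first.
const anyGatsby = matchesTerms(gatsbySummary, "american dream", false);
const anyMockingbird = matchesTerms(mockingbirdSummary, "american dream", false);
const allGatsby = matchesTerms(gatsbySummary, "american dream", true);
const allMockingbird = matchesTerms(mockingbirdSummary, "american dream", true);
```

With the sample documents, switching `all` from false to true narrows the result from both American-themed books to the one whose summary contains both words.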
Combining Matches
Use boolean operators to combine multiple matches:
import { filter, match } from "topk-js/query";
// OR: Match documents containing either term
const results1 = await client.collection("books").query(
  filter(
    match("love", { field: "summary" })
      .or(match("totalitarian", { field: "summary" }))
  )
);
// AND: Match documents containing both terms
const results2 = await client.collection("books").query(
  filter(
    match("racial", { field: "summary" })
      .and(match("injustice", { field: "summary" }))
  )
);
// Combining matches on different fields
const results3 = await client.collection("books").query(
  filter(
    match("love", { field: "summary" })
      .or(match("dream", { field: "title" }))
  )
);
Use the .or() and .and() methods when combining match expressions. JavaScript has no operator overloading, so | and & do not build a query: applied to two expression objects they coerce both operands to integers and return a number.
BM25 Scoring
Use fn.bm25Score() to rank results by relevance:
import { select, filter, match, field, fn } from "topk-js/query";
const results = await client.collection("books").query(
  select({
    title: field("title"),
    score: fn.bm25Score()
  })
    .filter(match("american", { field: "summary" }))
    .topk(field("score"), 10)
);
fn.bm25Score() can only be used when the filter contains at least one match() expression. It computes relevance from term frequency (how often a matched term occurs in a document) and inverse document frequency (how rare the term is across the collection).
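For intuition about what a BM25 score measures, the following is a minimal in-memory sketch of the textbook formula applied to the sample summaries. It is not the service's implementation: the `bm25Scores` helper, the simple tokenizer, and the constants `K1 = 1.2` and `B = 0.75` (common defaults) are all assumptions made for illustration.

```typescript
// Textbook BM25 over a tiny in-memory corpus; illustrative only.
type Doc = { id: string; text: string };

const K1 = 1.2;  // term-frequency saturation (assumed common default)
const B = 0.75;  // document-length normalization (assumed common default)

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function bm25Scores(docs: Doc[], query: string): Record<string, number> {
  const tokenized = docs.map((d) => ({ id: d.id, tokens: tokenize(d.text) }));
  const avgLen =
    tokenized.reduce((sum, d) => sum + d.tokens.length, 0) / tokenized.length;
  const scores: Record<string, number> = {};
  for (const term of tokenize(query)) {
    // Document frequency: how many documents contain the term.
    const df = tokenized.filter((d) => d.tokens.some((t) => t === term)).length;
    if (df === 0) continue;
    // Rare terms get a higher inverse document frequency.
    const idf = Math.log(1 + (tokenized.length - df + 0.5) / (df + 0.5));
    for (const d of tokenized) {
      const tf = d.tokens.filter((t) => t === term).length;
      if (tf === 0) continue; // only matching documents receive a score
      const norm = tf + K1 * (1 - B + B * (d.tokens.length / avgLen));
      scores[d.id] = (scores[d.id] ?? 0) + (idf * tf * (K1 + 1)) / norm;
    }
  }
  return scores;
}

const books: Doc[] = [
  { id: "gatsby", text: "A story about love and the American Dream in the Jazz Age" },
  { id: "mockingbird", text: "A novel about racial injustice and moral growth in the American South" },
  { id: "1984", text: "A dystopian story about totalitarian surveillance and control" }
];

// "american" occurs once in two summaries of equal length, so those two tie;
// "1984" does not contain the term and receives no score.
const scores = bm25Scores(books, "american");
```

In this sketch, documents that do not contain any query term get no score at all, which mirrors why a relevance score only makes sense alongside a match() expression.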
Weighted Term Matching
Boost specific terms for custom relevance:
import { select, filter, match, field, fn } from "topk-js/query";
const results = await client.collection("books").query(
  select({
    title: field("title"),
    score: fn.bm25Score()
  })
    .filter(
      // JavaScript has no operator overloading, so combine with .or() rather than |
      match("love", { weight: 30.0 })
        .or(match("young", { weight: 10.0 }))
    )
    .topk(field("score"), 10)
);
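One way to reason about weights is as multipliers on each term's contribution to the total score. The sketch below assumes exactly that linear behavior, which this page does not confirm for the service. The `weightedScore` helper and the per-term base scores of 1.0 are hypothetical placeholders, not real output.

```typescript
// Illustrative model: total score = sum of (weight * per-term score).
// The linear combination is an assumption, not documented service behavior.
type WeightedTerm = { term: string; weight: number };

function weightedScore(
  baseScores: Record<string, number>, // per-term scores for one document
  terms: WeightedTerm[]
): number {
  let total = 0;
  for (const { term, weight } of terms) {
    total += weight * (baseScores[term] ?? 0); // unmatched terms contribute 0
  }
  return total;
}

const terms: WeightedTerm[] = [
  { term: "love", weight: 30.0 },
  { term: "young", weight: 10.0 }
];

// Placeholder base scores: each document matches exactly one term equally well.
const matchesLoveOnly = weightedScore({ love: 1.0 }, terms);
const matchesYoungOnly = weightedScore({ young: 1.0 }, terms);
const matchesBoth = weightedScore({ love: 1.0, young: 1.0 }, terms);
```

Under this model, a document matching only "love" outranks one matching only "young" by the ratio of the weights, and a document matching both ranks above either.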
Field Methods for Matching
Use field methods for more expressive matching:
matchAny() and matchAll()
import { filter, field } from "topk-js/query";
// Match any of the terms (OR logic)
const results1 = await client.collection("books").query(
  filter(
    field("summary").matchAny("love dream")
  )
);
// Match all terms (AND logic)
const results2 = await client.collection("books").query(
  filter(
    field("summary").matchAll("racial injustice")
  )
);
// Pass an array of already-tokenized terms
// (assumes a keyword-indexed "tags" field, which the schema above does not define)
const results3 = await client.collection("books").query(
  filter(
    field("tags").matchAny(["fiction", "classic", "dystopian"])
  )
);
Combining with Other Filters
import { filter, field } from "topk-js/query";
const results = await client.collection("books").query(
  filter(
    field("summary").matchAll("love class")
      .or(field("published_year").eq(1925))
  )
    .topk(field("published_year"), 10, true)
);
Stop Words
Common stop words (like “the”, “a”, “an”) are automatically filtered out:
import { filter, match } from "topk-js/query";
// This will return no results because "the" is a stop word
const results = await client.collection("books").query(
  filter(match("the", { field: "summary" }))
);
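Because a query made up only of stop words matches nothing, it can be useful to check user input before sending it. The sketch below is a client-side guard under stated assumptions: the `STOP_WORDS` list contains only the three examples named above (the full list the service applies is not given here), and `searchableTerms` and `isSearchable` are hypothetical helpers, not part of topk-js.

```typescript
// Hypothetical stop-word list: only the examples mentioned in this section.
const STOP_WORDS: string[] = ["a", "an", "the"];

// Lowercase, split into words, and drop stop words.
function searchableTerms(query: string): string[] {
  return query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0 && STOP_WORDS.indexOf(t) === -1);
}

// True when at least one term would survive stop-word filtering.
function isSearchable(query: string): boolean {
  return searchableTerms(query).length > 0;
}

const onlyStopWords = isSearchable("the");        // nothing left to search for
const mixedQuery = searchableTerms("The LOVE");   // stop word dropped, case folded
```

A search UI could use `isSearchable` to show a hint such as "try a more specific term" instead of issuing a query that is known to come back empty.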
Use Cases
Keyword search is ideal for:
- Exact term matching: Find documents containing specific words or phrases
- Boolean search: Combine multiple terms with AND/OR logic
- Document filtering: Pre-filter documents before applying other search techniques
- Faceted search: Filter by categories or tags using keyword matching
- Traditional search UIs: Support familiar keyword-based search experiences
For best results with BM25 scoring, use multiple search terms and combine them with appropriate weights to reflect their importance.
Keyword search is case-insensitive and automatically tokenizes search queries, so “LOVE” and “love” are treated identically.