Overview
The Cohere provider integrates Cohere’s reranking API with LlamaIndex.TS to improve the relevance of search results. Reranking reorders retrieved documents by their relevance to a query.
Installation
npm install @llamaindex/cohere
Basic Usage
import { CohereRerank } from "@llamaindex/cohere";
const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5,
  model: "rerank-english-v2.0"
});
// Use as a node postprocessor
const rerankedNodes = await reranker.postprocessNodes(nodes, query);
Constructor Options
apiKey: Cohere API key (required; there is no environment variable default)
topN: Number of top results to return after reranking
model (string, default: "rerank-english-v2.0"): Cohere rerank model to use
baseUrl: Optional custom API endpoint URL
timeout: Request timeout in seconds
Supported Models
Rerank Models
rerank-english-v2.0: General-purpose English reranking (default)
rerank-multilingual-v2.0: Multilingual reranking support
rerank-english-v3.0: Latest English model
rerank-multilingual-v3.0: Latest multilingual model
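As a rough illustration of choosing among the models listed above, the sketch below maps a language code to a model name. The helper `pickRerankModel` and its English-versus-everything-else rule are assumptions made for this example and are not part of `@llamaindex/cohere`; only the model names come from the list.

```typescript
// Model names taken from the list above.
type RerankModel =
  | "rerank-english-v2.0"
  | "rerank-multilingual-v2.0"
  | "rerank-english-v3.0"
  | "rerank-multilingual-v3.0";

// Hypothetical helper: use the English v3.0 model for English language
// codes ("en", "en-US", ...) and the multilingual v3.0 model otherwise.
function pickRerankModel(languageCode: string): RerankModel {
  const lang = languageCode.trim().toLowerCase();
  const isEnglish = lang === "en" || lang.indexOf("en-") === 0;
  return isEnglish ? "rerank-english-v3.0" : "rerank-multilingual-v3.0";
}
```

The result can be passed as the `model` option when constructing the reranker.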
With Query Engine
import { VectorStoreIndex } from "llamaindex";
import { CohereRerank } from "@llamaindex/cohere";
const index = await VectorStoreIndex.fromDocuments(documents);
const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 3,
  model: "rerank-english-v3.0"
});
const queryEngine = index.asQueryEngine({
  nodePostprocessors: [reranker]
});
const response = await queryEngine.query({
  query: "What are the main features?"
});
With Retriever
import { VectorStoreIndex } from "llamaindex";
import { CohereRerank } from "@llamaindex/cohere";
const index = await VectorStoreIndex.fromDocuments(documents);
const retriever = index.asRetriever({ similarityTopK: 10 });
const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5
});
// Retrieve initial results
const nodes = await retriever.retrieve({ query: "user query" });
// Rerank for better relevance
const rerankedNodes = await reranker.postprocessNodes(
  nodes,
  "user query"
);
Multilingual Reranking
const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  topN: 5,
  model: "rerank-multilingual-v3.0"
});
const rerankedNodes = await reranker.postprocessNodes(
  nodes,
  "Quelles sont les principales caractéristiques?" // French query
);
Custom Base URL
const reranker = new CohereRerank({
  apiKey: process.env.COHERE_API_KEY,
  baseUrl: "https://custom-cohere-endpoint.com",
  topN: 3
});
Configuration
Environment Variables
COHERE_API_KEY=your-api-key-here
Note: Unlike other providers, the Cohere package does not automatically read from environment variables. You must explicitly pass the API key.
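Because the key has to be passed explicitly, one option is to resolve it up front and fail with a clear message when it is missing. `requireCohereApiKey` below is a hypothetical helper, not something the package exports; it takes the environment as an argument so it can be exercised without touching `process.env`.

```typescript
// Environment-like object, e.g. process.env in Node.js.
type EnvLike = Record<string, string | undefined>;

// Hypothetical helper (not part of @llamaindex/cohere): return the trimmed
// COHERE_API_KEY, or throw if it is missing or blank.
function requireCohereApiKey(env: EnvLike): string {
  const raw = env["COHERE_API_KEY"];
  const key = raw === undefined ? "" : raw.trim();
  if (key.length === 0) {
    throw new Error(
      "COHERE_API_KEY is not set; CohereRerank needs apiKey passed explicitly"
    );
  }
  return key;
}
```

It could then be used as `new CohereRerank({ apiKey: requireCohereApiKey(process.env), topN: 5 })`.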
How Reranking Works
- Initial Retrieval: Your retriever fetches top-K documents (e.g., 10-20)
- Reranking: Cohere’s model re-scores each document for relevance
- Top-N Selection: Returns only the most relevant N documents
- Score Update: Updates each node’s score to the relevance score
// Before reranking: 10 documents with embedding similarity scores
// (similarityTopK: 10 is set on the retriever, e.g. index.asRetriever({ similarityTopK: 10 }))
const initialNodes = await retriever.retrieve({ query });
// After reranking: 3 most relevant documents with Cohere relevance scores
const reranker = new CohereRerank({ apiKey, topN: 3 });
const finalNodes = await reranker.postprocessNodes(initialNodes, query);
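The four steps above can be sketched without any network call. In this sketch `toyRelevance` is only a stand-in for Cohere’s model (it counts matching query terms), and `ScoredText` and `rerankSketch` are illustrative names rather than LlamaIndex.TS types; the real reranker returns model-computed relevance scores.

```typescript
// Minimal stand-in for a retrieved node; real nodes carry more fields.
interface ScoredText {
  text: string;
  score: number; // embedding similarity before reranking
}

// Toy scorer (assumption): fraction of query terms that appear in the text.
function toyRelevance(query: string, text: string): number {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  if (terms.length === 0) return 0;
  const haystack = text.toLowerCase();
  const hits = terms.filter((t) => haystack.indexOf(t) !== -1).length;
  return hits / terms.length;
}

// Re-score every candidate, sort by the new score, keep the top N.
function rerankSketch(
  nodes: ScoredText[],
  query: string,
  topN: number
): ScoredText[] {
  return nodes
    .map((n) => ({ text: n.text, score: toyRelevance(query, n.text) })) // score update
    .sort((a, b) => b.score - a.score) // most relevant first
    .slice(0, topN); // top-N selection
}
```

Note that the embedding similarity score is replaced, not blended: after reranking, each node carries only the relevance score.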
- Retrieve more, rerank to fewer: Retrieve 10-20 documents, then rerank down to the top 3-5
- Use for complex queries: Reranking helps most when semantic search alone isn’t sufficient
- Choose the right model: v3.0 models offer better quality; v2.0 models are faster
- Set an appropriate timeout: For large document sets, increase the timeout
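A quick sanity check on the two sizes involved can catch configurations where reranking cannot help. The sketch below is an assumption-laden illustration: `checkRerankSizes` and its warning texts are hypothetical, and only the parameter names `similarityTopK` and `topN` come from the examples in this document.

```typescript
interface RerankSizes {
  similarityTopK: number; // candidates fetched by the retriever
  topN: number; // results kept after reranking
}

// Hypothetical helper: return human-readable warnings for size
// combinations that defeat the "retrieve more, rerank to fewer" pattern.
function checkRerankSizes(sizes: RerankSizes): string[] {
  const { similarityTopK, topN } = sizes;
  const warnings: string[] = [];
  if (topN < 1) {
    warnings.push("topN must be at least 1");
  }
  if (topN > similarityTopK) {
    warnings.push("topN exceeds similarityTopK; fewer nodes are available than requested");
  }
  if (topN === similarityTopK) {
    warnings.push("topN equals similarityTopK; reranking reorders but filters nothing");
  }
  return warnings;
}
```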
Error Handling
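try {
  const rerankedNodes = await reranker.postprocessNodes(nodes, query);
} catch (error) {
  // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("API key")) {
    console.error("Invalid or missing Cohere API key");
  } else {
    console.error("Reranking failed:", message);
  }
}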
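If reranking is an enhancement rather than a hard requirement, one possible approach is to fall back to the original retrieval order when the call fails. The sketch below assumes only that the reranker exposes `postprocessNodes(nodes, query)` as in the examples in this document; `RerankerLike` and `rerankOrFallback` are hypothetical names, not library exports.

```typescript
// Minimal shape assumed for a reranker; CohereRerank is used this way above.
interface RerankerLike<T> {
  postprocessNodes(nodes: T[], query: string): Promise<T[]>;
}

// Try to rerank; on any failure, report the message via the optional
// callback and return the first topN nodes in their original order.
async function rerankOrFallback<T>(
  reranker: RerankerLike<T>,
  nodes: T[],
  query: string,
  topN: number,
  onError?: (message: string) => void
): Promise<T[]> {
  try {
    return await reranker.postprocessNodes(nodes, query);
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    if (onError) onError(message);
    return nodes.slice(0, topN);
  }
}
```

The trade-off is that failures become silent unless `onError` logs or records them, so this suits pipelines where a degraded answer is preferable to no answer.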
Use Cases
- Improve RAG quality: Rerank retrieved documents before sending them to the LLM
- Multi-stage retrieval: First pass with embeddings, second pass with reranking
- Cross-lingual search: Use multilingual models for queries in different languages
- Semantic search refinement: Improve relevance beyond vector similarity
Best Practices
- Always provide a query: Reranking requires a query string to work
- Retrieve enough candidates: Aim for 10-20 initial results for best reranking
- Don’t over-rerank: The top 3-5 results are usually sufficient for most use cases
- Handle empty results: Check that the initial retrieval returned documents before reranking
- Monitor costs: Reranking adds API costs, so use it judiciously
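The points about always providing a query and handling empty results can be folded into a small guard that runs before the API call. `canRerank` is a hypothetical helper written for illustration; it is not part of the package.

```typescript
// Hypothetical guard: reranking only makes sense with at least one
// candidate node and a non-blank query string.
function canRerank(nodes: readonly unknown[], query: string): boolean {
  return nodes.length > 0 && query.trim().length > 0;
}

// Usage sketch (reranker, nodes and query as in the examples above):
//   const finalNodes = canRerank(nodes, query)
//     ? await reranker.postprocessNodes(nodes, query)
//     : nodes;
```

Skipping the call in these cases also avoids paying for a request that cannot improve the results.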
See Also