Overview
Indices organize your data for efficient retrieval. LlamaIndex provides several index types optimized for different use cases.
VectorStoreIndex
Most common index type using vector embeddings for semantic search.
```typescript
import { VectorStoreIndex, Document } from "llamaindex";

const documents = [
  new Document({ text: "LlamaIndex is a data framework." }),
  new Document({ text: "It helps build LLM applications." }),
];

const index = await VectorStoreIndex.fromDocuments(documents);
```
Constructor Options
- Vector store backend (defaults to `SimpleVectorStore`)
- Service configuration (deprecated; use `Settings` instead)
Methods
Create an index from documents:

```typescript
static async fromDocuments(
  documents: Document[],
  options?: { storageContext?: StorageContext }
): Promise<VectorStoreIndex>
```

Convert the index to a query engine:

```typescript
asQueryEngine(options?: {
  retriever?: BaseRetriever;
  responseSynthesizer?: ResponseSynthesizer;
  similarityTopK?: number;
}): BaseQueryEngine
```

Convert the index to a chat engine:

```typescript
asChatEngine(options?: {
  retriever?: BaseRetriever;
  chatHistory?: ChatMessage[];
  systemPrompt?: string;
}): BaseChatEngine
```

Convert the index to a retriever:

```typescript
asRetriever(options?: {
  similarityTopK?: number;
  mode?: "default" | "mmr";
}): BaseRetriever
```

Insert a new document:

```typescript
async insert(document: Document): Promise<void>
```

Insert new nodes:

```typescript
async insertNodes(nodes: BaseNode[]): Promise<void>
```

Delete a document by ID:

```typescript
async deleteRef(docId: string): Promise<void>
```
Example: Custom Vector Store
```typescript
import { VectorStoreIndex, storageContextFromDefaults } from "llamaindex";
import { PineconeVectorStore } from "@llamaindex/pinecone";

const vectorStore = new PineconeVectorStore({
  indexName: "my-index",
});

// Wrap the external vector store in a storage context
const storageContext = await storageContextFromDefaults({ vectorStore });

const index = await VectorStoreIndex.fromDocuments(documents, {
  storageContext,
});
```
Example: Persistence
```typescript
import { VectorStoreIndex, storageContextFromDefaults } from "llamaindex";

// Create with persistence
const storageContext = await storageContextFromDefaults({
  persistDir: "./storage",
});

const index = await VectorStoreIndex.fromDocuments(documents, {
  storageContext,
});

// Load from storage
const loadedContext = await storageContextFromDefaults({
  persistDir: "./storage",
});

const loadedIndex = await VectorStoreIndex.fromVectorStore(
  loadedContext.vectorStore,
  { storageContext: loadedContext }
);
```
SummaryIndex
An index that retrieves all nodes, which makes it useful for summarization.

```typescript
import { SummaryIndex } from "llamaindex";

const index = await SummaryIndex.fromDocuments(documents);

// treeSummarizeSynthesizer: a response synthesizer configured for
// tree summarization (see the response synthesizer documentation)
const queryEngine = index.asQueryEngine({
  responseSynthesizer: treeSummarizeSynthesizer,
});

const summary = await queryEngine.query({
  query: "Summarize all documents",
});
```
Use Cases
- Document summarization
- Full-text queries requiring all context
- Small document collections
KeywordTableIndex
Index based on keyword extraction.
```typescript
import { KeywordTableIndex } from "llamaindex";

const index = await KeywordTableIndex.fromDocuments(documents);

const queryEngine = index.asQueryEngine({
  mode: "rake", // or "simple"
});

const response = await queryEngine.query({
  query: "machine learning algorithms",
});
```
- `rake`: RAKE algorithm for keyword extraction
- `simple`: simple token-based extraction
Use Cases
- Keyword-based search
- Exact term matching
- Complement to vector search
Composable Indices
Combine multiple indices for hybrid search:
```typescript
import {
  VectorStoreIndex,
  KeywordTableIndex,
  RouterQueryEngine,
  LLMSingleSelector,
} from "llamaindex";

const vectorIndex = await VectorStoreIndex.fromDocuments(docs1);
const keywordIndex = await KeywordTableIndex.fromDocuments(docs2);

const queryEngineTools = [
  {
    queryEngine: vectorIndex.asQueryEngine(),
    description: "Semantic search",
  },
  {
    queryEngine: keywordIndex.asQueryEngine(),
    description: "Keyword search",
  },
];

const routerEngine = new RouterQueryEngine({
  selector: new LLMSingleSelector(),
  queryEngineTools,
});
```
Retrieval Modes
Default Retrieval
```typescript
const retriever = index.asRetriever({
  similarityTopK: 5,
});
```
MMR (Maximal Marginal Relevance)
Diversity-based retrieval:
```typescript
const retriever = index.asRetriever({
  similarityTopK: 10,
  mode: "mmr",
});
```
Metadata Filtering
Filter nodes by metadata during retrieval:
```typescript
import { VectorStoreIndex, Document } from "llamaindex";
import { MetadataFilters } from "@llamaindex/core/vector-store";

const documents = [
  new Document({
    text: "Doc 1",
    metadata: { category: "tech", year: 2023 },
  }),
  new Document({
    text: "Doc 2",
    metadata: { category: "science", year: 2024 },
  }),
];

const index = await VectorStoreIndex.fromDocuments(documents);

const retriever = index.asRetriever({
  filters: new MetadataFilters({
    filters: [
      { key: "category", value: "tech" },
      { key: "year", value: 2023, operator: ">=" },
    ],
  }),
});
```
Index Updates
Insert Documents
```typescript
const newDoc = new Document({ text: "New content" });
await index.insert(newDoc);
```
Delete Documents
```typescript
await index.deleteRef(docId);
```
Refresh Index
```typescript
// Produce updated copies of the documents, keeping the same IDs
// (updatedText is illustrative)
const updatedDocs = documents.map(
  (doc) => new Document({ id_: doc.id_, text: updatedText })
);

// Delete old and insert new
for (const doc of documents) {
  await index.deleteRef(doc.id_);
}
for (const doc of updatedDocs) {
  await index.insert(doc);
}
```
Best Practices
- Use VectorStoreIndex for most cases: Best for semantic search
- Persist indices: Save to disk to avoid reindexing
- Configure chunk size: Adjust via Settings for optimal retrieval
- Use external vector stores: Pinecone, Chroma, etc. for production
- Filter with metadata: Narrow search scope for better results
- Combine index types: Use RouterQueryEngine for hybrid search
See Also