Understanding how data flows through LlamaIndex.TS from documents to query responses
LlamaIndex.TS processes data through a series of transformations, from raw documents to indexed embeddings to final query responses. Understanding this flow is crucial for building effective LLM applications.
```typescript
import { Document, SimpleDirectoryReader } from "llamaindex";

// Create documents from text
const documents = [
  new Document({
    text: "LlamaIndex is a data framework for LLM applications.",
    metadata: { source: "docs", page: 1 },
  }),
  new Document({
    text: "It provides tools for data ingestion, indexing, and querying.",
    metadata: { source: "docs", page: 2 },
  }),
];

// Or load from files
const reader = new SimpleDirectoryReader();
const docs = await reader.loadData({ directoryPath: "./documents" });
```
Document Structure:
```typescript
// From @llamaindex/core/schema/node.ts
export class Document extends TextNode {
  id_: string;          // Unique document ID
  text: string;         // Document content
  metadata: Metadata;   // Arbitrary metadata
  embedding?: number[]; // Optional embedding
  relationships: {...}; // Links to other nodes
}
```
Documents are split into smaller chunks called Nodes:
Node parsing
```typescript
import { SentenceSplitter, Settings } from "llamaindex";

// Configure the node parser
Settings.nodeParser = new SentenceSplitter({
  chunkSize: 1024,   // Max tokens per chunk
  chunkOverlap: 200, // Overlap between chunks
});

// Parse documents into nodes
const nodes = await Settings.nodeParser.getNodesFromDocuments(documents);
```
Why chunk documents?
Context Window Limits
LLMs have maximum context windows (e.g., 4K, 8K, 128K tokens). Chunking ensures content fits within these limits.
Semantic Coherence
Smaller chunks often represent more coherent semantic units, improving retrieval accuracy.
Granular Retrieval
Fine-grained chunks allow more precise retrieval of relevant information.
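The chunking mechanics above can be sketched in plain TypeScript. This toy splitter is not the real `SentenceSplitter` (which counts tokens and respects sentence boundaries); it slides a fixed-size character window with overlap, just to make the `chunkSize`/`chunkOverlap` interaction concrete:

```typescript
// Toy fixed-size chunker illustrating chunkSize/chunkOverlap mechanics.
// The real SentenceSplitter counts tokens and splits on sentence
// boundaries; here we count characters for simplicity.
function chunkText(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
): string[] {
  const chunks: string[] = [];
  const stride = chunkSize - chunkOverlap; // the window advances by this much
  for (let start = 0; start < text.length; start += stride) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

// A 2000-character text with chunkSize 1024 and overlap 200 yields 3 chunks;
// each consecutive pair shares 200 characters of context.
const chunks = chunkText("a".repeat(2000), 1024, 200);
console.log(chunks.length); // → 3
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, which is why a nonzero `chunkOverlap` generally improves retrieval quality.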
Node Structure:
```typescript
// BaseNode from @llamaindex/core/schema
abstract class BaseNode<T extends Metadata = Metadata> {
  id_: string;          // Unique node ID
  embedding?: number[]; // Vector embedding
  metadata: T;          // Inherited + additional metadata
  excludedEmbedMetadataKeys: string[]; // Keys to exclude from embedding
  excludedLlmMetadataKeys: string[];   // Keys to exclude from LLM
  relationships: Record<NodeRelationship, RelatedNodeType>;
  hash: string;         // Content hash for deduplication

  abstract getContent(metadataMode: MetadataMode): string;
}
```
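The two exclusion lists control which metadata keys are serialized into the text seen by the embedding model versus the LLM. A simplified sketch of that filtering (a hypothetical helper for illustration, not the library's actual implementation):

```typescript
type Metadata = Record<string, string | number>;

// Simplified picture of how a node assembles its content for a given
// consumer: metadata keys on the relevant exclusion list are stripped
// before the metadata is prepended to the node text.
function getContentForMode(
  text: string,
  metadata: Metadata,
  excludedKeys: string[],
): string {
  const visible = Object.entries(metadata)
    .filter(([key]) => !excludedKeys.includes(key))
    .map(([key, value]) => `${key}: ${value}`)
    .join("\n");
  return visible.length > 0 ? `${visible}\n\n${text}` : text;
}

const content = getContentForMode(
  "LlamaIndex is a data framework.",
  { source: "docs", page: 1, file_hash: "abc123" },
  ["file_hash"], // e.g. a key listed in excludedEmbedMetadataKeys
);
// content includes "source" and "page" but not the hash, so noisy
// bookkeeping fields never pollute the embedding.
```

This is why keys like file hashes or timestamps are commonly excluded from embedding: they would add noise to the vector without adding semantic signal.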
```typescript
import { Settings, VectorStoreIndex } from "llamaindex";
import { OpenAIEmbedding } from "@llamaindex/openai";

Settings.embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-large",
  dimensions: 1024,
});

// Embeddings are generated automatically during indexing
const index = await VectorStoreIndex.fromDocuments(documents);
```
How it works:
```typescript
// From packages/llamaindex/src/indices/vectorStore/index.ts
async getNodeEmbeddingResults(
  nodes: BaseNode[],
  options?: { logProgress?: boolean },
): Promise<BaseNode[]> {
  const nodeMap = splitNodesByType(nodes);
  for (const type in nodeMap) {
    const nodes = nodeMap[type as ModalityType];
    const embedModel =
      this.vectorStores[type]?.embedModel ?? this.embedModel;
    if (embedModel && nodes) {
      await embedModel(nodes, {
        logProgress: options?.logProgress,
      });
    }
  }
  return nodes;
}
```
Embeddings are vector representations of text that capture semantic meaning. Similar texts have similar embeddings.
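"Similar embeddings" is typically measured with cosine similarity. A self-contained sketch using toy 3-dimensional vectors (real embeddings from `text-embedding-3-large` would have 1024 dimensions as configured above, and the values here are invented for illustration):

```typescript
// Cosine similarity: the standard measure for comparing embeddings.
// Returns a value in [-1, 1]; closer to 1 means more semantically similar.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for real embeddings:
const catVec = [0.9, 0.1, 0.0];
const kittenVec = [0.85, 0.2, 0.05];
const carVec = [0.1, 0.05, 0.9];

console.log(cosineSimilarity(catVec, kittenVec)); // high (near 1)
console.log(cosineSimilarity(catVec, carVec));    // low
```

A vector store ranks nodes by exactly this kind of score against the query embedding, which is what turns "semantic similarity" into a sortable number.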
```typescript
const query = "What is LlamaIndex?";

// The query is embedded using the same embedding model
const queryEmbedding = await Settings.embedModel.getQueryEmbedding(query);
```
Always use the same embedding model for indexing and querying to ensure compatibility.
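Conceptually, query time then reduces to scoring every stored node against the query embedding and keeping the top-k. This standalone sketch mirrors that idea in plain TypeScript (the library's actual retrieval happens inside the vector store, often with approximate nearest-neighbor search rather than a full scan; `StoredNode` and `retrieveTopK` are illustrative names, not library APIs):

```typescript
// Simplified picture of what a vector index does at query time:
// score every stored node against the query embedding, return the top-k.
interface StoredNode {
  id: string;
  text: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function retrieveTopK(
  queryEmbedding: number[],
  nodes: StoredNode[],
  k: number,
): StoredNode[] {
  return [...nodes]
    .sort(
      (x, y) =>
        cosineSimilarity(queryEmbedding, y.embedding) -
        cosineSimilarity(queryEmbedding, x.embedding),
    )
    .slice(0, k);
}

const stored: StoredNode[] = [
  { id: "1", text: "LlamaIndex is a data framework.", embedding: [0.9, 0.1] },
  { id: "2", text: "Bananas are yellow.", embedding: [0.1, 0.9] },
];
const queryEmbedding = [0.8, 0.2]; // would come from Settings.embedModel
console.log(retrieveTopK(queryEmbedding, stored, 1)[0].id); // → "1"
```

This also makes the rule above concrete: if the query were embedded with a different model, its vector would live in a different space, and these similarity scores would be meaningless.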