This example demonstrates how to build a basic RAG (Retrieval-Augmented Generation) application using LlamaIndex.TS.

Overview

RAG applications enhance LLM responses by retrieving relevant information from your documents before generating answers. This example shows:
  1. Loading documents
  2. Creating a vector index
  3. Querying with a query engine
  4. Interactive question-answering

Complete Example

Here’s a complete working RAG application:
rag-starter.ts
import { Document, VectorStoreIndex } from "llamaindex";
import fs from "node:fs/promises";
import { createInterface } from "node:readline/promises";

async function main() {
  const rl = createInterface({ input: process.stdin, output: process.stdout });

  if (!process.env.OPENAI_API_KEY) {
    console.log("OpenAI API key not found in environment variables.");
    console.log(
      "You can get an API key at https://platform.openai.com/account/api-keys",
    );
    process.env.OPENAI_API_KEY = await rl.question(
      "Please enter your OpenAI API key: ",
    );
  }

  // Load document
  const path = "node_modules/llamaindex/examples/abramov.txt";
  const essay = await fs.readFile(path, "utf-8");
  const document = new Document({ text: essay, id_: path });

  // Create vector index
  const index = await VectorStoreIndex.fromDocuments([document]);
  const queryEngine = index.asQueryEngine();

  console.log(
    "Try asking a question about the essay!",
    "\nExample: What did the author do in college?",
    "\n==============================\n",
  );
  
  // Interactive query loop
  while (true) {
    const query = await rl.question("Query: ");
    const response = await queryEngine.query({
      query,
    });
    console.log(response.toString());
  }
}

main().catch(console.error);

Step-by-Step Explanation

1. Import Dependencies

import { Document, VectorStoreIndex } from "llamaindex";
import fs from "node:fs/promises";
import { createInterface } from "node:readline/promises";
  • Document - Represents a document to be indexed
  • VectorStoreIndex - Creates a vector store for semantic search
  • fs - Reads files from the filesystem
  • createInterface - Provides interactive CLI input

2. Load Your Document

const path = "node_modules/llamaindex/examples/abramov.txt";
const essay = await fs.readFile(path, "utf-8");
const document = new Document({ text: essay, id_: path });
The Document class wraps your text content with optional metadata and an identifier.

3. Create Vector Index

const index = await VectorStoreIndex.fromDocuments([document]);
This automatically:
  • Splits the document into chunks
  • Creates embeddings for each chunk
  • Stores them in a vector store
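If you want control over how documents are split, you can configure the node parser globally. A minimal sketch, assuming SentenceSplitter and Settings.nodeParser are available as shown in your installed llamaindex version (the chunkSize and chunkOverlap values here are illustrative, not recommendations):

```typescript
import { SentenceSplitter, Settings } from "llamaindex";

// Replace the default splitter before building the index.
// chunkSize is measured in tokens; chunkOverlap keeps adjacent
// chunks sharing some context so retrieval doesn't cut mid-thought.
Settings.nodeParser = new SentenceSplitter({
  chunkSize: 512,
  chunkOverlap: 20,
});
```

Set this before calling VectorStoreIndex.fromDocuments so the splitter applies during indexing.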

4. Create Query Engine

const queryEngine = index.asQueryEngine();
The query engine handles:
  • Embedding your query
  • Retrieving relevant chunks
  • Generating a response with the LLM
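You can tune retrieval when creating the engine. A sketch, assuming the similarityTopK option is supported by asQueryEngine in your llamaindex version and that index is the VectorStoreIndex built above:

```typescript
// Fetch the top 5 most similar chunks per query instead of the default.
// More chunks means more context for the LLM, at the cost of a longer prompt.
const queryEngine = index.asQueryEngine({ similarityTopK: 5 });
```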

5. Query Your Data

const response = await queryEngine.query({
  query: "What did the author do in college?",
});
console.log(response.toString());
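For long answers you can stream tokens as they arrive instead of waiting for the full response. A sketch, assuming your llamaindex version supports stream: true on query and yields chunks with a delta field; check the API for your installed release:

```typescript
// Request a streaming response: the result is an async iterable of deltas.
const stream = await queryEngine.query({
  query: "What did the author do in college?",
  stream: true,
});

// Print each token fragment as it arrives.
for await (const chunk of stream) {
  process.stdout.write(chunk.delta);
}
```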

Advanced: With Source Attribution

Get source references with your responses:
import { MetadataMode, NodeWithScore } from "llamaindex";

const { message, sourceNodes } = await queryEngine.query({
  query: "What did the author do in college?",
});

console.log(message.content);

if (sourceNodes) {
  sourceNodes.forEach((source: NodeWithScore, index: number) => {
    console.log(
      `\n${index}: Score: ${source.score} - ${source.node.getContent(MetadataMode.NONE).substring(0, 50)}...\n`,
    );
  });
}

Custom Settings

Configure LLM and embedding models:
import { openai, OpenAIEmbedding } from "@llamaindex/openai";
import { Settings } from "llamaindex";

Settings.llm = openai({
  apiKey: process.env.OPENAI_API_KEY,
  model: "gpt-4o",
});
Settings.embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small",
});

Running the Example

  1. Install dependencies:
npm install llamaindex @llamaindex/openai
  2. Set your API key:
export OPENAI_API_KEY="sk-..."
  3. Run the example:
npx tsx rag-starter.ts

Try It Yourself

Modify the example to:
  • Load your own documents
  • Use different LLM providers (Anthropic, Groq, etc.)
  • Customize chunking strategies
  • Add metadata filtering
  • Implement streaming responses
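As a starting point for loading your own documents, here is a sketch using SimpleDirectoryReader. It assumes the reader is exported from @llamaindex/readers/directory (the package path varies across releases) and that a ./data folder with text files exists:

```typescript
import { SimpleDirectoryReader } from "@llamaindex/readers/directory";
import { VectorStoreIndex } from "llamaindex";

// Read every supported file in ./data into Document objects,
// then index them exactly as the single-essay example does.
const documents = await new SimpleDirectoryReader().loadData("./data");
const index = await VectorStoreIndex.fromDocuments(documents);
const queryEngine = index.asQueryEngine();
```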

Next Steps

Chat Engine

Build conversational interfaces with chat history

Vector Stores

Use production vector databases like Pinecone, Qdrant, or Weaviate

Document Loading

Load PDFs, web pages, and other document types

Advanced RAG

Implement advanced retrieval patterns
