This example demonstrates how to build a basic RAG (Retrieval-Augmented Generation) application using LlamaIndex.TS.

Overview

RAG applications enhance LLM responses by retrieving relevant information from your documents before generating answers. This example shows:
  1. Loading documents
  2. Creating a vector index
  3. Querying with a query engine
  4. Interactive question-answering

Complete Example

Here’s a complete working RAG application:
rag-starter.ts
import { Document, VectorStoreIndex } from "llamaindex";
import fs from "node:fs/promises";
import { createInterface } from "node:readline/promises";

async function main() {
  const rl = createInterface({ input: process.stdin, output: process.stdout });

  if (!process.env.OPENAI_API_KEY) {
    console.log("OpenAI API key not found in environment variables.");
    console.log(
      "You can get an API key at https://platform.openai.com/account/api-keys",
    );
    process.env.OPENAI_API_KEY = await rl.question(
      "Please enter your OpenAI API key: ",
    );
  }

  // Load document
  const path = "node_modules/llamaindex/examples/abramov.txt";
  const essay = await fs.readFile(path, "utf-8");
  const document = new Document({ text: essay, id_: path });

  // Create vector index
  const index = await VectorStoreIndex.fromDocuments([document]);
  const queryEngine = index.asQueryEngine();

  console.log(
    "Try asking a question about the essay!",
    "\nExample: What did the author do in college?",
    "\n==============================\n",
  );
  
  // Interactive query loop
  while (true) {
    const query = await rl.question("Query: ");
    const response = await queryEngine.query({
      query,
    });
    console.log(response.toString());
  }
}

main().catch(console.error);

Step-by-Step Explanation

1. Import Dependencies

import { Document, VectorStoreIndex } from "llamaindex";
import fs from "node:fs/promises";
import { createInterface } from "node:readline/promises";
  • Document - Represents a document to be indexed
  • VectorStoreIndex - Creates a vector store for semantic search
  • fs - Reads files from the filesystem
  • createInterface - Provides interactive CLI input

2. Load Your Document

const path = "node_modules/llamaindex/examples/abramov.txt";
const essay = await fs.readFile(path, "utf-8");
const document = new Document({ text: essay, id_: path });
The Document class wraps your text content with optional metadata and an identifier.

3. Create Vector Index

const index = await VectorStoreIndex.fromDocuments([document]);
This automatically:
  • Splits the document into chunks
  • Creates embeddings for each chunk
  • Stores them in a vector store
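If you want control over how documents are split, you can configure the node parser globally. A minimal sketch, assuming SentenceSplitter and Settings.nodeParser are available as shown in your installed llamaindex version (the chunkSize and chunkOverlap values here are illustrative, not recommendations):

```typescript
import { SentenceSplitter, Settings } from "llamaindex";

// Replace the default splitter before building the index.
// chunkSize is measured in tokens; chunkOverlap keeps adjacent
// chunks sharing some context so retrieval doesn't cut mid-thought.
Settings.nodeParser = new SentenceSplitter({
  chunkSize: 512,
  chunkOverlap: 20,
});
```

Set this before calling VectorStoreIndex.fromDocuments so the splitter applies during indexing.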

4. Create Query Engine

const queryEngine = index.asQueryEngine();
The query engine handles:
  • Embedding your query
  • Retrieving relevant chunks
  • Generating a response with the LLM
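You can tune retrieval when creating the engine. A sketch, assuming the similarityTopK option is supported by asQueryEngine in your llamaindex version and that index is the VectorStoreIndex built above:

```typescript
// Fetch the top 5 most similar chunks per query instead of the default.
// More chunks means more context for the LLM, at the cost of a longer prompt.
const queryEngine = index.asQueryEngine({ similarityTopK: 5 });
```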

5. Query Your Data

const response = await queryEngine.query({
  query: "What did the author do in college?",
});
console.log(response.toString());
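For long answers you can stream tokens as they arrive instead of waiting for the full response. A sketch, assuming your llamaindex version supports stream: true on query and yields chunks with a delta field; check the API for your installed release:

```typescript
// Request a streaming response: the result is an async iterable of deltas.
const stream = await queryEngine.query({
  query: "What did the author do in college?",
  stream: true,
});

// Print each token fragment as it arrives.
for await (const chunk of stream) {
  process.stdout.write(chunk.delta);
}
```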

Advanced: With Source Attribution

Get source references with your responses:
import { MetadataMode, NodeWithScore } from "llamaindex";

const { message, sourceNodes } = await queryEngine.query({
  query: "What did the author do in college?",
});

console.log(message.content);

if (sourceNodes) {
  sourceNodes.forEach((source: NodeWithScore, index: number) => {
    console.log(
      `\n${index}: Score: ${source.score} - ${source.node.getContent(MetadataMode.NONE).substring(0, 50)}...\n`,
    );
  });
}

Custom Settings

Configure LLM and embedding models:
import { openai, OpenAIEmbedding } from "@llamaindex/openai";
import { Settings } from "llamaindex";

Settings.llm = openai({
  apiKey: process.env.OPENAI_API_KEY,
  model: "gpt-4o",
});
Settings.embedModel = new OpenAIEmbedding({
  model: "text-embedding-3-small",
});

Running the Example

  1. Install dependencies:
npm install llamaindex @llamaindex/openai
  2. Set your API key:
export OPENAI_API_KEY="sk-..."
  3. Run the example:
npx tsx rag-starter.ts

Try It Yourself

Modify the example to:
  • Load your own documents
  • Use different LLM providers (Anthropic, Groq, etc.)
  • Customize chunking strategies
  • Add metadata filtering
  • Implement streaming responses
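As a starting point for loading your own documents, here is a sketch using SimpleDirectoryReader. It assumes the reader is exported from @llamaindex/readers/directory (the package path varies across releases) and that a ./data folder with text files exists:

```typescript
import { SimpleDirectoryReader } from "@llamaindex/readers/directory";
import { VectorStoreIndex } from "llamaindex";

// Read every supported file in ./data into Document objects,
// then index them exactly as the single-essay example does.
const documents = await new SimpleDirectoryReader().loadData("./data");
const index = await VectorStoreIndex.fromDocuments(documents);
const queryEngine = index.asQueryEngine();
```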

Next Steps

Chat Engine

Build conversational interfaces with chat history

Vector Stores

Use production vector databases like Pinecone, Qdrant, or Weaviate

Document Loading

Load PDFs, web pages, and other document types

Advanced RAG

Implement advanced retrieval patterns
