Overview
RAG (retrieval-augmented generation) combines document retrieval with AI generation, allowing your agent to provide accurate answers grounded in your specific knowledge base. What you’ll learn:
- Ingest and process documents
- Create vector embeddings
- Perform semantic search
- Retrieve relevant context
- Generate accurate, grounded responses
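Under the hood, semantic search compares a query embedding against stored chunk embeddings, most commonly by cosine similarity. A minimal, self-contained sketch with toy vectors (real embeddings come from the model, not hand-written arrays):

```typescript
// Cosine similarity: 1.0 means same direction, 0 means unrelated.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank toy "chunk" vectors against a query vector.
const query = [1, 0, 0];
const chunks = [
  { text: "pricing plans", vec: [0.9, 0.1, 0] },
  { text: "password reset", vec: [0, 1, 0] },
];
const ranked = [...chunks].sort(
  (x, y) => cosineSimilarity(query, y.vec) - cosineSimilarity(query, x.vec)
);
// ranked[0].text === "pricing plans"
```

This is the comparison the vector store performs for you when you pass an embedding to the search call.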
Quick Start
Prepare Documents
Create a docs/ folder with text files:

mkdir docs
echo "Your knowledge base content here" > docs/guide.txt
Complete Example
rag-chatbot.ts
import {
  AgentRuntime,
  createMessageMemory,
  stringToUuid,
  type UUID,
} from "@elizaos/core";
import { openaiPlugin } from "@elizaos/plugin-openai";
import { plugin as sqlPlugin } from "@elizaos/plugin-sql";
import { v4 as uuidv4 } from "uuid";
import { readFileSync, readdirSync } from "fs";
import { join } from "path";
import * as readline from "readline";
// Document ingestion
async function ingestDocuments(
  runtime: AgentRuntime,
  docsPath: string
): Promise<void> {
  console.log("📚 Ingesting documents...");
  const files = readdirSync(docsPath).filter(
    (f) => f.endsWith(".txt") || f.endsWith(".md")
  );
  for (const file of files) {
    const content = readFileSync(join(docsPath, file), "utf-8");
    // Split into chunks
    const chunks = chunkText(content, 500);
    // Store each chunk with embeddings
    for (let i = 0; i < chunks.length; i++) {
      const chunk = chunks[i];
      // Create embedding
      const embedding = await runtime.embed(chunk);
      // Store in vector database
      await runtime.knowledgeManager!.createMemory({
        id: uuidv4() as UUID,
        roomId: stringToUuid("knowledge-base"),
        entityId: stringToUuid("system"),
        content: {
          text: chunk,
          metadata: {
            source: file,
            chunkIndex: i,
            totalChunks: chunks.length,
          },
        },
        embedding,
      });
    }
    console.log(`  ✅ Processed ${file} (${chunks.length} chunks)`);
  }
  console.log("✅ Document ingestion complete\n");
}
// Text chunking utility
function chunkText(text: string, maxChunkSize: number): string[] {
  const chunks: string[] = [];
  const paragraphs = text.split("\n\n");
  let currentChunk = "";
  for (const paragraph of paragraphs) {
    if (currentChunk.length + paragraph.length > maxChunkSize) {
      if (currentChunk) {
        chunks.push(currentChunk.trim());
      }
      currentChunk = paragraph;
    } else {
      currentChunk += (currentChunk ? "\n\n" : "") + paragraph;
    }
  }
  if (currentChunk) {
    chunks.push(currentChunk.trim());
  }
  return chunks.filter((c) => c.length > 0);
}
// Semantic search
async function searchKnowledge(
  runtime: AgentRuntime,
  query: string,
  limit: number = 3
): Promise<string[]> {
  // Create query embedding
  const queryEmbedding = await runtime.embed(query);
  // Search for similar chunks
  const results = await runtime.knowledgeManager!.searchMemories({
    roomId: stringToUuid("knowledge-base"),
    embedding: queryEmbedding,
    limit,
  });
  return results.map((r) => r.content.text);
}
// Create RAG-enabled character
const character = {
  name: "DocBot",
  bio: "A knowledgeable assistant with access to your document library.",
  system: `You are DocBot, an assistant with access to a knowledge base.
When answering questions:
1. Use the provided context from the knowledge base
2. Be accurate and cite sources when possible
3. If information isn't in the context, say so clearly
4. Provide helpful, detailed answers based on the documents
Always ground your responses in the provided context.`,
};
console.log("🚀 Initializing RAG chatbot...");
// Create runtime
const runtime = new AgentRuntime({
  character,
  plugins: [sqlPlugin, openaiPlugin],
});
await runtime.initialize();

// Ingest documents
const docsPath = "./docs";
await ingestDocuments(runtime, docsPath);

// Setup chat
const userId = uuidv4() as UUID;
const roomId = stringToUuid("rag-chat");
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});
console.log("💬 Ask questions about your documents (type 'exit' to quit)\n");

const prompt = () => {
  rl.question("You: ", async (input) => {
    const text = input.trim();
    if (text.toLowerCase() === "exit") {
      console.log("\n👋 Goodbye!");
      rl.close();
      await runtime.stop();
      process.exit(0);
    }
    if (!text) {
      prompt();
      return;
    }
    // Search knowledge base
    console.log("\n🔍 Searching knowledge base...");
    const relevantChunks = await searchKnowledge(runtime, text, 3);
    console.log(`Found ${relevantChunks.length} relevant chunks\n`);
    // Create message with context
    const contextMessage = `Context from knowledge base:
${relevantChunks.map((chunk, i) => `[${i + 1}] ${chunk}`).join("\n\n")}
---
User question: ${text}`;
    const message = createMessageMemory({
      id: uuidv4() as UUID,
      entityId: userId,
      roomId,
      content: { text: contextMessage },
    });
    // Get response
    process.stdout.write("DocBot: ");
    await runtime.messageService!.handleMessage(
      runtime,
      message,
      async (content) => {
        if (content?.text) {
          process.stdout.write(content.text);
        }
        return [];
      }
    );
    console.log("\n");
    prompt();
  });
};
prompt();
Example Interaction
🚀 Initializing RAG chatbot...
📚 Ingesting documents...
✅ Processed product-guide.txt (12 chunks)
✅ Processed faq.txt (8 chunks)
✅ Processed pricing.txt (5 chunks)
✅ Document ingestion complete
💬 Ask questions about your documents (type 'exit' to quit)
You: What are the pricing plans?
🔍 Searching knowledge base...
Found 3 relevant chunks
DocBot: Based on the pricing information, we offer three plans:
1. **Starter Plan** - $29/month
- Up to 10 users
- 100GB storage
- Email support
2. **Professional Plan** - $99/month
- Up to 50 users
- 1TB storage
- Priority support
- Advanced analytics
3. **Enterprise Plan** - Custom pricing
- Unlimited users
- Custom storage
- 24/7 dedicated support
- Custom integrations
All plans include a 14-day free trial with no credit card required.
You: How do I reset my password?
🔍 Searching knowledge base...
Found 3 relevant chunks
DocBot: To reset your password:
1. Go to the login page
2. Click "Forgot Password" below the login form
3. Enter your email address
4. Check your email for a reset link (check spam folder if not received)
5. Click the link and create a new password
6. Your new password must be at least 8 characters with a mix of letters and numbers
If you don't receive the email within 5 minutes, contact [email protected]
Document Formats
Plain Text
docs/guide.txt
Getting Started
Welcome to our product! This guide will help you get started.
Installation
To install, run: npm install product-name
Configuration
Create a config.json file with your settings...
Markdown
docs/api-reference.md
# API Reference
## Authentication
All API requests require authentication using an API key.
### Headers
## Endpoints
### GET /users
Retrieve list of users...
JSON
For structured data:

import { readFileSync } from "fs";

const data = JSON.parse(readFileSync("./docs/data.json", "utf-8"));
const text = JSON.stringify(data, null, 2); // Convert to text
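Stringifying raw JSON interleaves field names with braces and quotes in the ingested text. A gentler option (a generic sketch, with a hypothetical record shape) is to flatten each record into labeled text lines before chunking, so field names survive in the retrieved chunks:

```typescript
// Flatten JSON records into self-describing "key: value" lines for ingestion.
type Row = Record<string, string | number>;

function recordToText(record: Row): string {
  return Object.entries(record)
    .map(([key, value]) => `${key}: ${value}`)
    .join("\n");
}

// Hypothetical FAQ records, one text chunk per record.
const faq: Row[] = [
  { question: "How do I reset my password?", answer: "Use the Forgot Password link." },
];
const texts = faq.map(recordToText);
// texts[0] === "question: How do I reset my password?\nanswer: Use the Forgot Password link."
```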
Advanced Features
Metadata Filtering
Filter searches by metadata:

async function searchWithFilter(
  runtime: AgentRuntime,
  query: string,
  source: string
) {
  const queryEmbedding = await runtime.embed(query);
  const results = await runtime.knowledgeManager!.searchMemories({
    roomId: stringToUuid("knowledge-base"),
    embedding: queryEmbedding,
    limit: 5,
    filter: {
      "metadata.source": source, // Only search specific document
    },
  });
  return results.map((r) => r.content.text);
}
Citation Tracking
Add source citations to responses. Note that searchKnowledge above returns plain strings; to cite sources you need the full memory objects from searchMemories, which carry the metadata stored at ingestion:

const results = await runtime.knowledgeManager!.searchMemories({
  roomId: stringToUuid("knowledge-base"),
  embedding: await runtime.embed(text),
  limit: 3,
});

const contextWithSources = results
  .map((r) => `[Source: ${r.content.metadata?.source}]\n${r.content.text}`)
  .join("\n\n");

const message = `Context:
${contextWithSources}
User question: ${text}
Please cite sources in your answer using [Source: filename] format.`;
Hybrid Search
Combine vector search with keyword matching. Note the two calls return different shapes: searchKnowledge yields plain strings while searchMemories yields memory objects, so the keyword results must be mapped to text before merging:

async function hybridSearch(
  runtime: AgentRuntime,
  query: string
) {
  // Vector search (returns plain strings)
  const vectorResults = await searchKnowledge(runtime, query, 5);
  // Keyword search (returns memory objects)
  const keywordResults = await runtime.knowledgeManager!.searchMemories({
    roomId: stringToUuid("knowledge-base"),
    query: query, // Text-based search
    limit: 5,
  });
  const keywordTexts = keywordResults.map((r) => r.content.text);
  // Merge and deduplicate
  const unique = Array.from(new Set([...vectorResults, ...keywordTexts]));
  return unique.slice(0, 5);
}
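Simple concatenation discards rank information from each list. A common alternative (not part of the API above; sketched here over plain strings) is reciprocal rank fusion: each item scores 1/(k + rank) in every list it appears in, and the summed scores decide the final order:

```typescript
// Reciprocal rank fusion: items ranked high in either list rise to the top.
// k dampens the influence of rank position; 60 is a conventional default.
function reciprocalRankFusion(lists: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((item, rank) => {
      scores.set(item, (scores.get(item) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([item]) => item);
}

const fused = reciprocalRankFusion([
  ["chunk A", "chunk B", "chunk C"], // e.g. vector results
  ["chunk B", "chunk D"],            // e.g. keyword results
]);
// "chunk B" ranks first: it appears near the top of both lists.
```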
Re-ranking
Re-rank results for better relevance:

async function rerank(
  runtime: AgentRuntime,
  query: string,
  results: string[]
): Promise<string[]> {
  // Use LLM to score relevance
  const scores = await Promise.all(
    results.map(async (result) => {
      const response = await runtime.useModel("TEXT_SMALL", {
        prompt: `Rate how relevant this passage is to the query on a scale of 0-10.
Query: ${query}
Passage: ${result}
Respond with just a number (0-10):`,
      });
      return { result, score: parseFloat(String(response)) || 0 };
    })
  );
  // Sort by score, highest first
  scores.sort((a, b) => b.score - a.score);
  return scores.map((s) => s.result);
}
Document Processing
PDF Support
bun add pdf-parse
import pdf from "pdf-parse";
import { readFileSync } from "fs";

async function processPDF(filePath: string): Promise<string> {
  const dataBuffer = readFileSync(filePath);
  const data = await pdf(dataBuffer);
  return data.text;
}
Web Scraping
bun add cheerio
import { load } from "cheerio";
async function scrapeWebPage(url: string): Promise<string> {
  const response = await fetch(url);
  const html = await response.text();
  const $ = load(html);
  // Extract main content
  $('script, style, nav, footer').remove();
  const text = $('body').text().trim();
  return text;
}
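Scraped pages usually leave long runs of whitespace behind from the page layout; a small normalization pass (a generic helper, not part of cheerio) keeps the resulting chunks clean:

```typescript
// Collapse whitespace runs and drop blank lines left over from page markup.
function normalizeScrapedText(raw: string): string {
  return raw
    .split("\n")
    .map((line) => line.replace(/\s+/g, " ").trim())
    .filter((line) => line.length > 0)
    .join("\n");
}

const cleaned = normalizeScrapedText("  Title  \n\n\n   Some    body   text \n");
// cleaned === "Title\nSome body text"
```

Apply it to the scraper output before chunking so chunk boundaries fall on real content.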
URL Ingestion
async function ingestURL(
  runtime: AgentRuntime,
  url: string
) {
  console.log(`🌐 Fetching ${url}...`);
  const content = await scrapeWebPage(url);
  const chunks = chunkText(content, 500);
  for (let i = 0; i < chunks.length; i++) {
    const embedding = await runtime.embed(chunks[i]);
    await runtime.knowledgeManager!.createMemory({
      id: uuidv4() as UUID,
      roomId: stringToUuid("knowledge-base"),
      entityId: stringToUuid("system"),
      content: {
        text: chunks[i],
        metadata: {
          source: url,
          chunkIndex: i,
        },
      },
      embedding,
    });
  }
  console.log(`✅ Ingested ${url}`);
}
Optimization
Chunk Size Selection
Choose appropriate chunk sizes:

const CHUNK_CONFIG = {
  technical: 800, // Longer for technical docs
  faq: 300, // Shorter for Q&A
  narrative: 1000, // Longer for stories
};
Overlap Strategy
Add overlap between chunks for context (with a guard so the final chunk isn't emitted twice):

function chunkWithOverlap(
  text: string,
  chunkSize: number,
  overlap: number = 100
): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break; // avoid a duplicate tail chunk
    start += chunkSize - overlap;
  }
  return chunks;
}
Caching
Cache embeddings to avoid recomputation:

const embeddingCache = new Map<string, number[]>();

async function getCachedEmbedding(
  runtime: AgentRuntime,
  text: string
): Promise<number[]> {
  const cached = embeddingCache.get(text);
  if (cached) return cached;
  const embedding = await runtime.embed(text);
  embeddingCache.set(text, embedding);
  return embedding;
}
Best Practices
Quality Over Quantity: Better to have well-curated, high-quality documents than massive amounts of noisy data.
Test Retrieval: Manually test searches to ensure relevant chunks are being retrieved.
Update Regularly: Keep your knowledge base current by re-ingesting updated documents.
Monitor Performance: Track which queries work well and which need improvement.
Context Window: Ensure retrieved context fits within your model’s context window.
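For the last point, a rough character budget works as a first cut (a common rule of thumb is about four characters per token for English text): keep adding the best-ranked chunks until the budget runs out.

```typescript
// Keep adding chunks (best first) until a character budget is exhausted.
function fitToBudget(chunks: string[], maxChars: number): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    if (used + chunk.length > maxChars) break;
    kept.push(chunk);
    used += chunk.length;
  }
  return kept;
}

const fitted = fitToBudget(["aaaa", "bbbb", "cccc"], 9);
// fitted === ["aaaa", "bbbb"]  (the third chunk would exceed the budget)
```

For precise budgeting you would count real tokens with your model's tokenizer instead of characters.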
Next Steps
Custom Character
Create specialized knowledge agents
Multi-Agent
Build RAG systems with multiple agents
REST API Server
Expose RAG via HTTP endpoints
RAG Guide
Complete RAG implementation guide