
Overview

The RAG (Retrieval-Augmented Generation) class provides a lightweight knowledge retrieval system that enhances lead evaluations with relevant domain-specific context. It loads markdown files from a knowledge directory and uses keyword-based ranking to retrieve the most relevant documents for a given query.

Constructor

RAG(knowledge_dir="knowledge")

knowledge_dir (string, default: "knowledge")
Directory path containing markdown files (.md) to load into the knowledge corpus. Relative paths are resolved from the current working directory.
from rag import RAG

# Use default knowledge directory
rag = RAG()

# Custom knowledge directory
rag = RAG(knowledge_dir="docs/context")

# Absolute path
rag = RAG(knowledge_dir="/path/to/knowledge")
Automatic Initialization: The constructor automatically loads all .md files from the knowledge directory into memory. No separate loading step is required.
The knowledge corpus is loaded once at initialization and stored in memory for fast retrieval.

Properties

knowledge_dir (string)
Path to the knowledge directory.

corpus (list)
List of loaded knowledge documents. Each document is a dict with two keys: source (the file path) and content (the file's full text).

Methods

retrieve()

Retrieves the most relevant knowledge documents for a given query.
query_text (string, required)
The query text to search for. Typically website content or a specific question.

limit (number, default: 3)
Maximum number of documents to return. Fewer are returned if insufficient relevant documents are found.
Returns:
results (list)
List of relevant document contents (strings), ranked by relevance score. Returns an empty list if no relevant documents are found.
from rag import RAG

rag = RAG()

# Retrieve relevant context
query = "We need help automating our sales process and lead qualification."
results = rag.retrieve(query, limit=3)

print(f"Found {len(results)} relevant documents")
for i, content in enumerate(results, 1):
    print(f"\n--- Document {i} ---")
    print(content[:200] + "...")
The method returns only the document contents (strings), not the full document objects. This format is optimized for direct injection into LLM prompts.
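Because the results are plain strings, they can be joined straight into a prompt. A minimal sketch (the separator and prompt template here are illustrative choices, not part of the API; in practice the list would come from rag.retrieve()):

```python
# Hypothetical retrieved documents; in practice:
#   results = rag.retrieve(query, limit=3)
results = [
    "Sales automation helps businesses streamline lead qualification.",
    "Healthcare providers in Yangon often need digital tooling.",
]

# Join documents into one context block for direct prompt injection.
context_block = "\n\n---\n\n".join(results)
prompt = (
    "Use the following background knowledge when evaluating this lead:\n\n"
    f"{context_block}\n\n"
    "Lead description: <website content here>"
)
print(prompt)
```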

Usage Examples

Basic Retrieval

from rag import RAG

rag = RAG()

# Query with business context
query = "Digital marketing agency specializing in social media"
results = rag.retrieve(query)

if results:
    print(f"Retrieved {len(results)} relevant documents")
    for doc in results:
        print(f"\nDocument length: {len(doc)} chars")
        print(f"Preview: {doc[:150]}...")
else:
    print("No relevant context found")

Integration with Evaluator

from rag import RAG
from evaluator import Evaluator

rag = RAG()
evaluator = Evaluator()

content = """
We are a medical clinic in Yangon offering general practice, 
diagnostics, and preventive care services.
"""

# Retrieve relevant knowledge
context = rag.retrieve(content, limit=3)

# Enhance evaluation with context
result = evaluator.evaluate(content, rag_context=context)

print(f"Evaluation with {len(context)} context documents")
print(f"Fit Score: {result['fit_score']}")
print(f"Reasoning: {result['reasoning']}")

Custom Knowledge Directory

from rag import RAG
import os

# Set up custom knowledge base
knowledge_path = "sales_playbooks"
rag = RAG(knowledge_dir=knowledge_path)

print(f"Loaded {len(rag.corpus)} documents from {knowledge_path}")
for doc in rag.corpus:
    print(f"  - {doc['source']}: {len(doc['content'])} chars")

# Retrieve from custom corpus
results = rag.retrieve("B2B software sales", limit=5)
print(f"\nRetrieved {len(results)} relevant documents")

Checking Corpus Contents

from rag import RAG

rag = RAG()

print("Knowledge Corpus:")
print(f"Total documents: {len(rag.corpus)}\n")

for doc in rag.corpus:
    print(f"Source: {doc['source']}")
    print(f"Length: {len(doc['content'])} characters")
    print(f"Preview: {doc['content'][:100]}...")
    print("-" * 40)

Adjusting Retrieval Limit

from rag import RAG

rag = RAG()
query = "Healthcare services in Southeast Asia"

# Get different amounts of context
for limit in [1, 3, 5]:
    results = rag.retrieve(query, limit=limit)
    total_chars = sum(len(r) for r in results)
    print(f"Limit {limit}: {len(results)} docs, {total_chars} total chars")

Handling Empty Results

from rag import RAG

rag = RAG()

# Query that may not match anything
query = "Quantum computing blockchain AI"
results = rag.retrieve(query)

if not results:
    print("No relevant context found. Proceeding without RAG enhancement.")
    # Continue with evaluation without context
else:
    print(f"Found {len(results)} relevant documents")
    # Use the context

Ranking Algorithm

The RAG class uses a keyword-based scoring system:
1. Text Normalization: converts the query and documents to lowercase and splits them into word sets.
2. Intersection Scoring: the base score is the number of words common to the query and the document.
3. Long Word Boost: adds 0.5 points for each shared long word (more than 4 characters) that appears in the document.
4. High-Value Word Fallback: if no words match, searches the document for domain-specific keywords ("yangon", "bangkok", "medical", "education", "marketing", "digital").
5. Deduplication: removes duplicate documents from the results.
6. Ranking: sorts documents by score (descending) and returns the top N.
This simple keyword approach is fast and requires no external dependencies or ML models, making it ideal for small to medium knowledge bases.
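The steps above can be sketched as standalone functions (a simplified re-implementation for illustration; the shipped retrieve() method may differ in its details):

```python
HIGH_VALUE = {"yangon", "bangkok", "medical", "education", "marketing", "digital"}

def score(query_text: str, doc_text: str) -> float:
    query_words = set(query_text.lower().split())
    doc_words = set(doc_text.lower().split())

    # Base score: number of words shared by query and document.
    common = query_words & doc_words
    s = float(len(common))

    # Boost: +0.5 for each shared word longer than 4 characters.
    s += 0.5 * sum(1 for w in common if len(w) > 4)

    # Fallback: if nothing matched, count high-value domain keywords.
    if s == 0:
        s = float(sum(1 for w in HIGH_VALUE if w in doc_words))
    return s

def rank(query_text: str, docs: list[str], limit: int = 3) -> list[str]:
    # Deduplicate (preserving order), drop zero-score docs,
    # then return the top-N contents by descending score.
    unique = list(dict.fromkeys(docs))
    relevant = [d for d in unique if score(query_text, d) > 0]
    relevant.sort(key=lambda d: score(query_text, d), reverse=True)
    return relevant[:limit]
```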

Scoring Examples

How different queries score:
# Query: "marketing automation sales"
# Document: "Marketing automation helps sales teams..."
# Score: 3 (base) + 1.5 (marketing, automation, sales > 4 chars) = 4.5

# Query: "food restaurant dining"
# Document: "Best practices for restaurant operations..."
# Score: 1 (restaurant) + 0.5 (restaurant boost) = 1.5

# Query: "medical clinic yangon"
# Document: "Healthcare services in Yangon..."
# Score: 2 (medical, yangon) + 1.0 (both > 4 chars) = 3.0

Knowledge Base Setup

Organize your knowledge directory with focused markdown files:
knowledge/
├── sales-automation.md       # Sales process and automation best practices
├── lead-qualification.md     # Lead scoring and qualification guidelines
├── industry-healthcare.md    # Healthcare industry insights
├── industry-education.md     # Education industry insights
├── geography-myanmar.md      # Myanmar market context
└── geography-thailand.md     # Thailand market context
Example markdown file (sales-automation.md):
# Sales Automation Best Practices

Businesses that struggle with manual sales processes benefit greatly from automation.
Key indicators: manual data entry, inconsistent follow-up, lack of lead tracking.

Recommended approach: Start with CRM integration and automated lead scoring.
Keep knowledge files focused on specific topics for better retrieval accuracy. Split large documents into smaller, topic-specific files.

Performance Characteristics

  • Initialization: O(n) where n = number of markdown files (one-time cost)
  • Retrieval: O(m × k) where m = corpus size, k = average document length
  • Memory Usage: All documents loaded into memory (~1-5 MB typical)
  • Latency: <10ms for typical queries with corpus size <100 documents
For very large knowledge bases (>1000 documents), consider implementing a more sophisticated retrieval system with vector embeddings or full-text search.
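As one possible upgrade path short of full vector embeddings, a TF-IDF ranking with cosine similarity can be built from the standard library alone. A hedged sketch (not part of the RAG class; the function name and smoothing are my own choices):

```python
import math
from collections import Counter

def tfidf_rank(query: str, docs: list[str], limit: int = 3) -> list[str]:
    """Rank docs against query by TF-IDF cosine similarity."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)

    # Document frequency: how many docs each term appears in.
    df = Counter()
    for words in tokenized:
        df.update(set(words))

    def vec(words):
        if not words:
            return {}
        tf = Counter(words)
        # Smoothed IDF keeps query-only terms well-defined.
        return {w: (tf[w] / len(words)) * math.log((1 + n) / (1 + df[w]))
                for w in tf}

    def cosine(a, b):
        dot = sum(v * b.get(w, 0.0) for w, v in a.items())
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = vec(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, vec(d.lower().split())),
                    reverse=True)
    return ranked[:limit]
```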

Empty Corpus Handling

If the knowledge directory is empty or doesn’t exist:
rag = RAG(knowledge_dir="nonexistent")
print(len(rag.corpus))  # 0

results = rag.retrieve("any query")
print(results)  # []
The system gracefully handles empty corpora and returns empty lists.

File Loading

The RAG class uses Python’s glob module to find markdown files:
import glob
import os

files = glob.glob(os.path.join(knowledge_dir, "*.md"))
  • Loads only .md files (case-sensitive)
  • Non-markdown files are ignored
  • Subdirectories are not searched (flat directory structure only)
  • Files are loaded with UTF-8 encoding
If you need recursive directory search, modify the _load_corpus() method to use "**/*.md" with recursive=True.
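A recursive variant of that lookup could look like this (a sketch; find_markdown_files is a hypothetical helper, not an existing method):

```python
import glob
import os

def find_markdown_files(knowledge_dir: str) -> list[str]:
    # "**" with recursive=True also matches .md files in subdirectories
    # (and matches zero directories, so top-level files are still found).
    pattern = os.path.join(knowledge_dir, "**", "*.md")
    return sorted(glob.glob(pattern, recursive=True))
```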

Retrieval Best Practices

Optimal Query Length: 50-200 words
  • Too short (< 20 words): May not provide enough keywords for matching
  • Too long (> 500 words): Processing time increases without significant benefit
  • Sweet spot: A paragraph summarizing the business or question
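Since very long inputs add processing time without much benefit, it may help to trim content toward the sweet spot before retrieval. A small hypothetical helper (trim_query is not part of the class):

```python
def trim_query(text: str, max_words: int = 200) -> str:
    # Keep only the first max_words words; keyword overlap
    # rarely improves much beyond this length.
    words = text.split()
    return " ".join(words[:max_words])
```

For example, a full scraped page could be passed as rag.retrieve(trim_query(page_text), limit=3).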

Integration Patterns

Optional Enhancement Pattern

from rag import RAG
from evaluator import Evaluator

evaluator = Evaluator()
rag = RAG()

# Try to get context, but don't fail if RAG fails
try:
    context = rag.retrieve(content)
except Exception as e:
    print(f"RAG failed: {e}. Continuing without context.")
    context = None

# Evaluate with or without context
result = evaluator.evaluate(content, rag_context=context)

Conditional RAG Pattern

from rag import RAG

rag = RAG()

# Only use RAG for specific business types
if business_type in ["healthcare", "education", "legal"]:
    context = rag.retrieve(content, limit=5)
else:
    context = None

result = evaluator.evaluate(content, rag_context=context)

Command-Line Testing

Test the RAG system standalone:
python rag.py
Output:
Retrieved 3 items:
--- Result 1 ---
Sales automation helps businesses streamline their lead qualification process...
--- Result 2 ---
Healthcare providers in Southeast Asia often need...
--- Result 3 ---
Education institutions benefit from digital marketing...
Related

  • LeadEngine - Integrates RAG into the full pipeline
  • Evaluator - Consumes RAG context for enhanced evaluations
