RAG & Retrieval

The @deepagents/retrieval package is a local-first RAG (Retrieval-Augmented Generation) system. It ingests content from various sources, creates vector embeddings for that content, and supports semantic search over the result.

Overview

Build AI applications that can search and retrieve information from:

GitHub repositories and releases
RSS feeds and blog posts
Local files and PDFs
Linear issues
Any custom data source

Key Features

Connector Pattern

Ingest from multiple sources using a unified interface. Each connector handles content fetching, processing, and metadata extraction.

Local Embeddings

Use FastEmbed for local embedding generation with models like BGE-Small-EN-V15. No external API calls required.

SQLite Vector Storage

Store embeddings in SQLite using sqlite-vec for efficient vector similarity search with cosine distance.

Semantic Search

Find relevant content using natural language queries. The system automatically handles embedding, chunking, and similarity ranking.

Content Change Detection

Automatic detection of content changes using SHA-256 hashing. Documents are only re-processed when they’ve actually changed.
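To make hash-based change detection concrete, here is a minimal sketch using Node's built-in crypto module. The names contentHash, needsReprocessing, and lastSeen are hypothetical and are not part of @deepagents/retrieval; the in-memory map stands in for whatever the package actually persists alongside its embeddings.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hex-encoded SHA-256 digest of a document's content.
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Hypothetical in-memory record of the last hash seen per document id.
// A real store would persist this next to the embeddings.
const lastSeen = new Map<string, string>();

// Returns true when the document is new or its content differs from the
// previously recorded hash, and records the new hash as a side effect.
function needsReprocessing(id: string, content: string): boolean {
  const hash = contentHash(content);
  if (lastSeen.get(id) === hash) return false;
  lastSeen.set(id, hash);
  return true;
}
```

Calling needsReprocessing twice with the same id and content returns true the first time and false the second, so the comparatively expensive chunking and embedding steps can be skipped for unchanged documents.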

Quick Start

import { ingest, similaritySearch, fastembed } from '@deepagents/retrieval';
import { SqliteStore } from '@deepagents/retrieval';
import { github } from '@deepagents/retrieval/connectors';
import Database from 'better-sqlite3';

// Set up storage and embeddings
const db = new Database(':memory:');
const store = new SqliteStore(db, 384); // BGE-Small-EN-V15 dimensions
const embedder = fastembed({ model: 'BGESmallENV15' });

// Ingest content from GitHub
await ingest({
  connector: github.file('facebook/react/README.md'),
  store,
  embedder,
});

// Search for relevant content
const results = await similaritySearch('How do I get started with React?', {
  connector: github.file('facebook/react/README.md'),
  store,
  embedder,
});

console.log(results);

Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Connectors    │────│  Ingestion      │────│  Vector Store   │
│                 │    │                 │    │                 │
│ • GitHub        │    │ • Chunking      │    │ • SQLite        │
│ • RSS Feeds     │    │ • Embedding     │    │ • Cosine Search │
│ • Local Files   │    │ • Change Detect │    │ • Metadata      │
│ • PDF Documents │    │                 │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘

Core Concepts

Connectors

Connectors implement the Connector interface and provide:

A unique sourceId for tracking content
An async generator that yields documents with id, content, and metadata
Optional ingestion strategies (never, contentChanged, expired)
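The three pieces above could look roughly like the following sketch. Everything here is an assumption made for illustration: the type names SketchDocument and SketchConnector, the sources method, and the ingestWhen field are hypothetical, and the package's real Connector interface may be shaped differently. Only the ideas (a sourceId, an async generator of documents with id, content, and metadata, and the three strategy names) come from the description above.

```typescript
// Hypothetical shapes; the package's real Connector type may differ.
interface SketchDocument {
  id: string;
  content: string;
  metadata?: Record<string, unknown>;
}

// Strategy names as listed above; the field that carries them is assumed.
type SketchIngestWhen = "never" | "contentChanged" | "expired";

interface SketchConnector {
  sourceId: string;
  ingestWhen?: SketchIngestWhen;
  sources(): AsyncGenerator<SketchDocument>;
}

// A connector over a fixed in-memory set of notes, keyed by document id.
function notes(entries: Record<string, string>): SketchConnector {
  return {
    sourceId: "notes:in-memory",
    ingestWhen: "contentChanged",
    async *sources() {
      for (const [id, content] of Object.entries(entries)) {
        yield { id, content, metadata: { length: content.length } };
      }
    },
  };
}
```

A consumer would iterate it with for await (const doc of notes({ ... }).sources()), which lets a connector stream documents one at a time instead of loading an entire source into memory.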

Ingestion

The ingestion process:

Fetches content from the connector
Splits content into chunks using text splitters
Generates embeddings for each chunk
Stores embeddings with metadata in SQLite
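The chunking step can be illustrated with a simple fixed-size splitter that overlaps adjacent chunks. This is a sketch only: splitText and its size and overlap parameters are hypothetical, and the text splitters the package actually uses are not described here and may split on sentences, tokens, or document structure instead.

```typescript
// Hypothetical fixed-size splitter: windows of `size` characters, each
// starting `size - overlap` characters after the previous one.
function splitText(text: string, size = 200, overlap = 20): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError("require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this window reached the end
  }
  return chunks;
}

// splitText("abcdefghij", 4, 1) returns ["abcd", "defg", "ghij"]
```

The overlap means text that straddles a chunk boundary still appears intact in at least one chunk, at the cost of embedding some characters twice.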

Similarity Search

Search finds relevant content by:

Embedding the query text
Computing cosine similarity with stored embeddings
Ranking results by similarity score
Returning the top N matches
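The ranking steps above can be sketched as a brute-force scan in plain TypeScript. This is an illustration of the technique, not the package's implementation: per the feature list, the actual vector comparison is delegated to sqlite-vec, and the names Stored, cosineSimilarity, and topN are hypothetical. The query embedding is assumed to have been computed already.

```typescript
// Hypothetical stored record: a chunk id plus its embedding vector.
type Stored = { id: string; embedding: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every stored embedding against the query, sort by descending
// similarity, and keep the n best matches.
function topN(query: number[], stored: Stored[], n: number) {
  return stored
    .map(({ id, embedding }) => ({ id, score: cosineSimilarity(query, embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, n);
}
```

Cosine distance is conventionally defined as 1 minus cosine similarity, so ranking by descending similarity and ranking by ascending distance give the same order.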

Use Cases

Documentation Search

Ingest technical documentation and enable natural language search across your knowledge base.

Code Intelligence

Index your codebase and search for relevant code examples, patterns, or implementations.

Content Monitoring

Track RSS feeds, release notes, or blog posts and query for specific topics or updates.

Issue Analysis

Ingest Linear issues or GitHub issues and search for related bugs, features, or discussions.

Next Steps

Installation

Install and set up the retrieval package

Ingestion

Learn how to ingest documents

Connectors

Explore available connectors

Search

Perform semantic similarity search
