Architecture

System Overview

Khoj is a full-stack AI application with a Python backend, Next.js frontend, and multiple client interfaces. It uses PostgreSQL with pgvector for semantic search and supports various AI models through a plugin architecture. Khoj Architecture

High-Level Architecture

Client Layer

Multiple client interfaces communicate with the Khoj server:

Web App (Next.js)
Desktop App (Electron)
Obsidian Plugin
Emacs Package
Android App
API Clients

API Layer

FastAPI-based REST API handles:

Authentication & Authorization
Chat & Search Endpoints
Content Indexing
Agent Management
Automation Triggers

Processing Layer

Core business logic for:

Document Processing
Embedding Generation
Conversation Management
Tool Execution
Research Mode

Storage Layer

PostgreSQL with pgvector for embeddings
Django ORM for database management
File system for static assets

External Services

LLM APIs (OpenAI, Anthropic, Google, etc.)
Speech Recognition (Whisper)
Image Generation
Web Search

Backend Architecture

Directory Structure

src/khoj/
├── app/                    # Django application
│   ├── settings.py        # Django settings
│   ├── urls.py            # URL routing
│   └── wsgi.py            # WSGI application
├── database/              # Database layer
│   ├── models/           # Django models
│   ├── adapters/         # Database adapters
│   └── migrations/       # Schema migrations
├── interface/             # Client interfaces
│   └── built/            # Compiled frontend assets
├── processor/             # Core processing logic
│   ├── content/          # Content processors
│   ├── conversation/     # Chat & conversation
│   ├── embeddings.py     # Embedding generation
│   ├── tools/            # Agent tools
│   └── speech/           # Speech processing
├── routers/               # FastAPI routers
│   ├── api.py            # Main API endpoints
│   ├── api_chat.py       # Chat endpoints
│   ├── api_agents.py     # Agent management
│   ├── api_content.py    # Content indexing
│   ├── auth.py           # Authentication
│   └── helpers.py        # Shared utilities
├── search_filter/         # Search filters
├── search_type/           # Search implementations
├── utils/                 # Utility functions
│   ├── helpers.py        # Helper functions
│   ├── initialization.py # App initialization
│   └── state.py          # Application state
├── configure.py           # Configuration logic
└── main.py               # Application entry point

Core Components

main.py - Application Bootstrap

The entry point that:

Initializes Django
Sets up FastAPI application
Configures CORS middleware
Runs database migrations
Collects static files
Starts the uvicorn server

# Initialize Django
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "khoj.app.settings")
django.setup()

# Initialize FastAPI
app = FastAPI()

# Configure routes and middleware
configure_routes(app)
configure_middleware(app)

routers/ - API Endpoints

FastAPI routers organized by domain:

api.py: Core API (health, config, sync)
api_chat.py: Chat & conversation (70KB+, main chat logic)
api_agents.py: Agent creation and management
api_content.py: Content upload and indexing
api_automation.py: Scheduled automations
api_subscription.py: User subscriptions
auth.py: OAuth, login, registration
helpers.py: Shared router utilities (130KB+)

Example endpoint structure:

@api_chat.post("/chat")
async def chat(
    request: Request,
    q: str,
    conversation_id: str = None,
    user: KhojUser = Depends(get_current_user)
):
    # Chat logic

processor/ - Business Logic

Contains core processing modules:Content Processors (processor/content/):

PDF, Markdown, Org-mode, Word, Notion
Image, plaintext, Github, Docx

Conversation (processor/conversation/):

LLM integrations (OpenAI, Anthropic, Google, Offline)
Prompt management
Context building
Tool calling

Tools (processor/tools/):

Web search
Code execution
Document retrieval
Custom agent tools

Embeddings (processor/embeddings.py):

Text embedding generation
Batch processing
Model management

database/ - Data Layer

Django-based database management:Models (database/models/):

User, Subscription
Conversation, ChatModel
Agent, Entry
SearchModel, Automation
ProcessLock

Adapters (database/adapters/):

High-level database operations
Business logic for data access
Caching and optimization

Migrations (database/migrations/):

Schema version control
Database evolution

utils/ - Utilities

Shared utility modules:

helpers.py: Common helper functions (47KB)
initialization.py: App startup logic
state.py: Application state management
cli.py: Command-line interface
constants.py: Application constants

Frontend Architecture

Web Application

src/interface/web/
├── app/                   # Next.js App Router
│   ├── (app)/            # Main application routes
│   ├── api/              # API routes
│   ├── layout.tsx        # Root layout
│   └── page.tsx          # Home page
├── components/            # React components
│   ├── ui/               # UI primitives (Radix)
│   ├── chat/             # Chat components
│   ├── agents/           # Agent management
│   └── settings/         # Settings UI
├── lib/                   # Utilities
│   ├── utils.ts          # Helper functions
│   └── api.ts            # API client
├── public/                # Static assets
├── styles/                # Global styles
└── package.json           # Dependencies

Key Frontend Technologies

Next.js 14

App Router for file-based routing
Server-side rendering (SSR)
API routes for backend proxying
Static export for production

TypeScript

Type-safe React components
API client with full typing
Enhanced IDE support

Tailwind CSS

Utility-first styling
Responsive design
Custom design system

Radix UI

Accessible primitives
Dialog, Dropdown, Tooltip
Form components

State Management

SWR: Data fetching and caching
React Context: Global state (user, settings)
URL State: Search params for routing
WebSockets: Real-time chat streaming

Client Architecture

Desktop App
Obsidian Plugin
Emacs Package
Android App

Technology: Electron + ReactLocation: src/interface/desktop/Features:

Tray icon integration
System notifications
Auto-start on login
Local file indexing
Embedded web view

Structure:

desktop/
├── main.js          # Electron main process
├── preload.js       # Preload script
├── renderer/        # React renderer
└── package.json

Technology: TypeScriptLocation: src/interface/obsidian/Features:

Inline chat
Search integration
Note indexing
Custom commands

Key Files:

main.ts: Plugin entry point
src/chat_view.ts: Chat interface
src/search_view.ts: Search UI
manifest.json: Plugin metadata

Technology: Emacs LispLocation: src/interface/emacs/Features:

Transient UI for commands
Org-mode integration
Search in buffers
Chat interface

Key Files:

khoj.el: Main package
khoj-chat.el: Chat functionality
khoj-search.el: Search features

Technology: KotlinLocation: src/interface/android/Features:

Native mobile UI
Voice input
Push notifications
File sharing

Structure:

android/
├── app/src/main/
│   ├── java/dev/khoj/
│   ├── res/
│   └── AndroidManifest.xml
└── build.gradle

Data Flow

Search & Retrieval

Query Ingestion

User submits search query through client → API endpoint

Embedding Generation

Query text converted to vector embedding using Sentence Transformers

Vector Search

pgvector performs cosine similarity search in PostgreSQL

Re-ranking

Top results re-ranked using cross-encoder model for accuracy

Filter Application

Apply date, file, and content filters to results

Response

Ranked results returned to client with metadata

Chat Conversation

Message Ingestion

User message received at /api/chat endpoint

Context Retrieval

Relevant documents retrieved using semantic search

Tool Planning

Agent determines which tools to use (search, calculate, etc.)

Tool Execution

Tools executed in sequence, results collected

Prompt Construction

System prompt + conversation history + context + tool results

LLM Generation

Prompt sent to configured LLM (OpenAI, Anthropic, etc.)

Streaming Response

Response streamed back to client via WebSocket/SSE

History Storage

Conversation saved to database for future context

Content Indexing

Content Upload

Files/URLs uploaded through /api/content

Format Detection

File type detected using mime type and magika

Content Extraction

Appropriate processor extracts text (PDF, Markdown, etc.)

Chunking

Text split into semantic chunks using LangChain

Embedding Generation

Each chunk converted to vector embedding

Storage

Embeddings and metadata stored in PostgreSQL with pgvector

Indexing Complete

Content now searchable and available for chat context

Database Schema

Key Models

User & Authentication

class KhojUser(AbstractUser):
    uuid = models.UUIDField(default=uuid.uuid4)
    email = models.EmailField(unique=True)
    phone_number = models.CharField()
    is_active = models.BooleanField(default=True)
    subscription = models.ForeignKey(Subscription)

Conversation

class Conversation(models.Model):
    user = models.ForeignKey(KhojUser)
    conversation_log = models.JSONField(default=dict)
    agent = models.ForeignKey(Agent, null=True)
    title = models.CharField()
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

Entry (Indexed Content)

class Entry(models.Model):
    user = models.ForeignKey(KhojUser)
    embeddings = VectorField(dimensions=384)  # pgvector
    raw = models.TextField()
    compiled = models.TextField()
    heading = models.CharField()
    file_source = models.CharField()
    file_type = models.CharField()
    created_at = models.DateTimeField(auto_now_add=True)

Agent

class Agent(models.Model):
    creator = models.ForeignKey(KhojUser)
    name = models.CharField()
    personality = models.TextField()
    chat_model = models.ForeignKey(ChatModelOptions)
    tools = models.JSONField(default=list)
    files = models.ManyToManyField(Entry)
    public = models.BooleanField(default=False)

Testing Architecture

Test Organization

tests/
├── conftest.py              # Pytest fixtures
├── helpers.py               # Test utilities
├── test_client.py           # API endpoint tests
├── test_agents.py           # Agent functionality
├── test_conversation_utils.py
├── test_date_filter.py
├── test_file_filter.py
├── test_grep_files.py
├── data/                    # Test data
└── evals/                   # Evaluation tests

Testing Stack

pytest

Unit and integration tests
Fixtures for database setup
Parallel test execution

pytest-django

Django test database
Model factories
Transaction management

pytest-asyncio

Async endpoint testing
WebSocket tests

factory-boy

Test data generation
Model factories

CI/CD Pipeline

GitHub Actions Workflows

test.yml
pre-commit.yml
dockerize.yml
pypi.yml
desktop.yml

Runs on every PR and push to master:

Matrix testing on Python 3.10, 3.11, 3.12
PostgreSQL service with pgvector
Full test suite execution
Type checking with mypy

Performance Considerations

Embedding Caching

Embeddings cached to avoid recomputation. Incremental updates only process changed content.

Connection Pooling

PostgreSQL connection pooling via Django to handle concurrent requests efficiently.

Async Processing

FastAPI async endpoints allow non-blocking I/O for external API calls.

Vector Indexing

pgvector indexes optimize similarity search performance for large datasets.

For detailed performance metrics, see the Performance Guide.

Security Architecture

Authentication

OAuth 2.0 (Google, GitHub)
JWT-based sessions
Magic link login

Authorization

User-scoped data access
Agent permission model
Public vs private agents

Data Protection

HTTPS-only in production
Environment variable secrets
Secure password hashing

Input Validation

Pydantic models for API validation
File upload size limits
Content sanitization

Extension Points

Adding New Content Types

Create processor in processor/content/
Implement ContentProcessor interface
Register in content type registry
Add tests in tests/

Adding New LLM Providers

Create provider class in processor/conversation/
Implement LLMProvider interface
Add configuration in settings
Update model selection UI

Adding New Tools

Create tool in processor/tools/
Define tool schema
Register in agent toolset
Add to tool selection UI

Visual References

Codebase Visualization

Interactive visualization of the entire codebase structure

Architecture Diagram

Refer to “ for the complete system diagram

Additional Resources

Development Setup

Set up your local environment

Performance Guide

Optimization best practices

API Documentation

Complete API reference

GitHub Repository

View the source code

Development

​System Overview

​High-Level Architecture

​Backend Architecture

​Directory Structure

​Core Components

​Frontend Architecture

​Web Application

​Key Frontend Technologies

Next.js 14

TypeScript

Tailwind CSS

Radix UI

​State Management

​Client Architecture

​Data Flow

​Search & Retrieval

​Chat Conversation

​Content Indexing

​Database Schema

​Key Models

​Testing Architecture

​Test Organization

​Testing Stack