Quark is a Retrieval-Augmented Generation (RAG) system that lets you ingest PDFs and other documents, then ask questions about them in natural language. It processes both text and images, stores context across sessions using a dual-layer memory system, and cites the source page for every answer.

Quick start

Set up Quark and chat with your first document in minutes.

Architecture

Understand the ingestion pipeline, memory system, and vector search.

CLI usage

Learn all CLI commands for ingesting documents and querying them locally.

API reference

Integrate Quark into your application via the REST API.

How it works

1

Ingest your documents

Point Quark at a PDF. It extracts text and images using Unstructured.io and pdfplumber, then embeds everything into a Qdrant vector database.
2

Ask a question

Type your question in the CLI or send it to the /chat/completions endpoint. Quark retrieves the most relevant chunks and re-ranks them for precision.
3

Get a grounded answer

The LLM responds using only your document content, citing sources by page number. Short-term context is kept in Redis; long-term memory is compressed into Mem0.
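The three steps above come together in a single request to the `/chat/completions` endpoint. A minimal client sketch follows; the base URL, port, and exact request body shape (an OpenAI-style `messages` array plus a `session_id`) are assumptions for illustration — check the API reference for the actual schema:

```typescript
// Hypothetical request shape for Quark's /chat/completions endpoint.
// The session_id field and messages array are assumptions, not the
// documented schema.
interface ChatRequest {
  session_id: string;
  messages: { role: "user" | "assistant"; content: string }[];
}

function buildChatRequest(sessionId: string, question: string): ChatRequest {
  return {
    session_id: sessionId,
    messages: [{ role: "user", content: question }],
  };
}

async function ask(question: string): Promise<string> {
  // Assumes a local Quark server on port 3000.
  const res = await fetch("http://localhost:3000/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest("demo-session", question)),
  });
  const data = await res.json();
  // The answer is grounded in your documents and cites page numbers.
  return data.choices?.[0]?.message?.content ?? "";
}
```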

Key features

Multimodal ingestion

Processes text, tables, and images from PDFs. Visual elements are described by a vision LLM and made searchable.

Dual-layer memory

Redis handles short-term session context. Mem0 stores long-term user preferences and history across sessions.
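To make the split concrete, here is an in-memory sketch of the two layers. In Quark proper, the short-term layer is Redis (per-session, expiring keys) and the long-term layer is Mem0 (compressed facts that survive across sessions); the class and method names below are illustrative, not Quark's API:

```typescript
// Illustrative dual-layer memory. Short-term entries expire like Redis
// keys with a TTL; long-term facts persist across sessions, like Mem0.
class DualMemory {
  private shortTerm = new Map<string, { text: string; expiresAt: number }[]>();
  private longTerm: string[] = [];

  // Session turns expire after a TTL (default 30 minutes here).
  remember(sessionId: string, text: string, ttlMs = 30 * 60 * 1000): void {
    const turns = this.shortTerm.get(sessionId) ?? [];
    turns.push({ text, expiresAt: Date.now() + ttlMs });
    this.shortTerm.set(sessionId, turns);
  }

  // Durable facts (preferences, history) outlive any one session.
  compressToLongTerm(fact: string): void {
    this.longTerm.push(fact);
  }

  // Context for a prompt: long-term facts plus unexpired session turns.
  context(sessionId: string): string[] {
    const now = Date.now();
    const recent = (this.shortTerm.get(sessionId) ?? [])
      .filter((t) => t.expiresAt > now)
      .map((t) => t.text);
    return [...this.longTerm, ...recent];
  }
}
```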

Re-ranked retrieval

VoyageAI embeddings combined with a re-ranking pass deliver highly relevant context to the LLM.
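The two stages can be sketched as follows. Stage one is a recall-oriented cosine-similarity search over embeddings (in Quark, VoyageAI vectors stored in Qdrant); stage two re-ranks the shortlist for precision. The re-ranker here is a pluggable score function standing in for a real re-ranking model:

```typescript
// Two-stage retrieval sketch: broad vector search, then re-ranking.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface Chunk { text: string; embedding: number[]; page: number }

function retrieve(
  query: number[],
  chunks: Chunk[],
  rerank: (c: Chunk) => number, // stand-in for a reranker model's score
  firstPassK = 20,
  finalK = 5,
): Chunk[] {
  // Stage 1: cheap, recall-oriented vector search over all chunks.
  const candidates = chunks
    .map((c) => ({ c, sim: cosine(query, c.embedding) }))
    .sort((x, y) => y.sim - x.sim)
    .slice(0, firstPassK);
  // Stage 2: precision-oriented re-ranking of the shortlist.
  return candidates
    .map(({ c }) => ({ c, score: rerank(c) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, finalK)
    .map(({ c }) => c);
}
```

The design point is that the expensive re-ranking pass only ever sees the small first-pass shortlist, which is why the combination stays fast while improving relevance.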

Grounded responses

The LLM is instructed to answer only from your document content and to cite every source, so responses stay grounded in what you ingested rather than hallucinated.

CLI interface

A full-featured terminal UI with session management, ingest tracking, and chat history — no browser needed.

REST API

An Elysia-powered HTTP server exposes ingestion, retrieval, and session management endpoints for programmatic access.
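As a sketch of what the chat endpoint's handler might look like, here is a framework-agnostic version. The `/chat/completions` path comes from this page; the request/response shapes and the `makeChatHandler` factory are assumptions for illustration. In the real server this would be mounted on an Elysia route, e.g. `new Elysia().post("/chat/completions", ({ body }) => handler(body))`:

```typescript
// Hypothetical handler for a Quark-style chat endpoint. Shapes are
// illustrative, not the documented API.
interface ChatBody { session_id: string; question: string }
interface ChatReply { answer: string; sources: { page: number }[] }

// Stand-in for the retrieval + LLM pipeline described above.
type Pipeline = (question: string) => { answer: string; pages: number[] };

function makeChatHandler(pipeline: Pipeline) {
  return (body: ChatBody): ChatReply => {
    const { answer, pages } = pipeline(body.question);
    // Every answer carries its source pages, per Quark's grounding rule.
    return { answer, sources: pages.map((page) => ({ page })) };
  };
}
```

Keeping the handler a pure function of its body makes it trivial to unit-test without standing up the HTTP server.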
