All services that Quark depends on are free to use and do not require a credit card for registration.
What Quark does
Quark solves a fundamental problem with general-purpose LLMs: they make things up. When you ask a question about a document, a standard chatbot may confidently answer with information that was never in the file. Quark prevents this by constraining the LLM to only what it retrieved from your ingested documents, and by citing the source for every claim. Beyond retrieval accuracy, Quark is aware of context across a session (Redis short-term memory) and across sessions (Mem0 long-term memory), so it remembers your preferences and prior conversations without you repeating yourself.
Key capabilities
Multimodal document ingestion
Parses PDFs for both text and images. Text is partitioned by Unstructured.io; images are extracted by pdfplumber. A custom sync layer aligns both modalities before embedding.
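The sync layer itself is not shown on this page, so the following is only a minimal sketch of one way position-based alignment could work. The `TextChunk` and `ImageChunk` shapes, the `top` coordinate, and the nearest-neighbor rule are all assumptions made for illustration, not Quark's actual types or algorithm.

```typescript
// Hypothetical chunk shapes; the real Quark types are not documented here.
interface TextChunk {
  id: string;
  page: number;
  top: number; // assumed: distance from the top of the page, in PDF points
  text: string;
}

interface ImageChunk {
  id: string;
  page: number;
  top: number;
}

interface SyncedChunk extends TextChunk {
  imageIds: string[];
}

// Attach each image to the vertically closest text chunk on the same page.
function syncByPosition(texts: TextChunk[], images: ImageChunk[]): SyncedChunk[] {
  const synced: SyncedChunk[] = texts.map((t) => ({ ...t, imageIds: [] }));
  for (const image of images) {
    let best: SyncedChunk | undefined;
    for (const candidate of synced) {
      if (candidate.page !== image.page) continue;
      if (
        best === undefined ||
        Math.abs(candidate.top - image.top) < Math.abs(best.top - image.top)
      ) {
        best = candidate;
      }
    }
    // Images on pages without any text stay unattached in this sketch.
    if (best !== undefined) best.imageIds.push(image.id);
  }
  return synced;
}

const syncedExample = syncByPosition(
  [
    { id: "t1", page: 1, top: 100, text: "Figure 1 shows the architecture." },
    { id: "t2", page: 1, top: 400, text: "Results are summarized below." },
    { id: "t3", page: 2, top: 50, text: "Appendix" },
  ],
  [
    { id: "img1", page: 1, top: 130 },
    { id: "img2", page: 2, top: 300 },
  ],
);
```

Carrying the image ids on the text chunk means a retrieved passage can bring its figures along without a second lookup.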
Dual-layer memory
Short-term memory (STM) is handled by Redis for rapid in-session context. Long-term memory (LTM) is powered by Mem0 to persist user history and preferences across sessions.
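As a rough illustration of the short-term layer, the sketch below keeps a rolling window of recent turns per session. It uses an in-memory `Map` as a stand-in for Redis so that it runs without credentials; the `ShortTermMemory` class, its method names, and the window size are hypothetical and not taken from Quark's code.

```typescript
type Role = "user" | "assistant";

interface Turn {
  role: Role;
  content: string;
}

// In-memory stand-in for a Redis-backed store (illustrative only).
class ShortTermMemory {
  private readonly maxTurns: number;
  private readonly sessions = new Map<string, Turn[]>();

  constructor(maxTurns: number) {
    if (maxTurns < 1) throw new Error("maxTurns must be at least 1");
    this.maxTurns = maxTurns;
  }

  // Append a turn and drop anything older than the newest maxTurns entries.
  // On a Redis list, RPUSH followed by LTRIM gives the same rolling window.
  append(sessionId: string, turn: Turn): void {
    const turns = this.sessions.get(sessionId) ?? [];
    turns.push(turn);
    this.sessions.set(sessionId, turns.slice(-this.maxTurns));
  }

  // Return a copy so callers cannot mutate the stored window.
  recent(sessionId: string): Turn[] {
    return [...(this.sessions.get(sessionId) ?? [])];
  }
}

const stm = new ShortTermMemory(2);
stm.append("session-1", { role: "user", content: "Summarize chapter 2." });
stm.append("session-1", { role: "assistant", content: "Chapter 2 covers..." });
stm.append("session-1", { role: "user", content: "Shorter, please." });
const windowed = stm.recent("session-1");
```

Bounding the window keeps the prompt small; anything worth remembering beyond it is the job of the long-term layer.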
Vector search with Qdrant
Document chunks are embedded with VoyageAI and stored in Qdrant. At query time, the most relevant vectors are retrieved and passed to the LLM as context.
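To make "most relevant vectors" concrete, here is a small brute-force top-k search using cosine similarity. This is only a sketch: Qdrant performs indexed approximate search and its distance metric is configurable per collection, and this page does not state which metric Quark uses, so cosine is an assumption. The `StoredVector` shape and the function names are illustrative.

```typescript
interface StoredVector {
  id: string;
  vector: number[];
}

interface ScoredVector {
  id: string;
  score: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

// Score every stored vector against the query and keep the k best.
function topK(query: number[], stored: StoredVector[], k: number): ScoredVector[] {
  return stored
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const hits = topK(
  [1, 0],
  [
    { id: "a", vector: [1, 0] },
    { id: "b", vector: [0, 1] },
    { id: "c", vector: [1, 1] },
  ],
  2,
);
```

A linear scan like this is fine for a handful of chunks; a vector database exists precisely to avoid it at scale.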
CLI interface
A full-featured terminal UI for ingesting documents, managing sessions, and chatting — no browser required.
REST API
An Elysia-powered HTTP server exposes ingestion, retrieval, and session management endpoints for programmatic access.
Local chat history
All conversation logs are stored in a local SQLite database. Your data stays on your system.
Services Quark relies on
Quark is assembled from several best-in-class services. You will need an account for each one to run the full system.

| Service | Role |
|---|---|
| Groq (or any OpenAI-compatible provider) | LLM inference |
| VoyageAI | Text embeddings |
| Unstructured.io | Document partitioning (text + tables) |
| Qdrant | Vector database |
| Mem0 | Long-term memory |
| Upstash Redis | Short-term session memory |
| ElasticLake | Object storage |
| Supabase | Relational database |
Architecture at a glance
Quark follows a Modular RAG pattern. Ingestion of images and text is decoupled and re-synced at the metadata level, preserving higher contextual integrity than text-only pipelines. The dual-memory layer mimics human cognitive function by separating immediate recall (Redis) from historical knowledge (Mem0).
When you ingest a document, Quark runs it through a multi-stage pipeline:
- Parse: Unstructured.io splits text; pdfplumber extracts images.
- Sync: A custom sync layer aligns text and image chunks by position.
- Embed: VoyageAI converts each chunk into a dense vector.
- Store: Vectors are upserted into Qdrant; objects go to ElasticLake.
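The last two ingestion stages can be pictured as pairing each chunk with its embedding and shaping the result for an upsert. The sketch below builds records in the id, vector, payload layout that Qdrant points use; the payload fields (`source`, `page`, `imageKeys`), the numeric id scheme, and the `buildPoints` helper are assumptions for illustration, and no network call is made.

```typescript
// Hypothetical chunk shape after the sync stage.
interface EmbeddableChunk {
  text: string;
  page: number;
  imageKeys: string[]; // assumed: object-storage keys of attached images
}

interface VectorPoint {
  id: number;
  vector: number[];
  payload: { source: string; text: string; page: number; imageKeys: string[] };
}

// Pair chunks with their embeddings, one vector per chunk, in order.
function buildPoints(
  source: string,
  chunks: EmbeddableChunk[],
  vectors: number[][],
  firstId = 0,
): VectorPoint[] {
  if (chunks.length !== vectors.length) {
    throw new Error(`expected ${chunks.length} vectors, got ${vectors.length}`);
  }
  return chunks.map((chunk, i) => ({
    id: firstId + i,
    vector: vectors[i],
    payload: { source, ...chunk },
  }));
}

const points = buildPoints(
  "report.pdf",
  [
    { text: "Revenue grew in Q3.", page: 3, imageKeys: ["report/p3-chart.png"] },
    { text: "Costs were flat.", page: 4, imageKeys: [] },
  ],
  [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
  ],
  10,
);
```

Keeping the source file and page in the payload is what later allows an answer to cite where each claim came from.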
When you ask a question, Quark:
- Embeds your query with VoyageAI.
- Retrieves the top matching chunks from Qdrant.
- Hydrates the prompt with STM context from Redis and LTM context from Mem0.
- Sends the grounded prompt to the LLM and streams the response.
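The hydration step can be sketched as a pure function that combines retrieved chunks, recent turns, and long-term memories into a chat message list. The role and content message format is the one used by OpenAI-compatible chat APIs; the prompt wording, the `RetrievedChunk` shape, and `buildGroundedMessages` are hypothetical and not Quark's actual prompt.

```typescript
interface RetrievedChunk {
  source: string;
  page: number;
  text: string;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildGroundedMessages(
  question: string,
  chunks: RetrievedChunk[],
  recentTurns: ChatMessage[], // short-term context, oldest first
  memories: string[], // long-term facts about the user
): ChatMessage[] {
  // Number each chunk so the model can cite it as [n].
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.source}, p.${c.page}) ${c.text}`)
    .join("\n");
  const memoryBlock =
    memories.length > 0 ? memories.map((m) => `- ${m}`).join("\n") : "(none)";
  const system = [
    "Answer using only the numbered context below and cite sources as [n].",
    "If the context does not contain the answer, say so.",
    "",
    "Context:",
    context || "(no matching chunks)",
    "",
    "Known user preferences:",
    memoryBlock,
  ].join("\n");
  return [
    { role: "system", content: system },
    ...recentTurns,
    { role: "user", content: question },
  ];
}

const messages = buildGroundedMessages(
  "What happened to revenue in Q3?",
  [{ source: "report.pdf", page: 3, text: "Revenue grew in Q3." }],
  [
    { role: "user", content: "Focus on the finance report." },
    { role: "assistant", content: "Understood." },
  ],
  ["Prefers short answers"],
);
```

Telling the model to admit when the context lacks an answer is a common way to discourage it from filling gaps with invented details.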