Ingestion runs a document through the full pipeline: Unstructured.io partitions the text, pdfplumber extracts images, VoyageAI embeds each chunk, and the resulting vectors are upserted into Qdrant.
Ingestion requires a valid Unstructured.io API key and URL, both configured in your .env file. The pipeline exits with an error if either is missing. See Configuration for details.
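As an illustration of that startup check, here is a minimal sketch of a fail-fast guard. The variable names `UNSTRUCTURED_API_KEY` and `UNSTRUCTURED_API_URL` are assumptions made for this example, not names confirmed by this page, and the sketch assumes the values from .env have already been loaded into the process environment. Quark's actual check may differ.

```python
import os
import sys

# Hypothetical variable names; see the Configuration page for the real ones.
REQUIRED_VARS = ("UNSTRUCTURED_API_KEY", "UNSTRUCTURED_API_URL")


def missing_settings(env):
    """Return the required variable names that are absent or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def check_ingestion_config(env=None):
    """Exit with an error message if any required setting is missing."""
    env = os.environ if env is None else env
    missing = missing_settings(env)
    if missing:
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("Ingestion needs these settings in .env: " + ", ".join(missing))
```

Calling `check_ingestion_config()` before any document is read keeps a missing key from surfacing halfway through a long ingestion run.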

CLI ingestion

Use the :ingest command inside the running CLI:
:ingest <path> [institution] [course]
1. Run the ingest command

Provide a path to your PDF. Wrap paths that contain spaces in double quotes:
# Simple path
:ingest /home/user/notes.pdf

# Path with spaces
:ingest "/home/user/lecture notes/week3.pdf" MIT biology

# Full metadata
:ingest /home/user/chem101.pdf Stanford chemistry
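The quoting rule above matches how a shell-style tokenizer behaves. The following is a rough sketch of how an `:ingest` line could be split into its arguments with Python's standard `shlex` module. The `IngestArgs` type and `parse_ingest` function are hypothetical names for this example and are not part of Quark. The sketch also ignores session state, so it always falls back to "Default" when no institution is given.

```python
import shlex
from typing import NamedTuple, Optional


class IngestArgs(NamedTuple):
    """Hypothetical container for the parsed :ingest arguments."""
    path: str
    institution: str
    course: Optional[str]


def parse_ingest(line: str) -> IngestArgs:
    """Split an :ingest command line, honoring double-quoted paths."""
    tokens = shlex.split(line)  # keeps "a path with spaces" as one token
    if not tokens or tokens[0] != ":ingest":
        raise ValueError("not an :ingest command")
    args = tokens[1:]
    if not 1 <= len(args) <= 3:
        raise ValueError("usage: :ingest <path> [institution] [course]")
    path = args[0]
    institution = args[1] if len(args) > 1 else "Default"
    course = args[2] if len(args) > 2 else None
    return IngestArgs(path, institution, course)
```

Without the quotes, a path such as /home/user/lecture notes/week3.pdf would split into two tokens, and the second half of the path would be read as the institution.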
2. Wait for ingestion to complete

A progress indicator is shown while the document is processed. When ingestion finishes, Quark prints the chunk counts:
✓  notes.pdf — 42 chunks  3 visual
3. Verify with :docs

Run :docs to confirm the document appears in the active session’s ingest log.

Arguments

| Argument | Required | Default | Description |
| --- | --- | --- | --- |
| path | Yes | | Absolute or relative path to the PDF. Quote if the path contains spaces. |
| institution | No | "Default" | Tag used to scope vector search. Persists for the rest of the session. |
| course | No | (none) | Free-text label for the course or topic. |
Ingest multiple documents tagged with the same institution value, then use that value when querying to scope answers to a specific document set.
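To make the scoping idea concrete, the sketch below builds a filter in the JSON shape that Qdrant's search API accepts (a `must` list of `key` / `match` conditions). It assumes the institution and course tags are stored in each point's payload under the keys `institution` and `course`. Those key names, and the `institution_filter` helper itself, are assumptions for illustration; this page does not describe how Quark stores or queries the tags.

```python
def institution_filter(institution, course=None):
    """Build a Qdrant-style filter dict that restricts search to one tag set."""
    # Payload key names are assumed for this sketch.
    conditions = [{"key": "institution", "match": {"value": institution}}]
    if course is not None:
        conditions.append({"key": "course", "match": {"value": course}})
    # Every condition under "must" has to hold for a point to be returned.
    return {"must": conditions}
```

For example, `institution_filter("MIT")` restricts results to chunks ingested with the MIT tag, so two documents ingested under that tag are searched together while documents tagged Stanford are left out.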
