Three-phase pipeline
Planning → Researching → Reporting with six specialized agents.
Parallel execution
Research up to 5 subtopics concurrently with thread-safe citation management.
Unified citations
Every source gets a unique ID (PLAN-XX or CIT-X-XX) tracked in a central registry.
Research presets
Choose quick, medium, deep, or auto to control scope and depth.
How to use
Enter a topic
Type your research topic in the input field. The RephraseAgent may prompt you to refine the topic before proceeding (configurable).
Choose a research preset
Select Quick, Medium, Deep, or Auto based on how much depth you need. See presets for details.
Monitor progress
Watch real-time progress across all three phases. In parallel mode, multiple subtopics display concurrent status indicators.
Research presets
Each preset is a named configuration that controls how many subtopics are generated and how many research iterations run per subtopic.

| Preset | Subtopics | Iterations per subtopic | Iteration mode | Minimum report section |
|---|---|---|---|---|
| quick | 2 | 2 | Fixed | 300 words |
| medium | 5 | 4 | Fixed | 500 words |
| deep | 8 | 7 | Fixed | 800 words |
| auto | up to 8 | up to 6 | Flexible | 500 words |
- Fixed mode: the agent always completes all iterations regardless of how much it has found.
- Flexible mode (auto only): the agent stops early when it judges the collected knowledge sufficient.
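The preset table and the two iteration modes can be summarized in a small Python sketch. The class name `ResearchPreset`, its field names, and the `should_stop` helper are illustrative assumptions, not the project's actual configuration schema; only the numbers come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResearchPreset:
    """One row of the presets table (field names are hypothetical)."""
    max_subtopics: int
    max_iterations: int
    flexible: bool  # True only for the "auto" preset
    min_section_words: int


PRESETS = {
    "quick": ResearchPreset(2, 2, False, 300),
    "medium": ResearchPreset(5, 4, False, 500),
    "deep": ResearchPreset(8, 7, False, 800),
    "auto": ResearchPreset(8, 6, True, 500),  # "up to" 8 subtopics, "up to" 6 iterations
}


def should_stop(preset: ResearchPreset, completed: int, sufficient: bool) -> bool:
    """Decide whether the per-subtopic loop ends after `completed` iterations."""
    if completed >= preset.max_iterations:
        return True  # the iteration budget is exhausted in either mode
    # Fixed mode ignores the sufficiency judgment; flexible mode may stop early.
    return preset.flexible and sufficient
```

Under these assumptions, a fixed preset such as quick keeps going until its budget is spent even when the knowledge is judged sufficient, while auto can stop after the first iteration.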
Three-phase architecture
Phase 1 — Planning
The pipeline starts by turning your input into a structured research plan.
RephraseAgent optimizes your topic for research, with support for multi-turn refinement. The output is a focused topic description (200–400 words) that guides subsequent decomposition.
DecomposeAgent breaks the topic into subtopics. It operates in one of two modes:
- Manual mode: generates a fixed number of RAG sub-queries, retrieves background context for each, then produces exactly that many subtopics.
- Auto mode: issues a single RAG query and lets the LLM identify the most relevant subtopics up to a configured maximum.
Each subtopic is wrapped in a TopicBlock with status PENDING and added to the DynamicTopicQueue.
Phase 2 — Researching
The queue drives an iterative research loop. Each TopicBlock moves through states: PENDING → RESEARCHING → COMPLETED (or FAILED).
ManagerAgent handles scheduling: it picks the next pending block, marks it as active, and records completion or failure. It also accepts dynamically discovered subtopics added during research.
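The block lifecycle and scheduling described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `TopicQueue` is a hypothetical stand-in for DynamicTopicQueue, and the field and method names are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class TopicStatus(Enum):
    PENDING = "pending"
    RESEARCHING = "researching"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class TopicBlock:
    block_id: int
    title: str
    status: TopicStatus = TopicStatus.PENDING


@dataclass
class TopicQueue:
    """Hypothetical stand-in for DynamicTopicQueue."""
    blocks: List[TopicBlock] = field(default_factory=list)

    def add(self, title: str) -> TopicBlock:
        # Used for planned subtopics and for ones discovered during research.
        block = TopicBlock(block_id=len(self.blocks) + 1, title=title)
        self.blocks.append(block)
        return block

    def next_pending(self) -> Optional[TopicBlock]:
        # Pick the first PENDING block and mark it as active.
        for block in self.blocks:
            if block.status is TopicStatus.PENDING:
                block.status = TopicStatus.RESEARCHING
                return block
        return None

    def finish(self, block: TopicBlock, ok: bool = True) -> None:
        # Record completion or failure.
        block.status = TopicStatus.COMPLETED if ok else TopicStatus.FAILED
```

Because `add` only appends, a subtopic discovered while another block is being researched is picked up by a later `next_pending` call without disturbing blocks already in progress.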
ResearchAgent runs the per-topic loop. In each iteration, it:
- Checks whether accumulated knowledge is sufficient (check_sufficiency).
- Plans the next query: selects a tool, writes the query, and optionally flags a newly discovered related topic for the queue.
- Calls the selected tool.
Each tool call is recorded as a ToolTrace that carries the pre-assigned citation_id.
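A rough Python sketch of this per-topic loop is shown below. Everything except the names `check_sufficiency`, `ToolTrace`, and `citation_id` is an assumption made for illustration: the function signature, the `plan_next` callback, and the use of the iteration number as the citation sequence counter are hypothetical. Flagging newly discovered topics is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ToolTrace:
    citation_id: str
    tool: str
    query: str
    result: str


def research_topic(
    block_index: int,
    topic: str,
    tools: Dict[str, Callable[[str], str]],
    plan_next: Callable[[str, List[ToolTrace]], Tuple[str, str]],
    check_sufficiency: Callable[[str, List[ToolTrace]], bool],
    max_iterations: int,
    flexible: bool,
) -> List[ToolTrace]:
    """Run the iteration loop for one topic and return its tool traces."""
    traces: List[ToolTrace] = []
    for step in range(1, max_iterations + 1):
        # 1. Check sufficiency; only flexible mode is allowed to stop early.
        sufficient = check_sufficiency(topic, traces)
        if flexible and sufficient:
            break
        # 2. Plan the next query: choose a tool and write the query.
        tool_name, query = plan_next(topic, traces)
        # 3. Call the tool; the citation ID is assigned before the call.
        citation_id = f"CIT-{block_index}-{step:02d}"
        result = tools[tool_name](query)
        traces.append(ToolTrace(citation_id, tool_name, query, result))
    return traces
```

Passing the planner, the sufficiency check, and the tools as callables keeps the loop itself free of any LLM dependency, which makes it easy to exercise with stubs.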
Execution modes
- Series: topics are researched one at a time. Simpler, lower resource usage.
- Parallel: multiple subtopics (up to 5) are researched concurrently, with thread-safe citation management.
Phase 3 — Reporting
ReportingAgent builds the final document:
- Deduplication: semantically similar blocks are identified and merged.
- Outline generation: a three-level heading structure (title → sections → subsections) is produced.
- Report writing: the LLM writes each section using a citation table that maps [N] references to source details.
- Post-processing: inline [N] markers are converted to clickable [[N]](#ref-N) anchor links, and invalid references are removed.
- References section: each citation gets an academic-style entry with a collapsible source detail block.
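The post-processing step can be illustrated with a short regular-expression sketch. The function name `link_citations` and the `valid_ids` parameter are hypothetical, and the actual ReportingAgent may handle more edge cases; this only shows the conversion of [N] markers and the removal of markers that have no matching reference.

```python
import re

# Match a bare [N] marker, plus one optional leading space, while skipping
# markers that are already linked ([[N]](#ref-N)) or part of a Markdown link.
_MARKER = re.compile(r"( ?)(?<!\[)\[(\d+)\](?![\]\(])")


def link_citations(text: str, valid_ids: set) -> str:
    """Turn [N] into [[N]](#ref-N) and drop markers whose N is unknown."""

    def replace(match: re.Match) -> str:
        space, number = match.group(1), int(match.group(2))
        if number not in valid_ids:
            return ""  # invalid reference: remove it and its leading space
        return f"{space}[[{number}]](#ref-{number})"

    return _MARKER.sub(replace, text)


report = link_citations("Result A [1] and result B [7].", {1})
```

The lookbehind and lookahead make the function safe to run twice: markers that were already converted are left alone.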
Tool integrations
The ResearchAgent selects tools dynamically based on iteration phase and research gaps.
| Tool | Type | Use case |
|---|---|---|
| rag_hybrid | RAG | Comprehensive knowledge retrieval from your knowledge base |
| rag_naive | RAG | Basic vector search against your knowledge base |
| query_item | Entity lookup | Retrieve a specific entity by ID (e.g., Theorem 3.1) |
| paper_search | External | Academic paper search (limited to the past 3 years by default) |
| web_search | External | Real-time web results |
| run_code | Code execution | Calculations and data analysis |
web_search has two switches, and both must be enabled for web search to run: tools.web_search.enabled (global) and research.researching.enable_web_search (module-level).
Citation system
Every source gets a unique ID issued by CitationManager:
- PLAN-01, PLAN-02, …: RAG queries run during the planning phase.
- CIT-1-01, CIT-2-03, …: tool calls during research. The first number is the block index; the second is a per-block sequence counter.
In the final report, citations render as [[1]](#ref-1) anchor links that jump to the references section. The citation registry is persisted to citations.json so sessions can be resumed.
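The ID scheme, together with the thread safety that parallel mode requires, can be sketched as below. `CitationRegistry` is a hypothetical stand-in for CitationManager; its method names and the JSON layout are assumptions, and the real structure of citations.json may differ.

```python
import json
import threading
from collections import defaultdict


class CitationRegistry:
    """Hypothetical stand-in for CitationManager."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._plan_count = 0
        self._block_counts = defaultdict(int)  # block index -> last sequence number
        self.entries = {}  # citation ID -> source description

    def new_plan_id(self, source: str) -> str:
        # IDs for RAG queries issued during the planning phase.
        with self._lock:
            self._plan_count += 1
            citation_id = f"PLAN-{self._plan_count:02d}"
            self.entries[citation_id] = source
            return citation_id

    def new_research_id(self, block_index: int, source: str) -> str:
        # IDs for tool calls; the counter is kept per block.
        with self._lock:
            self._block_counts[block_index] += 1
            sequence = self._block_counts[block_index]
            citation_id = f"CIT-{block_index}-{sequence:02d}"
            self.entries[citation_id] = source
            return citation_id

    def dumps(self) -> str:
        # Serialized form that could be written to citations.json (layout assumed).
        with self._lock:
            return json.dumps(self.entries, indent=2, sort_keys=True)
```

Holding a single lock around both the counter increment and the registry write means two concurrent research threads can never be handed the same ID.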
CLI usage
Run research directly from the terminal without the web UI. All commands must be run from the project root.
Python API
Output files
All research output is written to data/user/research/.
Configuration reference
Settings live in two files: config/main.yaml (pipeline behaviour) and config/agents.yaml (LLM parameters).
Planning phase
Researching phase
Queue
Reporting phase
RAG source
LLM parameters