Overview
The Deep Research API enables iterative, hypothesis-driven scientific research. Unlike the basic chat API, deep research runs multiple autonomous cycles of planning, execution, hypothesis generation, and reflection.
Key Capabilities
- Autonomous Iteration: Runs multiple research cycles until objectives are met
- Hypothesis Evolution: Updates hypotheses based on new evidence
- Discovery Tracking: Identifies and links novel findings to supporting evidence
- Clarification Context: Accepts pre-approved research plans from clarification sessions
- Research Modes: Semi-autonomous, fully-autonomous, or steering modes
Authentication
All deep research endpoints require authentication, using one of:
- A JWT bearer token, sent as Authorization: Bearer YOUR_JWT_TOKEN
- An API key, as an alternative to the JWT
POST /api/deep-research/start
Initiate a new deep research investigation.
Request Body
message (required): The research question or objective
conversationId: Optional conversation ID. Auto-generated if not provided.
researchMode: Research execution mode:
- semi-autonomous (default): Runs up to MAX_AUTO_ITERATIONS (default 5) iterations before asking for user feedback
- fully-autonomous: Continues until the research objectives are met or 20 iterations are reached
- steering: Runs a single iteration, then always asks for user feedback
clarificationSessionId: Optional clarification session ID containing the pre-approved plan and answers
files: Optional files to upload for analysis
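Some of the documented request errors can be caught on the client before the request is sent. The following is a minimal sketch, not part of the API: validateStartRequest and StartRequestDraft are hypothetical names, the field names are taken from the curl examples, and conversationId as a request field name is an assumption. The problem strings reuse the wording from the Error Codes section.

```typescript
// Hypothetical client-side pre-flight check; not part of the Deep Research API.
const RESEARCH_MODES: readonly string[] = [
  "semi-autonomous",
  "fully-autonomous",
  "steering",
];

// Field names follow the curl examples; the types are assumptions.
interface StartRequestDraft {
  message?: string;
  conversationId?: string;
  researchMode?: string;
  clarificationSessionId?: string;
}

// Returns a list of problems; an empty list means the draft looks sendable.
function validateStartRequest(draft: StartRequestDraft): string[] {
  const problems: string[] = [];
  if (draft.message === undefined || draft.message.trim() === "") {
    problems.push("Missing required field: message");
  }
  // researchMode is optional: the documented default is semi-autonomous.
  if (
    draft.researchMode !== undefined &&
    RESEARCH_MODES.indexOf(draft.researchMode) === -1
  ) {
    problems.push("Invalid research mode");
  }
  return problems;
}
```

A caller would send the request only when the returned list is empty. The server remains the authority on validation, so this check is a convenience rather than a guarantee.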
Response (In-Process Mode)
Returns 202 Accepted immediately. Research runs in background.
messageId: Message ID for status polling
conversationId: Conversation ID for tracking
userId: User ID for ownership verification
status: Initial status: processing
pollUrl: URL to check research progress
Response (Queue Mode)
Returns 202 Accepted with job information.
BullMQ job ID for the first iteration
Root message ID for tracking
Example: Start Research
curl -X POST https://api.bioagents.xyz/api/deep-research/start \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Investigate the molecular mechanisms by which caloric restriction extends lifespan",
    "researchMode": "semi-autonomous"
  }'
Example Response
{
  "messageId": "msg_abc123",
  "conversationId": "conv_xyz789",
  "userId": "usr_def456",
  "status": "processing",
  "pollUrl": "/api/deep-research/status/msg_abc123"
}
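The pollUrl in the response is a path, so a client has to combine it with the API host before polling. A minimal sketch of reading the start response, assuming every documented field is a string; parseStartResponse and resolvePollUrl are hypothetical helper names, and the shape of the status endpoint's own response is not covered here.

```typescript
// Shape of the in-process start response, taken from the example above.
interface StartResponse {
  messageId: string;
  conversationId: string;
  userId: string;
  status: string;
  pollUrl: string;
}

const REQUIRED_FIELDS = ["messageId", "conversationId", "userId", "status", "pollUrl"];

// Parses a response body and checks that every documented field is a string.
function parseStartResponse(body: string): StartResponse {
  const data = JSON.parse(body);
  if (typeof data !== "object" || data === null) {
    throw new Error("Start response is not a JSON object");
  }
  for (const field of REQUIRED_FIELDS) {
    if (typeof data[field] !== "string") {
      throw new Error("Start response is missing string field: " + field);
    }
  }
  return data as StartResponse;
}

// Joins the API host with the relative pollUrl, tolerating a trailing slash.
function resolvePollUrl(baseUrl: string, response: StartResponse): string {
  return baseUrl.replace(/\/+$/, "") + response.pollUrl;
}
```

With the example response and the host https://api.bioagents.xyz, resolvePollUrl yields https://api.bioagents.xyz/api/deep-research/status/msg_abc123.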
Example: With Clarification Context
curl -X POST https://api.bioagents.xyz/api/deep-research/start \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Execute the approved research plan",
    "clarificationSessionId": "clarif_abc123",
    "researchMode": "fully-autonomous"
  }'
Example: With File Upload
curl -X POST https://api.bioagents.xyz/api/deep-research/start \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -F "message=Analyze gene expression changes under caloric restriction" \
  -F "files=@gene_expression.csv" \
  -F "files=@lifespan_data.csv" \
  -F "researchMode=semi-autonomous"
Research Workflow
Each iteration follows this sequence:
- Planning: Determines tasks to execute based on current state
- Execution: Runs literature search and data analysis tasks in parallel
- Hypothesis: Generates or updates scientific hypothesis
- Reflection: Extracts insights and evolves research objectives
- Discovery: Identifies novel findings with evidence links
- Continuation: Decides whether to continue or ask user for guidance
World State
The conversation state accumulates knowledge across iterations:
currentObjective: Current research focus
evolvingObjective: Long-term research direction
currentHypothesis: Latest scientific hypothesis
keyInsights: Extracted insights from all tasks
discoveries: Novel findings with evidence attribution
methodology: Research methodology description
plan: All executed and planned tasks
suggestedNextSteps: Agent’s suggestions for next iteration
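One way to picture how the state accumulates: single-valued fields are overwritten by the latest iteration, while list fields grow. The sketch below covers only four of the fields listed above and assumes plain string types for all of them. WorldStateSketch, IterationResult, appendUnique, and applyIteration are hypothetical names, and the duplicate filtering is an illustrative choice, not documented behavior.

```typescript
// Simplified subset of the world state; all field types are assumptions.
interface WorldStateSketch {
  currentObjective: string;
  currentHypothesis: string;
  keyInsights: string[];
  discoveries: string[];
}

// What a single iteration might contribute (hypothetical shape).
interface IterationResult {
  currentObjective?: string;
  currentHypothesis?: string;
  keyInsights?: string[];
  discoveries?: string[];
}

// Appends incoming items that are not already present, preserving order.
function appendUnique(existing: string[], incoming: string[] = []): string[] {
  const merged = existing.slice();
  for (const item of incoming) {
    if (merged.indexOf(item) === -1) {
      merged.push(item);
    }
  }
  return merged;
}

// Returns a new state: scalars are replaced when provided, lists accumulate.
function applyIteration(state: WorldStateSketch, result: IterationResult): WorldStateSketch {
  return {
    currentObjective:
      result.currentObjective !== undefined ? result.currentObjective : state.currentObjective,
    currentHypothesis:
      result.currentHypothesis !== undefined ? result.currentHypothesis : state.currentHypothesis,
    keyInsights: appendUnique(state.keyInsights, result.keyInsights),
    discoveries: appendUnique(state.discoveries, result.discoveries),
  };
}
```

Returning a new object instead of mutating the input keeps each iteration's snapshot available for comparison.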
Deduplication
The API prevents duplicate research runs for the same conversation:
- If an active run exists, returns the existing messageId and pollUrl
- Uses mutex locks to prevent race conditions
- Response includes deduplicated: true flag
Example: Deduplicated Response
{
  "messageId": "msg_existing",
  "conversationId": "conv_xyz789",
  "userId": "usr_def456",
  "status": "processing",
  "pollUrl": "/api/deep-research/status/msg_existing",
  "deduplicated": true
}
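The deduplication behavior can be illustrated with a small in-memory registry keyed by conversation ID. This is a sketch under stated assumptions, not the server's implementation: RunRegistry, its methods, and the msg_N ID format are hypothetical, and userId is omitted. Because the check and the insert happen in one synchronous step, this single-process sketch needs no explicit lock; a server spread across several workers would need a shared lock such as the mutex described above.

```typescript
// Subset of the start response relevant to deduplication.
interface StartResult {
  messageId: string;
  conversationId: string;
  status: "processing";
  pollUrl: string;
  deduplicated?: true;
}

// Hypothetical in-memory registry of active runs, keyed by conversation ID.
class RunRegistry {
  private active: { [conversationId: string]: string } = {};
  private nextId = 1;

  // Starts a run, or returns the existing one flagged as deduplicated.
  start(conversationId: string): StartResult {
    const known = Object.prototype.hasOwnProperty.call(this.active, conversationId);
    const existing = known ? this.active[conversationId] : undefined;
    if (existing !== undefined) {
      return {
        messageId: existing,
        conversationId: conversationId,
        status: "processing",
        pollUrl: "/api/deep-research/status/" + existing,
        deduplicated: true,
      };
    }
    const messageId = "msg_" + this.nextId++; // made-up ID format
    this.active[conversationId] = messageId;
    return {
      messageId: messageId,
      conversationId: conversationId,
      status: "processing",
      pollUrl: "/api/deep-research/status/" + messageId,
    };
  }

  // Marks the run as finished so that a later start creates a new run.
  finish(conversationId: string): void {
    delete this.active[conversationId];
  }
}
```

Calling start twice for the same conversation returns the same messageId, with deduplicated set on the second result only.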
Research Modes
Semi-Autonomous (Default)
Runs up to MAX_AUTO_ITERATIONS (default 5) before asking for user feedback.
{
  "message": "Research question here",
  "researchMode": "semi-autonomous"
}
Fully Autonomous
Continues iterating until the research objectives are met or 20 iterations are reached.
{
  "message": "Research question here",
  "researchMode": "fully-autonomous"
}
Steering Mode
Executes a single iteration and always returns control to the user.
{
  "message": "Research question here",
  "researchMode": "steering"
}
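The three modes differ mainly in what happens after each iteration. The sketch below expresses that decision as a pure function. decideNextStep and the NextStep labels are hypothetical names; the limits 5 and 20 come from the descriptions above. Stopping early in semi-autonomous mode once objectives are met is an assumption, since only the iteration cap is documented for that mode.

```typescript
type ResearchMode = "semi-autonomous" | "fully-autonomous" | "steering";
type NextStep = "continue" | "ask-user" | "stop";

const DEFAULT_MAX_AUTO_ITERATIONS = 5; // documented default for MAX_AUTO_ITERATIONS
const FULLY_AUTONOMOUS_LIMIT = 20; // documented cap for fully-autonomous mode

// Decides what happens after an iteration completes.
function decideNextStep(
  mode: ResearchMode,
  completedIterations: number,
  objectivesMet: boolean,
  maxAutoIterations: number = DEFAULT_MAX_AUTO_ITERATIONS
): NextStep {
  // Steering always hands control back after a single iteration.
  if (mode === "steering") {
    return "ask-user";
  }
  // Assumption: both autonomous modes stop once objectives are met.
  if (objectivesMet) {
    return "stop";
  }
  if (mode === "semi-autonomous") {
    return completedIterations >= maxAutoIterations ? "ask-user" : "continue";
  }
  // fully-autonomous: keep going until the hard iteration cap.
  return completedIterations >= FULLY_AUTONOMOUS_LIMIT ? "stop" : "continue";
}
```

For example, decideNextStep("semi-autonomous", 5, false) yields "ask-user", while decideNextStep("fully-autonomous", 5, false) yields "continue".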
Clarification Integration
When starting research with a clarification session:
- User completes pre-research clarification flow
- Clarification session contains approved plan with initial tasks
- Pass clarificationSessionId to skip planning on the first iteration
- Initial tasks are executed with resolved dataset references
Clarification Context Structure
{
  sessionId: string;
  refinedObjective: string;
  questionsAndAnswers: Array<{
    question: string;
    answer: string;
  }>;
  initialTasks?: Array<{
    objective: string;
    type: "LITERATURE" | "ANALYSIS";
    datasetFilenames: string[];
  }>;
}
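The effect of a clarification context on the first iteration can be sketched as follows. The interface names ClarificationContext, InitialTask, and FirstIterationPlan, and the function planFirstIteration, are hypothetical. Two behaviors are assumptions rather than documented facts: that planning runs as usual when initialTasks is absent or empty, and that refinedObjective replaces the request message as the working objective.

```typescript
interface InitialTask {
  objective: string;
  type: "LITERATURE" | "ANALYSIS";
  datasetFilenames: string[];
}

// Mirrors the structure shown above.
interface ClarificationContext {
  sessionId: string;
  refinedObjective: string;
  questionsAndAnswers: Array<{ question: string; answer: string }>;
  initialTasks?: InitialTask[];
}

interface FirstIterationPlan {
  skipPlanning: boolean;
  objective: string;
  tasks: InitialTask[];
}

// Decides whether the first iteration can reuse the pre-approved tasks.
function planFirstIteration(message: string, context?: ClarificationContext): FirstIterationPlan {
  if (context === undefined) {
    return { skipPlanning: false, objective: message, tasks: [] };
  }
  const tasks = context.initialTasks !== undefined ? context.initialTasks : [];
  return {
    // Planning is skipped only when there are approved tasks to run.
    skipPlanning: tasks.length > 0,
    objective: context.refinedObjective,
    tasks: tasks,
  };
}
```

Resolving datasetFilenames to uploaded datasets is left out, because the resolution mechanism is not described here.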
Error Codes
- Missing required field: message
- Clarification session not approved
- Invalid research mode
- Missing or invalid authentication
- Clarification session not found or access denied
- Setup failed
- Data setup failed
- Failed to initialize deep research run
Task Types
LITERATURE Tasks
Searches scientific literature from multiple sources:
- OpenScholar: General scientific papers
- Edison/BioLit: Bioscience-specific literature
- Knowledge Base: Internal/custom knowledge
Each task captures:
objective: Search query
output: Formatted literature results with DOI citations
jobId: External service job ID for traceability
reasoning: Intermediate reasoning traces (for Edison/BioLit)
ANALYSIS Tasks
Executes data analysis on uploaded datasets:
- Supports CSV, Excel, and other tabular formats
- Runs statistical analysis and visualization
- Generates plots and figures
Each task captures:
objective: Analysis goal
datasets: Array of dataset references
output: Analysis results with interpretations
artifacts: Generated plots and files
jobId: Analysis job ID
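The two task types share some fields and differ in others, which maps naturally onto a discriminated union. The sketch below is one possible client-side model. The value types are assumptions (the source lists the fields but not their types), the type discriminator is borrowed from the clarification context structure, and LiteratureTask, AnalysisTask, ResearchTask, and summarizeTask are hypothetical names.

```typescript
// Field lists follow the task descriptions above; value types are assumed.
interface LiteratureTask {
  type: "LITERATURE";
  objective: string;
  output: string;
  jobId: string;
  reasoning?: string[]; // only captured for Edison/BioLit
}

interface AnalysisTask {
  type: "ANALYSIS";
  objective: string;
  datasets: string[];
  output: string;
  artifacts: string[];
  jobId: string;
}

type ResearchTask = LiteratureTask | AnalysisTask;

// Produces a one-line summary; the switch narrows the union by task type.
function summarizeTask(task: ResearchTask): string {
  switch (task.type) {
    case "LITERATURE":
      return "LITERATURE: " + task.objective + " (job " + task.jobId + ")";
    case "ANALYSIS":
      return (
        "ANALYSIS: " + task.objective +
        " (" + task.datasets.length + " datasets, " +
        task.artifacts.length + " artifacts, job " + task.jobId + ")"
      );
  }
}
```

Because the switch covers every member of the union, the compiler would flag summarizeTask if a third task type were added without a matching case.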