DeepTutor uses two configuration layers:
  1. Environment variables (.env) — API keys, model names, provider bindings, and ports. These are the authoritative source for secrets and model selection.
  2. YAML files (config/) — system behavior, tool settings, agent LLM parameters, and module-specific options. Never put secrets here.

Environment variables

Copy .env.example to .env and fill in the values before starting DeepTutor.
cp .env.example .env

Server ports

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| BACKEND_PORT | No | 8001 | Port for the FastAPI backend server. |
| FRONTEND_PORT | No | 3782 | Port for the Next.js frontend server. |
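If the defaults clash with other services on your machine, override them in .env. The values below are illustrative, not recommendations:

```bash
# .env — move DeepTutor off the default ports
BACKEND_PORT=8080
FRONTEND_PORT=3000
```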

LLM settings

The primary language model used across all AI operations — chat, research, solving, and writing.
| Variable | Required | Example | Description |
| --- | --- | --- | --- |
| LLM_BINDING | Yes | openai | Provider identifier. See supported values below. |
| LLM_MODEL | Yes | gpt-4o | Model name as recognized by the provider API. |
| LLM_API_KEY | Yes | sk-... | API key for your LLM provider. |
| LLM_HOST | Yes | https://api.openai.com/v1 | API endpoint URL. |
| LLM_API_VERSION | No | 2024-02-15-preview | API version string. Required for Azure OpenAI only. |
Supported LLM_BINDING values: openai, azure_openai, anthropic, deepseek, openrouter, groq, together, mistral, ollama, lm_studio, vllm, llama_cpp
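A typical .env block for OpenAI might look like the following; the key is a placeholder, and the model name should be whatever your provider accepts:

```bash
# .env — example OpenAI configuration
LLM_BINDING=openai
LLM_MODEL=gpt-4o
LLM_API_KEY=sk-...
LLM_HOST=https://api.openai.com/v1
# For Azure OpenAI only, additionally set:
# LLM_API_VERSION=2024-02-15-preview
```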

Embedding settings

The embedding model powers the RAG vector store used by every retrieval operation.
| Variable | Required | Example | Description |
| --- | --- | --- | --- |
| EMBEDDING_BINDING | Yes | openai | Provider identifier. See supported values below. |
| EMBEDDING_MODEL | Yes | text-embedding-3-small | Embedding model name. |
| EMBEDDING_API_KEY | Yes | sk-... | API key for the embedding provider. |
| EMBEDDING_HOST | Yes | https://api.openai.com/v1 | Embedding API endpoint URL. |
| EMBEDDING_DIMENSION | Yes | 1536 | Vector output dimension. Must match your model exactly. |
| EMBEDDING_API_VERSION | No | 2024-02-15-preview | API version. Required for Azure OpenAI only. |
Supported EMBEDDING_BINDING values: openai, azure_openai, jina, cohere, huggingface, google, ollama, lm_studio
EMBEDDING_DIMENSION must match your model’s actual output size. Mismatched dimensions will cause knowledge base indexing to fail. Common values: text-embedding-3-small → 1536, text-embedding-3-large → 3072.
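For OpenAI embeddings, a matching .env block might be (key is a placeholder):

```bash
# .env — example embedding configuration
EMBEDDING_BINDING=openai
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_API_KEY=sk-...
EMBEDDING_HOST=https://api.openai.com/v1
EMBEDDING_DIMENSION=1536   # must match text-embedding-3-small
```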

TTS settings

Text-to-speech settings power the Co-writer’s narration feature. All TTS variables are optional.
| Variable | Required | Default | Example | Description |
| --- | --- | --- | --- | --- |
| TTS_BINDING | No | openai | | Provider: openai or azure_openai. |
| TTS_MODEL | No | tts-1 | | TTS model name. |
| TTS_API_KEY | No | | sk-... | TTS provider API key. Can be the same as LLM_API_KEY for OpenAI. |
| TTS_URL | No | | https://api.openai.com/v1 | TTS API endpoint URL. |
| TTS_VOICE | No | alloy | nova | Default voice. Options: alloy, echo, fable, onyx, nova, shimmer. |
| TTS_BINDING_API_VERSION | No | | 2024-05-01-preview | API version for Azure OpenAI TTS. |
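Putting these together, a minimal OpenAI TTS block in .env might look like this; values are illustrative, and as noted above the key can be shared with LLM_API_KEY for OpenAI:

```bash
# .env — optional TTS configuration
TTS_BINDING=openai
TTS_MODEL=tts-1
TTS_API_KEY=sk-...
TTS_URL=https://api.openai.com/v1
TTS_VOICE=nova
```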

Search settings

Web search is optional. When configured, it enables real-time web results in the solver, deep research, and co-writer modules.
| Variable | Required | Default | Example | Description |
| --- | --- | --- | --- | --- |
| SEARCH_PROVIDER | No | perplexity | tavily | Search provider. Options: perplexity, tavily, serper, jina, exa. |
| SEARCH_API_KEY | No | | pplx-... | API key for your chosen search provider. |
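For example, to enable web search via Tavily (the key below is a placeholder):

```bash
# .env — optional web search configuration
SEARCH_PROVIDER=tavily
SEARCH_API_KEY=your-search-api-key
```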

Cloud and remote access

| Variable | Required | Example | Description |
| --- | --- | --- | --- |
| NEXT_PUBLIC_API_BASE_EXTERNAL | No | https://your-server.com:8001 | Public backend URL for cloud deployments. The frontend browser uses this to reach the API. |
| NEXT_PUBLIC_API_BASE | No | http://192.168.1.100:8001 | Direct API base URL. Use this for LAN access from another device. |
When running locally with a single device, you do not need to set either of these. They are only required when the browser accessing the frontend is not on the same machine as the server.
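For example, to reach DeepTutor from a phone or laptop on the same LAN, point the frontend at the server’s LAN address (the address below is illustrative):

```bash
# .env — LAN access from another device
NEXT_PUBLIC_API_BASE=http://192.168.1.100:8001
```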

Development and HuggingFace

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| DISABLE_SSL_VERIFY | No | false | Disable SSL certificate verification. Not recommended for production. |
| HF_ENDPOINT | No | | HuggingFace mirror endpoint URL. Useful in regions where huggingface.co is restricted. |
| HF_HOME | No | | HuggingFace model cache directory. Mount this path in Docker to reuse downloads across restarts. |
| HF_HUB_OFFLINE | No | | Set to 1 to force offline mode. Requires models already present in HF_HOME. |
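As an illustration, a Docker-oriented HuggingFace setup might combine a mirror, a mounted cache, and offline mode. The mirror URL and cache path below are assumptions, not defaults:

```bash
# .env — example HuggingFace setup for restricted or offline environments
HF_ENDPOINT=https://hf-mirror.com   # example mirror; choose one reachable in your region
HF_HOME=/data/hf_cache              # mount this directory as a Docker volume
HF_HUB_OFFLINE=1                    # set only after models are already cached in HF_HOME
```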

config/main.yaml

config/main.yaml controls system behavior, paths, tool settings, and module-specific parameters. The file is read at startup; in development mode, changes take effect without a restart.

System

system:
  language: en   # "en" or "zh" — controls prompts and UI language

Paths

All paths are relative to the project root.
paths:
  user_data_dir: ./data/user
  knowledge_bases_dir: ./data/knowledge_bases
  user_log_dir: ./data/user/logs
  performance_log_dir: ./data/user/performance
  guide_output_dir: ./data/user/guide
  question_output_dir: ./data/user/question
  research_output_dir: ./data/user/research/cache
  research_reports_dir: ./data/user/research/reports
  solve_output_dir: ./data/user/solve

Tools

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| tools.rag_tool.kb_base_dir | string | "./data/knowledge_bases" | Root directory for all knowledge base data. |
| tools.rag_tool.default_kb | string | "ai_textbook" | Default knowledge base name used when no KB is explicitly selected. |
| tools.run_code.workspace | string | "./data/user/run_code_workspace" | Sandbox directory for code execution artifacts. |
| tools.web_search.enabled | boolean | true | Global switch for web search across all modules. Setting this to false disables web search everywhere, regardless of module-level settings. |
| tools.web_search.provider | string | "jina" | Active search provider. Can also be set via SEARCH_PROVIDER in .env; the .env value takes precedence. |
| tools.query_item.enabled | boolean | true | Enable the entity lookup tool (queries the knowledge graph for structured items such as definitions and theorems). |
| tools.query_item.max_results | number | 5 | Maximum number of items returned per query_item call. |
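Expressed as YAML, the dotted keys above nest under tools:. The sketch below assumes that standard nesting and shows two defaults being overridden:

```yaml
# config/main.yaml (excerpt)
tools:
  web_search:
    enabled: false      # turn off web search in every module
  query_item:
    max_results: 10     # return more items per query_item call
```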

Logging

logging:
  level: DEBUG          # DEBUG, INFO, WARNING, or ERROR
  save_to_file: true    # Write logs to data/user/logs/
  console_output: true  # Print logs to terminal

Question module

question:
  rag_query_count: 3         # Number of RAG queries used to retrieve background knowledge
  max_parallel_questions: 1  # Parallel question generation workers
  rag_mode: naive            # RAG mode for retrieval: "naive" or "hybrid"
  agents:
    retrieve:
      top_k: 30              # Number of chunks retrieved per RAG query
    generate:
      max_retries: 2         # Retry attempts per question on generation failure
    relevance_analyzer:
      enabled: true          # Run post-generation relevance analysis

Solve module

solve:
  max_solve_correction_iterations: 3  # Max self-correction cycles in the solve loop
  enable_citations: true              # Include source citations in the final answer
  save_intermediate_results: true     # Persist intermediate agent outputs to disk
  agents:
    investigate_agent:
      max_actions_per_round: 1  # Tool calls per investigation round
      max_iterations: 3         # Analysis loop iteration cap
    precision_answer_agent:
      enabled: true             # Enable final precision answer formatting step

Research module

The research module has the most configuration options due to its multi-phase pipeline.
research:
  planning:
    rephrase:
      enabled: true       # Run topic rephrasing before decomposition
      max_iterations: 3   # Max rephrasing interaction rounds
    decompose:
      enabled: true
      mode: auto          # "manual" (fixed count) or "auto" (agent decides)
      initial_subtopics: 5
      auto_max_subtopics: 8

  researching:
    max_iterations: 5          # Research iterations per topic block
    execution_mode: series     # "series" or "parallel"
    max_parallel_topics: 1     # Concurrent topics when execution_mode is "parallel"
    new_topic_min_score: 0.85  # Minimum relevance score for dynamically discovered topics
    enable_rag_naive: true
    enable_rag_hybrid: true
    enable_paper_search: true
    enable_web_search: true    # Also governed by tools.web_search.enabled
    enable_run_code: true
    tool_timeout: 60           # Seconds before a tool call times out
    tool_max_retries: 2
    paper_search_years_limit: 3  # Limit paper search to the past N years

  reporting:
    min_section_length: 800      # Minimum character count per report section
    enable_citation_list: true   # Append a references section to the report
    enable_inline_citations: false  # Insert clickable [N] links inline in the report text

  rag:
    kb_name: DE-all       # Default knowledge base for research RAG
    default_mode: hybrid  # "hybrid" or "naive"
    fallback_mode: naive

  queue:
    max_length: 5  # Maximum topics in the dynamic research queue at once

Research presets

Presets are applied on top of the base research configuration. Select a preset from the UI or CLI with --preset.
| Preset | Subtopics | Max iterations | Min section length | Description |
| --- | --- | --- | --- | --- |
| quick | 1 | 1 | 300 chars | Fast overview with minimal depth. |
| medium | 5 | 4 | 500 chars | Balanced depth for most topics. |
| deep | 8 | 7 | 800 chars | Thorough research with maximum coverage. |
| auto | Up to 8 | Up to 6 | 500 chars | Agent decides depth within configured limits. |

config/agents.yaml

agents.yaml is the single source of truth for LLM temperature and max_tokens across all agent modules. Every agent in a given module shares the same settings.
Do not hardcode temperature or max_tokens in agent code. Always modify this file to change LLM behavior.

Module parameters

| Module | temperature | max_tokens | Agents included |
| --- | --- | --- | --- |
| solve | 0.3 | 8192 | InvestigateAgent, NoteAgent, ManagerAgent, SolveAgent, ToolAgent, ResponseAgent, PrecisionAnswerAgent |
| research | 0.5 | 12000 | RephraseAgent, DecomposeAgent, ManagerAgent, ResearchAgent, NoteAgent, ReportingAgent |
| question | 0.7 | 4096 | QuestionGenerationAgent, QuestionValidationAgent |
| guide | 0.5 | 16192 | LocateAgent, InteractiveAgent, ChatAgent, SummaryAgent |
| ideagen | 0.7 | 4096 | MaterialOrganizerAgent, IdeaGenerationWorkflow |
| co_writer | 0.7 | 4096 | EditAgent |
| narrator | 0.7 | 4000 | NarratorAgent (TTS script generation; limited by TTS character constraints) |

Example: reducing research verbosity

To make research reports more concise, lower max_tokens and reduce temperature:
# config/agents.yaml
research:
  temperature: 0.3
  max_tokens: 8000

Example: more creative question generation

To generate more varied questions, raise the temperature for the question module:
# config/agents.yaml
question:
  temperature: 0.9
  max_tokens: 4096

Configuration hierarchy

When the same setting exists at multiple levels, the higher-priority source wins:
  1. Environment variables (.env) — highest priority. Override everything.
  2. config/agents.yaml — LLM parameters only.
  3. config/main.yaml — all other system and module settings.
Keep API keys and model names in .env. Keep behavioral tuning (temperatures, iteration limits, tool switches) in the YAML files. This separation makes it safe to commit your YAML files to version control without exposing secrets.
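For example, if the search provider is set in both places, the .env value wins:

```yaml
# config/main.yaml
tools:
  web_search:
    provider: jina      # ignored if SEARCH_PROVIDER is set in .env
```

With SEARCH_PROVIDER=tavily in .env, the active provider at runtime is tavily.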
