If you cannot find a solution here, search the GitHub issues or ask in the Discord community.

LLM API key errors

Symptoms: The backend starts but immediately returns 401 Unauthorized, Invalid API key, or AuthenticationError when you try to use any feature.
Causes and fixes
  1. Wrong key: Verify the key by running a test request directly against your provider’s API outside of DeepTutor.
  2. Incorrect LLM_HOST: Every provider has a different base URL. Make sure LLM_HOST matches the provider set in LLM_BINDING.
    Provider          LLM_HOST
    OpenAI            https://api.openai.com/v1
    Azure OpenAI      https://<resource>.openai.azure.com/openai/deployments/<deployment>
    DeepSeek          https://api.deepseek.com/v1
    Anthropic         https://api.anthropic.com/v1
    Ollama (local)    http://localhost:11434/v1
  3. LLM_BINDING mismatch: The LLM_BINDING value must match the provider you are using (e.g., openai, anthropic, deepseek, ollama). An incorrect binding causes the client to send requests in the wrong format.
  4. Key in wrong variable: Confirm the key is in LLM_API_KEY, not EMBEDDING_API_KEY or another variable.
Diagnostic command
# Print the LLM config lines from your .env with the key value masked
grep -E "^LLM_" .env | sed -E 's/^(LLM_API_KEY=).*/\1<hidden>/'
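The binding and host checks in the list above can also be scripted. The following is a minimal Python sketch, not part of DeepTutor: `parse_env` and `check_llm_config` are hypothetical helper names, and the host fragments are copied from the provider table above. It covers only the binding values this guide names (openai, anthropic, deepseek, ollama), so treat a warning as a hint rather than a verdict. A remote Ollama server, for example, would legitimately use a different host.

```python
# Hypothetical helper, not part of DeepTutor: flags LLM_* settings in a .env
# file that look inconsistent with the provider table in this guide.

# Host fragments taken from the provider table; only bindings named in this
# guide are listed.
EXPECTED_HOST_FRAGMENT = {
    "openai": "api.openai.com",
    "deepseek": "api.deepseek.com",
    "anthropic": "api.anthropic.com",
    "ollama": "localhost:11434",
}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing " # comment" and surrounding quotes, if any.
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def check_llm_config(env):
    """Return a list of warnings; an empty list means nothing looks off."""
    warnings = []
    binding = env.get("LLM_BINDING", "")
    host = env.get("LLM_HOST", "")
    if not env.get("LLM_API_KEY"):
        warnings.append("LLM_API_KEY is empty or missing")
    fragment = EXPECTED_HOST_FRAGMENT.get(binding)
    if fragment is None:
        warnings.append(f"LLM_BINDING={binding!r} is not covered by this check")
    elif fragment not in host:
        warnings.append(f"LLM_HOST={host!r} does not look like a {binding} endpoint")
    return warnings
```

To run it against a real configuration, read the file first, for example `check_llm_config(parse_env(open(".env").read()))`. The function never prints the key itself, only whether it is present.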

Embedding model errors

Symptoms: Knowledge base creation fails, or RAG retrieval returns empty results. You may see ConnectionError, EmbeddingError, or similar errors in the terminal.
Common causes
  • EMBEDDING_HOST points to the wrong endpoint for your provider.
  • EMBEDDING_DIMENSION does not match the actual output dimension of your model. For example, text-embedding-3-small outputs 1536 dimensions by default, while text-embedding-3-large outputs 3072.
  • You are using an OpenAI-compatible local model (e.g., via Ollama or LM Studio) but EMBEDDING_BINDING is set to openai instead of ollama or lm_studio.
Fix
Double-check all five embedding variables:
EMBEDDING_BINDING=openai
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_API_KEY=sk-xxx
EMBEDDING_HOST=https://api.openai.com/v1
EMBEDDING_DIMENSION=1536   # must match the model's actual output size
If EMBEDDING_DIMENSION is wrong, the vector store will either reject insertions or silently store incorrect embeddings, causing poor retrieval quality. You must delete and recreate any knowledge bases built with incorrect dimensions.
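One way to catch a dimension mismatch before building a knowledge base is to embed a single short string with your provider and compare the vector length against the configured value. The sketch below shows only the comparison step. `dimension_problem` is a hypothetical helper, not a DeepTutor function, and how you obtain the test vector depends on your provider's client.

```python
# Hypothetical helper: compares the length of one embedding vector with the
# configured EMBEDDING_DIMENSION. Obtaining the vector is left to the caller.

def dimension_problem(vector, configured_dimension):
    """Return a description of the mismatch, or None if the sizes agree."""
    actual = len(vector)
    if actual != configured_dimension:
        return (
            f"model returned {actual} dimensions but "
            f"EMBEDDING_DIMENSION={configured_dimension}"
        )
    return None
```

For example, a 3072-element vector from text-embedding-3-large checked against `EMBEDDING_DIMENSION=1536` would produce a message, while a 1536-element vector would return `None`.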

Knowledge base creation failures

Document parsing errors

Symptoms: Uploading a PDF shows an error, or processing stalls in the terminal.
Possible causes and fixes
  • Corrupted or password-protected PDF: DeepTutor cannot parse encrypted PDFs. Remove the password with a PDF tool before uploading.
  • Scanned PDF without OCR text layer: The default parser relies on extractable text. Use a PDF that has been OCR-processed, or enable the Docling parser (available in v0.5.2+) which handles scanned documents.
  • MinerU model not downloaded: Question generation’s mimic mode and some parsing pipelines use MinerU, which downloads models from HuggingFace on first use. If you are behind a firewall or running in an offline environment, either set HF_ENDPOINT to a mirror, or set HF_HUB_OFFLINE=1 and point HF_HOME at a pre-downloaded model cache.
    # Use a HuggingFace mirror
    HF_ENDPOINT=https://your-hf-mirror.example.com
    
    # Or force offline mode (models must already be in HF_HOME)
    HF_HUB_OFFLINE=1
    HF_HOME=/app/data/hf
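Offline mode only works if the cache is already populated, so a quick pre-flight check can save a failed run. This is a rough sketch under stated assumptions: `offline_cache_problem` is a hypothetical helper, and it checks only that the HF_HOME directory exists and is non-empty, not that the specific MinerU models are present.

```python
import os


def offline_cache_problem(env):
    """Return a warning if offline mode is on but the cache looks unusable."""
    if env.get("HF_HUB_OFFLINE") != "1":
        return None  # online mode: models can be downloaded on demand
    hf_home = env.get("HF_HOME")
    if not hf_home:
        # The library falls back to its default cache location, which this
        # sketch does not inspect.
        return None
    if not os.path.isdir(hf_home):
        return f"HF_HOME={hf_home} does not exist"
    if not os.listdir(hf_home):
        return f"HF_HOME={hf_home} is empty; no pre-downloaded models found"
    return None
```

Calling `offline_cache_problem(os.environ)` from the same environment as the backend returns `None` when nothing looks wrong, or a short message otherwise.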
    

File size and count limits

Very large documents (hundreds of pages) or batches of many files can cause the embedding pipeline to time out or exhaust memory. As a workaround:
  • Split large PDFs into smaller sections before uploading.
  • Upload documents in smaller batches rather than all at once.
  • Monitor memory usage in the terminal while processing runs.
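The batching advice can be sketched in a few lines of Python. The helper names and the default batch size of 5 are illustrative assumptions, not DeepTutor settings. The code only groups file paths, and uploading each group is left to whatever interface you use.

```python
from pathlib import Path


def batches(paths, batch_size):
    """Yield consecutive lists of at most batch_size paths, in sorted order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    paths = sorted(paths)
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


def pdf_batches(directory, batch_size=5):
    """Group the PDFs found directly inside a directory into small batches."""
    return list(batches(Path(directory).glob("*.pdf"), batch_size))
```

Processing one batch at a time, and waiting for it to finish before starting the next, keeps peak memory closer to that of a single small upload.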

uvloop.Loop error during numbered items extraction

Symptom:
ValueError: Can't patch loop of type <class 'uvloop.Loop'>
Cause: Uvicorn uses uvloop by default, and nest_asyncio can only patch the standard asyncio event loop, so the two are incompatible.
Fix: Run the extraction script directly instead of going through the API:
# Recommended: use the shell script
./scripts/extract_numbered_items.sh <kb_name>

# Alternative: call the Python module directly
python src/knowledge/extract_numbered_items.py \
  --kb <kb_name> \
  --base-dir ./data/knowledge_bases

Docker volume permissions

Symptoms: The container starts but fails to write to data/ with a PermissionError or Operation not permitted.
Cause: The data/ directory on the host may be owned by root or another user, while the process inside the container runs as a different UID.
Fix
# Quick fix for local development: make the data directory world-writable
chmod -R 777 data/

# Or, more restrictively, change ownership to match the container's user (UID 1000 is common)
sudo chown -R 1000:1000 data/
If you are using Docker Compose, you can also set the user in docker-compose.yml:
services:
  deeptutor:
    user: "1000:1000"
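To confirm which UID the process actually runs as and whether it can write to the directory, a short check run inside the container (for example through `docker exec`) can help. This is a POSIX-only sketch, and `describe_write_access` is a hypothetical helper name.

```python
import os


def describe_write_access(path):
    """Report who owns path and whether the current process can write to it."""
    info = os.stat(path)
    return {
        "process_uid": os.getuid(),
        "owner_uid": info.st_uid,
        "owner_gid": info.st_gid,
        "mode": oct(info.st_mode & 0o777),
        # Creating files in a directory needs both write and execute permission.
        "writable": os.access(path, os.W_OK | os.X_OK),
    }
```

If `process_uid` differs from `owner_uid` and `writable` is false, the ownership or `user:` fixes above are the likely remedy. Note that a process running as root reports `writable` as true regardless of the mode bits.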

Memory and resource issues during document processing

Symptoms: The process is killed (OOM), the terminal shows MemoryError, or the system becomes unresponsive while building a large knowledge base.
Recommendations
  • Building a knowledge base is the most memory-intensive operation. Ensure the host has at least 4 GB of free RAM for typical workloads.
  • Reduce the number of documents processed at the same time.
  • If running in Docker, increase the container’s memory limit:
    docker run --memory="4g" ...
    
  • For very large corpora, run the knowledge base builder via the CLI to avoid the overhead of the web server:
    python -m src.knowledge.start_kb init <kb_name> --docs <pdf_path>
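On Linux hosts, the memory currently available can be read from `/proc/meminfo` before starting a large build. The sketch below parses that file's text. `available_gib` is a hypothetical helper, and the approach does not apply to macOS or Windows.

```python
def available_gib(meminfo_text):
    """Extract MemAvailable from /proc/meminfo text and convert kB to GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the value is reported in kB
            return kib / (1024 * 1024)
    return None  # field not present (very old kernels or non-Linux input)
```

For example, `available_gib(open("/proc/meminfo").read())` returns a float that you can compare against the 4 GB guideline above.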
    

Search provider not working

Symptoms: Web search returns no results, or the research and solver modules skip web search silently.
Common causes and fixes
  1. SEARCH_API_KEY is not set: Web search is disabled unless you provide a key for your chosen provider.
    SEARCH_PROVIDER=perplexity
    SEARCH_API_KEY=pplx-xxx
    
  2. Wrong provider selected: The supported values for SEARCH_PROVIDER are perplexity, tavily, serper, jina, and exa. An unrecognized value disables web search.
  3. Global web search switch is off: The tools.web_search.enabled flag in config/main.yaml is a global override. If it is set to false, web search is disabled regardless of your API key.
    # config/main.yaml
    tools:
      web_search:
        enabled: true
    
  4. Firewall blocking outbound requests: Verify that the backend can reach the search provider’s API from the host machine.
You can test your search key outside DeepTutor with a simple curl request to the provider’s API to confirm it is valid and the network is reachable.
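The first three causes can be checked without making any network request. The following is a minimal sketch: `search_config_problems` is a hypothetical helper, the provider list is the one given above, and the `web_search_enabled` argument stands in for the `tools.web_search.enabled` value, which you would read from config/main.yaml yourself.

```python
# Provider names as listed in this guide.
SUPPORTED_PROVIDERS = {"perplexity", "tavily", "serper", "jina", "exa"}


def search_config_problems(env, web_search_enabled=True):
    """Return a list of configuration problems; empty means none were found."""
    problems = []
    provider = env.get("SEARCH_PROVIDER", "")
    if provider not in SUPPORTED_PROVIDERS:
        problems.append(f"SEARCH_PROVIDER={provider!r} is not a supported value")
    if not env.get("SEARCH_API_KEY"):
        problems.append("SEARCH_API_KEY is not set")
    if not web_search_enabled:
        problems.append("tools.web_search.enabled is false in config/main.yaml")
    return problems
```

An empty result means the configuration looks plausible, which leaves an invalid key or a blocked network as the remaining suspects.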

Chinese language support

DeepTutor has full Chinese language support as of v0.6.0. If the UI or outputs are not appearing in Chinese:
  1. Set the system language in config/main.yaml:
    system:
      language: zh
    
  2. Ensure your LLM supports Chinese output. Most general-purpose models (GPT-4o, DeepSeek, Claude) do. Smaller or quantized local models may produce lower-quality Chinese text.
  3. Restart the backend after changing the language setting.
The language setting in config/main.yaml controls the language of agent-generated content (prompts, reports, answers). The UI language is determined by your browser’s locale settings.

Node.js version compatibility

Symptom: The frontend fails to build with syntax errors or missing module errors.
Cause: DeepTutor’s frontend requires Node.js 18 or higher. Older versions do not support the required JavaScript APIs.
Diagnostic
node --version   # must be v18.x.x or higher
npm --version
Fix
conda install -c conda-forge "nodejs>=18"
After upgrading Node.js, reinstall the frontend dependencies:
npm install --prefix web
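If you want to automate the version check in a setup script, the output of `node --version` can be parsed as below. This is a small sketch with hypothetical helper names. It takes the version string as input, so capturing it (for example with `subprocess.run(["node", "--version"], capture_output=True, text=True)`) is up to the caller.

```python
import re


def node_major_version(version_output):
    """Return the major version from output such as 'v18.19.0', or None."""
    match = re.match(r"v?(\d+)\.\d+\.\d+", version_output.strip())
    return int(match.group(1)) if match else None


def node_is_supported(version_output, minimum_major=18):
    """True if the parsed major version meets the minimum required here."""
    major = node_major_version(version_output)
    return major is not None and major >= minimum_major
```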

Still stuck?

  • Browse or open an issue on GitHub.
  • Ask the community on Discord.
  • Check data/user/logs/ for detailed backend logs that may contain the root cause.
