
Overview

The Content Update API allows you to index documents, manage data sources, and keep your Khoj knowledge base synchronized with your files.

Update Index

Force a reindex of your content.
cURL
curl "https://app.khoj.dev/api/update?force=true" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
  • t (string) - Content type to update. Options: all, org, markdown, pdf, plaintext, image, docx, github, notion
  • force (boolean, default: false) - Force complete regeneration of the index
Response
{
  "status": "ok",
  "message": "khoj reloaded"
}
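The same call can be assembled from Python. The following is a minimal sketch using only the standard library; the helper name `build_update_request` is hypothetical, and it assumes that `t` and `force` are passed as query parameters, as in the cURL example above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://app.khoj.dev"  # host taken from the cURL example above


def build_update_request(token, content_type=None, force=False):
    """Build (but do not send) a GET request for /api/update."""
    query = {}
    if content_type is not None:
        query["t"] = content_type
    # The cURL example spells the boolean in lowercase ("true").
    query["force"] = "true" if force else "false"
    url = f"{BASE_URL}/api/update?{urlencode(query)}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


req = build_update_request("YOUR_API_TOKEN", content_type="markdown", force=True)
print(req.full_url)
```

To send the request, pass `req` to `urllib.request.urlopen` and decode the JSON body of the response.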

Upload Files

Upload and index files directly via the API.

Upload Single/Multiple Files

cURL
curl -X PUT https://app.khoj.dev/api/content \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "files=@document.pdf" \
  -F "files=@notes.md" \
  -F "t=all"
Python
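import requests

headers = {"Authorization": "Bearer YOUR_API_TOKEN"}
params = {"t": "all"}

# Open the files in a with-block so the handles are closed after the upload
with open("document.pdf", "rb") as pdf, open("notes.md", "rb") as md:
    files = [
        ("files", ("document.pdf", pdf, "application/pdf")),
        ("files", ("notes.md", md, "text/markdown")),
    ]
    response = requests.put(
        "https://app.khoj.dev/api/content",
        headers=headers,
        files=files,
        params=params,
    )

# Raise on 4xx/5xx instead of silently continuing
response.raise_for_status()
print(response.text)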
  • t (string, default: all) - Content type for uploaded files
  • files (file[], required) - Files to upload (multipart/form-data)
Response: Comma-separated list of indexed filenames

Update Existing Files

Use PATCH instead of PUT to update existing files without replacing the entire index:
curl -X PATCH https://app.khoj.dev/api/content \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "files=@document.pdf"
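The choice between PUT and PATCH can be made explicit in code. This is a sketch only: `build_upload_request` and its `replace_index` flag are hypothetical names, and the multipart body is encoded by hand with the standard library so the example has no third-party dependencies. It assumes each file is sent under the `files` form field, as in the examples above.

```python
import uuid
from urllib.request import Request

CONTENT_URL = "https://app.khoj.dev/api/content"


def build_upload_request(token, files, replace_index=False, boundary=None):
    """Build (but do not send) a multipart upload request.

    files maps a filename to a (content_bytes, mime_type) pair.
    replace_index=True selects PUT; the default selects PATCH.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, (content, mime) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{name}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    # The closing delimiter ends the multipart body.
    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return Request(
        CONTENT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="PUT" if replace_index else "PATCH",
    )


req = build_upload_request(
    "YOUR_API_TOKEN", {"notes.md": (b"# Notes\n", "text/markdown")}
)
print(req.get_method())
```

Defaulting to PATCH is a deliberate choice in this sketch: the less destructive method is used unless the caller opts in to PUT.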

Supported File Types

  • Documents: Markdown (.md), Org-mode (.org), Plain text (.txt)
  • Office: PDF (.pdf), Word (.docx)
  • Images: PNG, JPEG, WebP
  • Code: Most text-based formats
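A client can filter files against this list before uploading. The sketch below is illustrative: `split_by_support` is a hypothetical helper, the image extensions are an assumption (the list names formats, not extensions), and the "Code" category is left out because the list does not enumerate it.

```python
from pathlib import Path

# Extensions derived from the list above. The image extensions are an
# assumption; "Code: Most text-based formats" is not enumerated, so it is omitted.
SUPPORTED_EXTENSIONS = {
    ".md": "markdown",
    ".org": "org",
    ".txt": "plaintext",
    ".pdf": "pdf",
    ".docx": "docx",
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".webp": "image",
}


def split_by_support(paths):
    """Return (supported, unsupported) lists of paths, judged by extension."""
    supported, unsupported = [], []
    for path in paths:
        suffix = Path(path).suffix.lower()  # case-insensitive match
        if suffix in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported


ok, skipped = split_by_support(["notes.md", "report.PDF", "archive.zip"])
print(ok, skipped)
```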

Manage Files

List All Files

Get a list of all indexed files.
curl "https://app.khoj.dev/api/content/files?page=0" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
  • page (integer, default: 0) - Page number for pagination (10 files per page)
  • truncated (boolean, default: true) - If true, returns only the first 1000 characters of each file’s content
Response
{
  "files": [
    {
      "file_name": "notes/meeting-2024-03-05.md",
      "raw_text": "# Meeting Notes\n\nAttendees: ...",
      "updated_at": "2024-03-05 15:30:00"
    }
  ],
  "num_pages": 3
}
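Walking every page might look like the sketch below. It assumes pages are zero-indexed (the default is 0) and that `num_pages` is the total page count; `fetch_page` and `iter_files` are hypothetical helper names. The fetch function is injectable so the paging logic can be exercised without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://app.khoj.dev"


def fetch_page(token, page, truncated=True):
    """Fetch one page of /api/content/files and return the decoded JSON."""
    query = urlencode({"page": page, "truncated": str(truncated).lower()})
    req = Request(
        f"{BASE_URL}/api/content/files?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(req) as resp:
        return json.load(resp)


def iter_files(token, fetch=fetch_page):
    """Yield every file entry, requesting pages 0 .. num_pages - 1 in order."""
    page = 0
    while True:
        data = fetch(token, page)
        yield from data.get("files", [])
        page += 1
        # Stop once the reported page count has been covered.
        if page >= data.get("num_pages", 0):
            break
```

Usage would be along the lines of `for entry in iter_files("YOUR_API_TOKEN"): print(entry["file_name"])`.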

Get Single File

Retrieve a specific file by name.
curl "https://app.khoj.dev/api/content/file?file_name=notes/meeting.md" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
  • file_name (string, required) - Full path of the file to retrieve
Response
{
  "id": 123,
  "file_name": "notes/meeting.md",
  "raw_text": "Full file content here..."
}
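File names that contain slashes or spaces should be URL-encoded when placed in the query string. A small sketch, with `file_url` as a hypothetical helper name:

```python
from urllib.parse import urlencode


def file_url(file_name):
    """Return the Get Single File URL with file_name safely encoded."""
    query = urlencode({"file_name": file_name})
    return "https://app.khoj.dev/api/content/file?" + query


print(file_url("notes/meeting notes.md"))
# https://app.khoj.dev/api/content/file?file_name=notes%2Fmeeting+notes.md
```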

Delete Files

Delete a single file from the index.
curl -X DELETE "https://app.khoj.dev/api/content/file?filename=old-notes.md" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
Delete multiple files:
curl -X DELETE "https://app.khoj.dev/api/content/files" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "files": ["old-notes.md", "temp.txt"]
  }'
  • files (string[], required) - Array of filenames to delete
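The bulk delete request carries a JSON body on a DELETE call, which is easy to get wrong by hand. A sketch of how it could be built with the standard library follows; `build_bulk_delete_request` is a hypothetical name.

```python
import json
from urllib.request import Request


def build_bulk_delete_request(token, filenames):
    """Build (but do not send) a DELETE request for /api/content/files."""
    body = json.dumps({"files": list(filenames)}).encode("utf-8")
    return Request(
        "https://app.khoj.dev/api/content/files",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="DELETE",
    )


req = build_bulk_delete_request("YOUR_API_TOKEN", ["old-notes.md", "temp.txt"])
print(req.get_method(), req.data.decode("utf-8"))
```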

Content Types

Get Available Content Types

List content types currently indexed for the user.
curl "https://app.khoj.dev/api/content/types" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
Response
["all", "markdown", "pdf", "org", "github"]

Delete Content Type

Remove all content of a specific type.
curl -X DELETE "https://app.khoj.dev/api/content/type/pdf" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
content_type
string
required
Type of content to delete (e.g., “pdf”, “markdown”, “all”)

Content Sources

Get Files by Source

List files from a specific source.
curl "https://app.khoj.dev/api/content/computer" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
Available sources:
  • computer - Local files uploaded via API or desktop app
  • github - Files from connected GitHub repositories
  • notion - Pages from connected Notion workspace

Delete Content Source

Remove all content from a specific source.
curl -X DELETE "https://app.khoj.dev/api/content/source/github" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

External Integrations

GitHub Integration

Get GitHub Configuration

curl "https://app.khoj.dev/api/content/github" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

Set GitHub Configuration

Connect GitHub repositories to Khoj.
curl -X POST "https://app.khoj.dev/api/content/github" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "pat_token": "ghp_your_github_token",
    "repos": [
      {
        "name": "my-repo",
        "owner": "username",
        "branch": "main"
      }
    ]
  }'
  • pat_token (string, required) - GitHub Personal Access Token with repo read permissions
  • repos (array, required) - Array of repository configurations
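The payload can also be assembled from typed values rather than a hand-written JSON string. In this sketch, `GithubRepo` and `build_github_config_request` are hypothetical names; the field names (`name`, `owner`, `branch`) are taken from the cURL example above.

```python
import json
from dataclasses import asdict, dataclass
from urllib.request import Request


@dataclass
class GithubRepo:
    """One entry of the repos array, mirroring the cURL example."""
    name: str
    owner: str
    branch: str


def build_github_config_request(token, pat_token, repos):
    """Build (but do not send) the POST request that sets the GitHub config."""
    payload = {"pat_token": pat_token, "repos": [asdict(r) for r in repos]}
    return Request(
        "https://app.khoj.dev/api/content/github",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_github_config_request(
    "YOUR_API_TOKEN",
    "ghp_your_github_token",
    [GithubRepo(name="my-repo", owner="username", branch="main")],
)
print(req.data.decode("utf-8"))
```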

Notion Integration

Get Notion Configuration

curl "https://app.khoj.dev/api/content/notion" \
  -H "Authorization: Bearer YOUR_API_TOKEN"

Set Notion Configuration

Connect your Notion workspace.
curl -X POST "https://app.khoj.dev/api/content/notion" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "token": "secret_notion_integration_token"
  }'
  • token (string, required) - Notion Integration Token (Internal Integration)
After setting the token, Khoj will automatically sync your Notion pages in the background.

Document Conversion

Convert documents to text for processing.
curl -X POST "https://app.khoj.dev/api/content/convert" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -F "files=@document.pdf" \
  -F "files=@notes.md"
Response
[
  {
    "name": "document.pdf",
    "content": "Page 1 of document.pdf:\n\nExtracted text content...",
    "file_type": "pdf",
    "size": 15234
  }
]
Supported formats: PDF, DOCX, Markdown, Org-mode, Plain text
The maximum file size for conversion is 10 MB per file.
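Checking sizes on the client avoids sending a file that would be rejected. The sketch below assumes the limit means 10 × 1024 × 1024 bytes, which the text does not specify; `oversized` is a hypothetical helper, and its `getsize` parameter exists only so the check can be tested without real files.

```python
import os

# Assumption: the 10 MB limit is interpreted as 10 * 1024 * 1024 bytes.
MAX_CONVERT_BYTES = 10 * 1024 * 1024


def oversized(paths, limit=MAX_CONVERT_BYTES, getsize=os.path.getsize):
    """Return the paths whose size in bytes exceeds the limit."""
    return [path for path in paths if getsize(path) > limit]
```

Calling `oversized(["document.pdf", "notes.md"])` before a conversion request returns the files that should be split or compressed first.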

Index Size

Check your current index size.
curl "https://app.khoj.dev/api/content/size" \
  -H "Authorization: Bearer YOUR_API_TOKEN"
Response
{
  "indexed_data_size_in_mb": 45
}

Storage Limits

  • Free tier: 50 MB indexed content
  • Premium tier: 500 MB indexed content
These limits apply to the processed/indexed data, not the raw file sizes.
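Combining the size response with the tier limits gives the remaining headroom. A minimal sketch: the helper names and the 90 percent warning threshold are assumptions, while the tier limits come from the list above.

```python
# Tier limits in MB, from the Storage Limits list.
TIER_LIMITS_MB = {"free": 50, "premium": 500}


def remaining_mb(indexed_data_size_in_mb, tier="free"):
    """Return how many MB of indexed content remain for the tier."""
    return TIER_LIMITS_MB[tier] - indexed_data_size_in_mb


def near_limit(indexed_data_size_in_mb, tier="free", threshold_percent=90):
    """Return True once usage reaches threshold_percent of the tier limit."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return indexed_data_size_in_mb * 100 >= TIER_LIMITS_MB[tier] * threshold_percent


response = {"indexed_data_size_in_mb": 45}  # sample response from above
size = response["indexed_data_size_in_mb"]
print(remaining_mb(size), near_limit(size))  # 5 True
```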

Best Practices

Regular Updates

Call /api/update regularly to keep your index fresh, especially if files change externally.

Batch Uploads

Upload multiple files in a single request for better performance.

Use PATCH for Updates

Use PATCH instead of PUT when updating existing files to avoid reindexing everything.

Monitor Index Size

Check your index size regularly to stay within limits.

Error Handling

422 Unprocessable Entity

{
  "detail": "Audio size larger than 10Mb limit"
}
The file exceeds the size limit. Split it into smaller files or compress it.

500 Internal Server Error

{
  "detail": "Failed to update server indexed content"
}
Indexing failed. Check file format and try again.
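A client can turn these two responses into actionable messages. The sketch below covers only the status codes documented above; `describe_error` is a hypothetical helper, and it assumes error bodies are JSON objects with a `detail` field, as in the examples.

```python
import json


def describe_error(status_code, body):
    """Map a status code and JSON error body to a short suggested action."""
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, TypeError, AttributeError):
        # Body was not valid JSON, was None, or was not a JSON object.
        detail = ""
    if status_code == 422:
        action = "split or compress the file, then retry"
    elif status_code == 500:
        action = "check the file format, then retry"
    else:
        action = "no documented handling for this status"
    return f"{status_code}: {detail or 'no detail'} ({action})"


print(describe_error(500, '{"detail": "Failed to update server indexed content"}'))
```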

Next Steps

Search API

Search your indexed content

Chat API

Chat with your documents
