Base URL
/api except the root endpoint.
API Version
Current version: 2.0.0 You can check the API version and status at the root endpoint:API Structure
The API is organized into the following sections:Single Document
Process individual PDF files with AI-powered tagging
Batch Processing
Process multiple documents with real-time progress updates
User Management
User registration, login, and token management
History & Jobs
View processing history, job details, and user statistics
Health & Status
System health checks and monitoring endpoints
Authentication
Most endpoints require JWT authentication. See the Authentication page for details on obtaining and using access tokens. Public endpoints (no authentication required):POST /api/auth/register- User registrationPOST /api/auth/login- User loginGET /api/health- Health checkGET /api/status- Status check
- All
/api/single/*endpoints - All
/api/batch/*endpoints GET /api/auth/me- Get current userPOST /api/auth/refresh- Refresh access token
Request Format
All API requests use standard HTTP methods:GET- Retrieve dataPOST- Create or process data- WebSocket - Real-time bidirectional communication
Content Types
The API supports the following content types:- JSON
- Multipart Form Data
Response Format
All API responses are JSON-formatted with consistent structure:Success Response
Error Response
HTTP Status Codes
| Status | Meaning | Description |
|---|---|---|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request parameters |
| 401 | Unauthorized | Missing or invalid authentication token |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Resource not found |
| 500 | Internal Server Error | Server-side error |
Rate Limits
The API does not currently implement rate limiting at the application level. However, your OpenRouter API key will have its own rate limits based on your plan.
- Use batch processing for multiple documents instead of sequential single requests
- Implement exponential backoff for retries
- Monitor your OpenRouter usage dashboard
CORS Configuration
The API is configured with permissive CORS settings for development:Endpoint Categories
Single Document Processing
Process individual PDF files via upload or URL:POST /api/single/process- Process single PDFGET /api/single/preview- Preview PDF from URL
Batch Processing
Process multiple documents with real-time progress:WebSocket /api/batch/ws/{job_id}- Real-time batch processingPOST /api/batch/start- Start a batch jobGET /api/batch/jobs/{job_id}/status- Get job statusPOST /api/batch/jobs/{job_id}/cancel- Cancel jobPOST /api/batch/jobs/{job_id}/pause- Pause jobPOST /api/batch/jobs/{job_id}/resume- Resume jobGET /api/batch/active- List active jobsPOST /api/batch/validate-paths- Validate file pathsGET /api/batch/template- Get CSV templatePOST /api/batch/process- Legacy batch processing
User Management
Manage user accounts and authentication:POST /api/auth/register- Register new userPOST /api/auth/login- User loginPOST /api/auth/refresh- Refresh access tokenPOST /api/auth/logout- Logout userGET /api/auth/me- Get current user
History & Jobs
View processing history and statistics:GET /api/history/jobs- List user’s jobsGET /api/history/jobs/{job_id}- Get job detailsDELETE /api/history/jobs/{job_id}- Delete jobGET /api/history/documents- List recent documentsGET /api/history/documents/{doc_id}- Get document detailsGET /api/history/documents/search- Search documentsGET /api/history/stats- Get user statistics
Health & Monitoring
Check system health and status:GET /api/health- Comprehensive health checkGET /api/status- Simple status check
WebSocket Endpoints
The API supports WebSocket connections for real-time batch processing:External APIs
The system integrates with the following external services:OpenRouter API
Endpoint:https://openrouter.ai/api/v1/chat/completions
Used for AI-powered tag generation. You must provide your own OpenRouter API key.
Get your API key at openrouter.ai/keys
OCR Engines
- Tesseract OCR: Local CLI tool for fast text extraction (Hindi + English)
- EasyOCR: Deep learning OCR for 80+ languages (automatic fallback)
Optional Integrations
- AWS S3: For batch processing from S3 buckets (requires AWS credentials)
- MinIO: Local object storage for file persistence
- PostgreSQL: User data and document history
- Redis: Job state persistence and pub/sub
SDK Examples
While there’s no official SDK, you can easily integrate with the API using standard HTTP clients:API Changelog
Version 2.0.0 (Current)
- Added user authentication with JWT tokens
- Added document history tracking
- Added WebSocket support for real-time batch processing
- Added path validation endpoint
- Added Redis for job state persistence
- Improved error handling and retry logic
Version 1.0.0
- Initial release
- Single document processing
- Legacy batch processing
- OCR support (Tesseract + EasyOCR)
- OpenRouter integration
Next Steps
Authentication Guide
Learn how to authenticate API requests
Process a Document
Start processing PDF documents
Batch Processing
Process multiple documents at once
Job History
View processing history and statistics
User Management
Create and manage user accounts