Architecture
The web version consists of two services:
- Backend: FastAPI server (port 8000) with WebSocket support
- Frontend: Next.js dashboard (port 3000)
Quick start with batch file
Run the startup script
The easiest way to launch the web application on Windows is to run the provided batch file.
This script will:
- Install Python dependencies from requirements.txt
- Start the backend server on port 8000
- Start the frontend development server on port 3000
- Open your browser to http://localhost:3000
The batch file opens two separate command windows - one for the backend and one for the frontend. Keep both windows open while using the application.
Manual deployment
If you prefer to start services manually or are not on Windows, follow these steps:
Install dependencies
- fastapi - Web framework
- uvicorn - ASGI server
- websockets - Real-time communication
- playwright - Browser automation
- crawl4ai - Web scraping
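After installing, it can help to confirm that the packages listed above are importable before starting the server. The following is a minimal sketch, assuming each package's import name matches its name in the list; `missing_modules` is a hypothetical helper, not part of the project.

```python
from importlib.util import find_spec

# Import names for the packages listed above (assumed to match the pip names).
REQUIRED_MODULES = ["fastapi", "uvicorn", "websockets", "playwright", "crawl4ai"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All backend dependencies are importable.")
```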
Start the backend server
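One way to launch the backend from Python is to run Uvicorn as a module. This is a sketch under assumptions: it presumes web_server.py exposes a FastAPI instance named `app`, which this guide does not confirm, and `backend_command` and `start_backend` are hypothetical helpers.

```python
import subprocess
import sys


def backend_command(host="0.0.0.0", port=8000, app="web_server:app"):
    """Build the command line that runs the backend under Uvicorn."""
    # "web_server:app" assumes web_server.py defines a FastAPI object called `app`.
    return [sys.executable, "-m", "uvicorn", app, "--host", host, "--port", str(port)]


def start_backend():
    """Launch the backend as a child process and return the Popen handle."""
    return subprocess.Popen(backend_command())
```

Calling `start_backend()` from the project root starts the server in the background; call `.terminate()` on the returned handle to stop it.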
The server will start on http://0.0.0.0:8000 and listen for WebSocket connections at /ws.
Configuration
Backend configuration
The backend server includes several security features and configuration options:
CORS settings
web_server.py:32-38
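For orientation, a CORS setup restricted to the local dashboard might look like the sketch below. It uses FastAPI's `CORSMiddleware`; the origin list and the credential, method, and header settings are assumptions, not a copy of web_server.py, and `configure_cors` is a hypothetical helper.

```python
# Origins allowed to call the backend; the local dashboard runs on port 3000.
ALLOWED_ORIGINS = ["http://localhost:3000"]


def configure_cors(app, origins=ALLOWED_ORIGINS):
    """Attach FastAPI's CORS middleware to `app`, restricted to `origins`."""
    # Imported lazily so the constants can be inspected without FastAPI installed.
    from fastapi.middleware.cors import CORSMiddleware

    app.add_middleware(
        CORSMiddleware,
        allow_origins=list(origins),
        allow_credentials=True,   # assumed setting
        allow_methods=["*"],      # assumed setting
        allow_headers=["*"],      # assumed setting
    )
    return app
```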
PDF storage
Generated PDFs are stored in the PDF/ directory by default:
web_server.py:42-44
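A storage directory like this is typically created at startup if it does not exist. The sketch below shows one way to do that with `pathlib`; `ensure_pdf_dir` is a hypothetical helper and the real code in web_server.py may differ.

```python
from pathlib import Path


def ensure_pdf_dir(base_dir="."):
    """Create the PDF/ output directory under `base_dir` if it is missing."""
    pdf_dir = Path(base_dir) / "PDF"
    # exist_ok=True makes repeated startups safe.
    pdf_dir.mkdir(parents=True, exist_ok=True)
    return pdf_dir
```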
Rate limiting
The server includes basic DoS protection with a concurrent download limit:
web_server.py:73-74
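A concurrent download limit can be enforced with a simple in-memory counter. The sketch below illustrates the idea; the value of `MAX_DOWNLOADS` and the `DownloadLimiter` class are assumptions for illustration, not the project's actual code.

```python
MAX_DOWNLOADS = 3  # assumed value; the real limit is set in web_server.py


class DownloadLimiter:
    """Reject new downloads once `limit` of them are already running."""

    def __init__(self, limit=MAX_DOWNLOADS):
        self.limit = limit
        self.active = 0

    def try_acquire(self):
        """Reserve a slot; return False when the server is at capacity."""
        # No lock is needed inside a single asyncio event loop because
        # nothing awaits between the check and the increment.
        if self.active >= self.limit:
            return False
        self.active += 1
        return True

    def release(self):
        """Free a slot when a download finishes or is cancelled."""
        self.active = max(0, self.active - 1)
```

A handler would call `try_acquire()` before starting work and `release()` in a `finally` block so that failed downloads do not leak slots.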
Environment variables
Create a .env file in the root directory for optional configuration:
.env
WebSocket protocol
The frontend communicates with the backend via WebSocket at /ws. Here’s how the protocol works:
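To make the exchange concrete, the sketch below shows how JSON messages for such a protocol could be built and parsed. The field names (`action`, `url`, `type`) and all three helpers are hypothetical placeholders; the actual message schema is defined by web_server.py and may differ.

```python
import json


def start_message(url):
    """Serialize a hypothetical 'start download' request."""
    return json.dumps({"action": "start", "url": url})


def cancel_message():
    """Serialize a hypothetical 'cancel download' request."""
    return json.dumps({"action": "cancel"})


def parse_server_message(raw):
    """Decode a server message and return (type, full_message)."""
    message = json.loads(raw)
    # "type" is an assumed discriminator field, not a confirmed part of the protocol.
    return message.get("type", "unknown"), message
```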
Starting a download
Cancelling a download
Server responses
The server sends several types of messages:
Port configuration
| Service | Default Port | Configurable |
|---|---|---|
| Backend | 8000 | Yes (modify web_server.py:174) |
| Frontend | 3000 | Yes (Next.js config) |
web_server.py
If you change the backend port, update the WebSocket connection URL in the frontend code accordingly.
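Keeping the port in one place makes it harder for the backend and the WebSocket URL to drift apart. As a small sketch, the URL can be derived from the host and port; `websocket_url` is a hypothetical helper, and the frontend itself would do the equivalent in its own code.

```python
def websocket_url(host="localhost", port=8000, secure=False):
    """Build the backend WebSocket URL for the /ws endpoint."""
    # Use wss:// when the backend is served over HTTPS.
    scheme = "wss" if secure else "ws"
    return f"{scheme}://{host}:{port}/ws"
```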
Security features
The web server includes multiple security protections:
Path traversal prevention
web_server.py:56-59
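A common way to prevent path traversal is to resolve the requested path and confirm it still lies inside the storage directory. The sketch below illustrates that check; `safe_pdf_path` is a hypothetical helper, it requires Python 3.9+ for `is_relative_to`, and the project's own check may be implemented differently.

```python
from pathlib import Path


def safe_pdf_path(filename, pdf_dir="PDF"):
    """Resolve `filename` inside `pdf_dir`, rejecting paths that escape it."""
    root = Path(pdf_dir).resolve()
    candidate = (root / filename).resolve()
    # After resolving, "../x" and absolute paths no longer sit under root.
    if candidate == root or not candidate.is_relative_to(root):
        raise ValueError("invalid file path")
    return candidate
```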
Rate limiting
web_server.py:103-105
Error sanitization
web_server.py:154-156
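Error sanitization usually means logging the full exception on the server while sending the client only a generic message. The sketch below shows that pattern; the response shape (`type`, `message`) and the `sanitize_error` helper are assumptions for illustration.

```python
import logging

logger = logging.getLogger("web_server")


def sanitize_error(exc):
    """Log the full exception server-side and return a generic client message."""
    # The detailed text may contain file paths or internals, so it goes
    # to the log only and never into the response.
    logger.error("Download failed: %r", exc)
    return {"type": "error", "message": "An internal error occurred."}
```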
Troubleshooting
Backend won’t start
- Port already in use: Another application is using port 8000. Change the port or stop the conflicting application.
- Missing dependencies: Run pip install -r requirements.txt again.
- Playwright browsers not installed: Run playwright install chromium.
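To check whether another process already holds port 8000, a quick connection test is often enough. The sketch below uses only the standard library; `port_in_use` is a hypothetical helper.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0
```

Running `port_in_use(8000)` before starting the backend indicates whether a different port or a cleanup of the conflicting process is needed.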
Frontend can’t connect to backend
- Verify the backend is running on port 8000
- Check CORS configuration includes your frontend URL
- Ensure no firewall is blocking WebSocket connections
Downloads fail silently
- Check the backend terminal for error logs
- Verify the PDF directory is writable
- Ensure Playwright browsers have their system dependencies installed: playwright install-deps
Production considerations
- Use production ASGI server: Deploy with Gunicorn + Uvicorn workers
- Enable HTTPS: Configure SSL certificates for secure WebSocket connections (WSS)
- Update CORS policy: Restrict
allow_originsto your production domain - Set up reverse proxy: Use Nginx or Caddy to handle static files and WebSocket upgrades
- Implement authentication: Add user authentication to prevent unauthorized access
- Monitor resources: Track CPU/memory usage and adjust MAX_DOWNLOADS accordingly
- Configure logging: Set up proper logging with rotation and monitoring
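As a starting point for the Gunicorn + Uvicorn setup, Gunicorn reads its settings from a Python file. The sketch below is a possible gunicorn.conf.py; the worker count and timeout are assumptions to tune for your host, and depending on your Uvicorn version the worker class may be provided by a separate package.

```python
# gunicorn.conf.py: a sketch; tune the values for your host
import multiprocessing

# Listen on the same port the rest of this guide uses.
bind = "0.0.0.0:8000"

# Run each worker as a Uvicorn ASGI worker so WebSockets are supported.
worker_class = "uvicorn.workers.UvicornWorker"

# Assumed sizing: one worker per CPU core.
workers = multiprocessing.cpu_count()

# Assumed value: allow long-running downloads before a worker is restarted.
timeout = 120
```

It would be used with `gunicorn -c gunicorn.conf.py web_server:app`, assuming web_server.py exposes an object named `app`. If the download limit is tracked in process memory, each worker enforces its own limit, so the effective total scales with the worker count.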