Overview
Platzi Viewer includes several Python utility scripts for cache management, data processing, and deployment. All scripts are located in the project root directory and require the virtual environment to be activated.
Core Scripts
rebuild_cache_drive.py
Purpose: Scans the Google Drive folder structure and builds courses_cache.json with Drive file IDs for all courses, modules, and classes.
Usage:
- Parses PlatziRoutes.md to get the expected course structure
- Lists all course folders from the Drive root (ID: 17kPqqPSheDtQ5S1HM6Qvvh2qJ7O3YADm)
- Matches course names using fuzzy matching (handles variations, accents, special characters)
- Scans each course folder for modules and class files
- Stores Drive file IDs for videos, summaries, subtitles, readings, and resources
- Generates courses_cache.json (~20MB, ~20,000 classes)
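The fuzzy-matching step above could look roughly like the following. This is a minimal sketch, not the script's actual code: the helper names `normalize` and `match_course` and the `0.85` cutoff are assumptions, and it relies only on the standard library (`unicodedata` for accent stripping, `difflib` for similarity).

```python
import difflib
import unicodedata
from typing import List, Optional


def normalize(name: str) -> str:
    """Lowercase, strip accents, and replace punctuation with spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in without_accents.lower())
    return " ".join(cleaned.split())


def match_course(expected: str, drive_folders: List[str], cutoff: float = 0.85) -> Optional[str]:
    """Return the Drive folder closest to `expected`, or None if nothing clears the cutoff.

    The cutoff value is a hypothetical choice for illustration.
    """
    by_normalized = {normalize(folder): folder for folder in drive_folders}
    hits = difflib.get_close_matches(
        normalize(expected), list(by_normalized), n=1, cutoff=cutoff
    )
    return by_normalized[hits[0]] if hits else None
```

Normalizing both sides before comparing means accent and punctuation differences never count against the similarity score; only genuine wording differences do.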
- Resume capability: Saves progress to drive_scan_progress.json, so a scan can be interrupted and resumed
- Rate limiting: Throttles API calls to stay within Google Drive API quotas (12,000 queries/minute)
- Retry logic: Automatically retries failed API calls with exponential backoff (up to 5 attempts)
- Incremental saves: Saves progress every 10 courses scanned
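The resume and incremental-save behavior can be sketched as below. The structure of drive_scan_progress.json is not documented here, so this assumes a plain mapping of course name to scan result; `load_progress`, `save_progress`, `scan_all`, and the `scan_one` callback are hypothetical names.

```python
import json
import os
import tempfile
from typing import Callable, Dict, Iterable

SAVE_EVERY = 10  # matches the "every 10 courses" figure in the list above


def load_progress(path: str) -> Dict[str, dict]:
    """Return previously saved results, or an empty dict on a fresh run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_progress(path: str, done: Dict[str, dict]) -> None:
    """Write to a temp file, then rename, so an interrupt cannot truncate the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(done, fh, ensure_ascii=False)
    os.replace(tmp_path, path)


def scan_all(courses: Iterable[str], scan_one: Callable[[str], dict], path: str) -> Dict[str, dict]:
    """Scan each course once, skipping any already recorded in the progress file."""
    done = load_progress(path)
    since_save = 0
    for name in courses:
        if name in done:
            continue  # finished in an earlier run
        done[name] = scan_one(name)
        since_save += 1
        if since_save >= SAVE_EVERY:
            save_progress(path, done)
            since_save = 0
    save_progress(path, done)
    return done
```

With this shape, an interruption loses at most the courses scanned since the last save, and rerunning the same command picks up where the file left off.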
parse_routes.py
Purpose: Parses PlatziRoutes.md into structured categories, routes, and courses.
Usage:
- Reads PlatziRoutes.md (Markdown file with course catalog)
- Extracts categories (schools) with icons
- Parses routes (learning paths) and courses
- Sanitizes folder names to match Drive naming conventions
- Returns structured JSON with hierarchy
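To illustrate the category, route, course hierarchy, here is a small sketch. The layout of PlatziRoutes.md is not shown in this page, so the format below (`# ` for a category, `## ` for a route, `- ` for a course) is purely an assumption, as is the function name `parse_routes`; icons and folder-name sanitizing are left out.

```python
import json
from typing import Dict, List, Optional


def parse_routes(markdown: str) -> List[Dict]:
    """Build a category -> route -> course hierarchy from an ASSUMED heading layout."""
    categories: List[Dict] = []
    category: Optional[Dict] = None
    route: Optional[Dict] = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            if category is None:
                continue  # a route with no enclosing category is skipped
            route = {"name": line[3:].strip(), "courses": []}
            category["routes"].append(route)
        elif line.startswith("# "):
            category = {"name": line[2:].strip(), "routes": []}
            categories.append(category)
            route = None
        elif line.startswith("- ") and route is not None:
            route["courses"].append(line[2:].strip())
    return categories


if __name__ == "__main__":
    sample = "# Desarrollo Web\n## Frontend\n- Curso de HTML\n- Curso de CSS\n"
    print(json.dumps(parse_routes(sample), ensure_ascii=False, indent=2))
```

The result is plain lists and dicts, so it serializes to JSON without any custom encoder.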
This script is called by rebuild_cache_drive.py to get the expected course structure before scanning Drive.
server.py
Purpose: HTTP server that serves the frontend and acts as a proxy to the Google Drive API.
Usage:
- Multi-threaded: Uses ThreadingHTTPServer for concurrent requests
- Drive proxy: All files served via the /drive/files/{fileId} endpoint
- Range requests: Supports HTTP 206 Partial Content for video seeking
- Progress sync: Saves/loads user progress to progress.json
- Cache reload: Automatically detects changes to courses_cache.json
- Health check: /api/health endpoint for diagnostics
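Range support is what lets a video player seek without downloading the whole file. The sketch below shows how a single-range `Range` header can be turned into the byte span for a 206 response. How server.py actually parses or forwards ranges to Drive is not described here, so `parse_range` and `content_range` are hypothetical helpers.

```python
import re
from typing import Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive (start, end) span, or None if the header is invalid
    or cannot be satisfied for a file of `size` bytes. Single ranges only."""
    match = _RANGE_RE.match(header.strip())
    if not match or size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    end = int(last) if last else size - 1
    if start >= size or start > end:
        return None
    return start, min(end, size - 1)


def content_range(start: int, end: int, size: int) -> str:
    """Value for the Content-Range header of a 206 response."""
    return f"bytes {start}-{end}/{size}"
```

A handler would respond with 206 and `Content-Range` when `parse_range` returns a span, and fall back to a full 200 response (or 416) otherwise.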
| Endpoint | Method | Description |
|---|---|---|
| /api/courses | GET | Full course cache with all class details |
| /api/bootstrap | GET | Lightweight cache (module summaries only) |
| /api/cache-meta | GET | Cache metadata (source, timestamp, stats) |
| /api/course-detail/{cat}/{route}/{course} | GET | Detailed course data by reference |
| /api/progress | GET | Load user progress |
| /api/progress | POST | Save user progress |
| /api/health | GET | Server health, Drive status, FFmpeg availability |
| /api/refresh | GET | Reload cache (localhost only) |
| /api/self-check-drive | GET | Validate all file IDs are Drive references |
| /api/video-compatible/{fileId} | GET | FFmpeg-processed video for A/V sync issues |
| /drive/files/{fileId} | GET | Stream file from Drive (supports Range headers) |
drive_service.py
Purpose: Google Drive API v3 wrapper with authentication and streaming support.
Usage:
- Service account auth: Automatic authentication using service_account.json
- Thread-safe: Uses thread-local storage for API client
- Retry logic: Exponential backoff for transient failures (up to 5 retries)
- Streaming: Returns response objects for efficient chunked downloads
- Validation: Drive ID format validation (regex: ^[A-Za-z0-9_-]{10,}$)
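Two of the behaviors listed above, ID validation and retry with exponential backoff, are sketched here. The regex is the one quoted in the list; everything else (the names `is_drive_id` and `with_retries`, the 1-second base delay, the jitter, and which exceptions count as transient) is an assumption for illustration.

```python
import random
import re
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

DRIVE_ID_RE = re.compile(r"^[A-Za-z0-9_-]{10,}$")  # pattern quoted in the list above


def is_drive_id(value: str) -> bool:
    """True if `value` has the shape of a Drive file or folder ID."""
    return DRIVE_ID_RE.fullmatch(value) is not None


def with_retries(
    call: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,  # assumed starting delay, in seconds
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying up to `max_retries` times with a doubling delay."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            # 1s, 2s, 4s, ... plus a little jitter so parallel threads spread out
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real waiting.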
- Looks for GOOGLE_SERVICE_ACCOUNT_JSON environment variable (inline JSON)
- Falls back to GOOGLE_SERVICE_ACCOUNT_FILE path
- Searches in: PyInstaller bundle, current directory, script directory
- Refreshes credentials if expired
- Creates thread-safe AuthorizedSession for HTTP requests
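The lookup order above can be sketched with the standard library alone. `resolve_credentials_source` is a hypothetical name, and the exact precedence between the file variable and the search directories is an assumption; only the two environment variable names and the service_account.json file name come from this page.

```python
import json
import os
from typing import Iterable, Mapping, Optional, Tuple, Union

Source = Tuple[str, Union[dict, str]]


def resolve_credentials_source(
    env: Optional[Mapping[str, str]] = None,
    search_dirs: Iterable[str] = (),
) -> Optional[Source]:
    """Return ("info", parsed_json) or ("file", path), or None if nothing is found.

    `search_dirs` stands in for the PyInstaller bundle directory (PyInstaller
    exposes it as sys._MEIPASS), the current directory, and the script directory.
    """
    env = os.environ if env is None else env

    # 1. Inline JSON in the environment takes priority.
    inline = env.get("GOOGLE_SERVICE_ACCOUNT_JSON")
    if inline:
        return "info", json.loads(inline)

    # 2. Explicit file path, then service_account.json in each search directory.
    candidates = []
    explicit = env.get("GOOGLE_SERVICE_ACCOUNT_FILE")
    if explicit:
        candidates.append(explicit)
    candidates.extend(os.path.join(d, "service_account.json") for d in search_dirs)

    for path in candidates:
        if os.path.isfile(path):
            return "file", path
    return None
```

With google-auth installed, an "info" result would typically go to `service_account.Credentials.from_service_account_info` and a "file" result to `from_service_account_file`.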
Utility Scripts
check_remaining.py
Purpose: Compares DriveCourses.md and PlatziRoutes.md to find missing courses.
Usage:
desktop_app.py
Purpose: Desktop application entry point using PyQt6 or pywebview.
Usage:
- GPU acceleration: Configures Chromium flags for hardware video decoding
- Auto-port selection: Finds free port if default is occupied
- Persistent storage: Saves localStorage/cookies to PlatziData/ folder
- Embedded server: Launches server.py in background thread
- Multiple backends: PyQt6 WebEngine or pywebview
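Auto-port selection is commonly done by trying the preferred port and, if it is taken, asking the OS for any free one. The sketch below shows that approach; the function names are hypothetical and this page does not state the default port, so the caller supplies it.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port; a failed bind means something else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True


def pick_port(preferred: int, host: str = "127.0.0.1") -> int:
    """Return `preferred` if available, otherwise a free port chosen by the OS."""
    if port_is_free(preferred, host):
        return preferred
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((host, 0))  # port 0 asks the OS to assign a free port
        return probe.getsockname()[1]
```

There is a small window between picking the port and the server binding it, so a robust launcher would still handle a bind failure at startup.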
check_drive_runtime.py
Purpose: Diagnostic tool to verify Drive service initialization.
Usage:
- Service account file existence and validity
- Credentials loading and authentication
- Drive API connectivity
- File listing capabilities
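The first check in the list, file existence and validity, needs no network access and can be sketched as follows. `check_service_account_file` is a hypothetical helper, not the script's own code; the keys it looks for (`type`, `client_email`, `private_key`) are fields found in Google service account key files.

```python
import json
import os
from typing import List

REQUIRED_KEYS = ("type", "client_email", "private_key")


def check_service_account_file(path: str) -> List[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, json.JSONDecodeError) as exc:
        return [f"cannot read JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if not data.get(key)]
    if data.get("type") and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']}")
    return problems
```

The remaining checks (authentication, connectivity, listing) require real credentials and a live call to the Drive API, so they are not covered by this sketch.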
Build Scripts (PowerShell)
build_portable_exe.ps1
Purpose: Creates standalone Windows executable using PyInstaller.
Usage: dist/PlatziViewer/PlatziViewer.exe
What’s included:
- All Python scripts and dependencies
- Frontend files (HTML, CSS, JS)
- Icon (favicon.ico)
- Optionally: service_account.json, courses_cache.json
build_desktop_exe.ps1
Purpose: Creates desktop application with embedded browser window.
Usage: dist/PlatziViewerDesktop.exe
Difference from portable:
- Uses desktop_app.py as entry point
- Includes PyQt6 WebEngine dependencies
- Opens as native application window (not browser tab)
- GPU-accelerated video playback
Script Dependencies
All scripts require:
- google-api-python-client: Drive API client
- google-auth: Service account authentication
- google-auth-httplib2: HTTP transport
Build and desktop scripts also use:
- pyinstaller: Executable packaging
- PyQt6 + PyQt6-WebEngine: Desktop UI
- pywebview: Alternative lightweight UI
Common Tasks
Rebuild cache after adding courses
Update PlatziRoutes.md and refresh
Test Drive connectivity
Build for distribution
Force cache reload in running server
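One way to trigger the reload is to call the /api/refresh endpoint from the same machine, since it accepts localhost requests only. This is a minimal sketch: the port is an assumption you must supply, the helper names are hypothetical, and the response body format is not documented here, so it is returned as raw text.

```python
import urllib.request
from typing import Tuple


def refresh_url(port: int, host: str = "127.0.0.1") -> str:
    """Build the URL of the cache-reload endpoint listed in the endpoint table."""
    return f"http://{host}:{port}/api/refresh"


def refresh_cache(port: int, timeout: float = 10.0) -> Tuple[int, str]:
    """Send the GET request and return (HTTP status, raw response body)."""
    with urllib.request.urlopen(refresh_url(port), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # 8080 is a placeholder; use whatever port your server is listening on.
    status, body = refresh_cache(8080)
    print(status, body)
```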
Troubleshooting
“Drive service not available”
Check:
- service_account.json exists and is valid
- Drive folder is shared with service account email
- GOOGLE_SERVICE_ACCOUNT_FILE environment variable (if used)
Cache rebuild fails midway
Solution: Run the script again; it resumes from drive_scan_progress.json.
“Rate limit exceeded”
Solution: The script has built-in throttling. Wait 60 seconds and resume.
FFmpeg not found for video compatibility
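Before reinstalling anything, it can help to confirm whether FFmpeg is visible on PATH at all. The check below is a generic sketch using the standard library; how server.py itself locates FFmpeg is not described here, so treat this only as a quick diagnostic with hypothetical helper names.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg() -> Optional[str]:
    """Return the full path of the ffmpeg executable on PATH, or None."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line(path: str) -> str:
    """Run `ffmpeg -version` and return the first line of its output."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, timeout=10
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    location = find_ffmpeg()
    if location is None:
        print("ffmpeg is not on PATH")
    else:
        print(location)
        print(ffmpeg_version_line(location))
```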
Solution:
Script Architecture
Best Practice: Run rebuild_cache_drive.py weekly to sync new courses added to Drive.