Prerequisites
Before you begin, ensure you have:
- Python 3.10+ installed
- An OpenAI API key (get one here)
- FFmpeg and ImageMagick installed on your system
For detailed installation instructions for all platforms, see the Installation Guide.
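The prerequisites above can be checked with a short script before the first run. This is a minimal sketch, not part of the tool: `check_prerequisites` is a hypothetical helper, and it assumes the API key is supplied through the `OPENAI_API_KEY` environment variable and that ImageMagick is exposed as either `magick` (version 7) or `convert` (version 6).

```python
import os
import shutil
import sys


def check_prerequisites(env=None):
    """Return a list of problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []

    # Python 3.10 or newer is required.
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")

    # FFmpeg must be reachable on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # ImageMagick 7 installs `magick`; ImageMagick 6 installs `convert`.
    # Caveat: Windows ships an unrelated `convert` utility, so this check
    # can give a false positive there.
    if shutil.which("magick") is None and shutil.which("convert") is None:
        problems.append("ImageMagick not found on PATH")

    # Assumption: the key is provided via the OPENAI_API_KEY variable.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")

    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current environment; printing the returned list shows what still needs to be installed or configured.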
Quick Setup
Generate Your First Short
Interactive Mode
Run the tool and, when prompted, enter a YouTube URL (for example, https://youtu.be/dQw4w9WgXcQ) and press Enter.
Command-Line Mode
Pass the YouTube URL directly as an argument.
Using Local Video Files
Process a local video file instead.
The Processing Workflow
Resolution Selection
You’ll see available video streams:
- Enter a number to select immediately
- Wait 5 seconds to auto-select the highest quality
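A prompt with a timed default like this one can be built from a background reader thread and a queue. The sketch below is illustrative only and is not taken from the tool's source: `read_line_with_timeout` and `select_stream` are hypothetical names, and it assumes the menu is numbered from 1 and that "highest quality" means the greatest frame height.

```python
import queue
import threading


def read_line_with_timeout(prompt, timeout_seconds, reader=input):
    """Return the line the user typed, or None if the timeout expires first."""
    answers = queue.Queue()

    def worker():
        try:
            answers.put(reader(prompt))
        except EOFError:
            answers.put("")

    # Daemon thread so a pending read never blocks interpreter exit.
    threading.Thread(target=worker, daemon=True).start()
    try:
        return answers.get(timeout=timeout_seconds)
    except queue.Empty:
        return None


def select_stream(heights, answer):
    """Pick a stream index from a 1-based answer; default to the tallest stream."""
    best = max(range(len(heights)), key=lambda i: heights[i])
    try:
        index = int(answer) - 1
    except (TypeError, ValueError):
        # No answer (timeout) or non-numeric input: use the default.
        return best
    return index if 0 <= index < len(heights) else best
```

For example, `select_stream([360, 1080, 720], read_line_with_timeout("Stream number: ", 5))` returns the chosen index, or the index of the 1080-pixel stream if nothing is typed within 5 seconds.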
Transcription & AI Analysis
The tool will:
- Extract and transcribe audio (about 30 seconds for a 5-minute video with a GPU)
- Analyze the transcript with GPT-4o-mini to find engaging segments
Review & Approve Selection
The AI will present its selection.
Your options:
- Press Enter or y to approve
- Press r to regenerate (can repeat multiple times)
- Press n to cancel
- Wait 15 seconds to auto-approve
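The options above amount to a small decision loop. The following sketch shows one way to model it; the names `Decision`, `interpret_review_input`, and `review` are hypothetical, `None` stands in for the 15-second timeout, and re-asking on unrecognized input is an assumption rather than documented behavior.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REGENERATE = "regenerate"
    CANCEL = "cancel"
    INVALID = "invalid"


def interpret_review_input(answer):
    """Map raw input to a decision; None represents the auto-approve timeout."""
    if answer is None:
        return Decision.APPROVE
    key = answer.strip().lower()
    if key in ("", "y"):
        return Decision.APPROVE
    if key == "r":
        return Decision.REGENERATE
    if key == "n":
        return Decision.CANCEL
    return Decision.INVALID


def review(propose, ask):
    """Call propose() until a selection is approved (returned) or cancelled (None)."""
    selection = propose()
    while True:
        decision = interpret_review_input(ask(selection))
        if decision is Decision.APPROVE:
            return selection
        if decision is Decision.CANCEL:
            return None
        if decision is Decision.REGENERATE:
            selection = propose()
        # INVALID input (an assumption): ask again about the same selection.
```

Here `propose` would wrap the AI analysis step and `ask` would wrap a timed prompt; both are passed in as callables so the loop can be exercised without a terminal.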
Expected Output
When processing completes, you’ll get a short with:
- 9:16 aspect ratio (perfect for TikTok/Reels/Shorts)
- Stylized subtitles with Franklin Gothic font
- Smart cropping (face-centered or motion-tracked)
- Original audio from selected segment
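To make the 9:16 crop concrete, here is a rough sketch of the arithmetic for a full-height vertical window centered on a subject. It is not the tool's actual cropping code: `crop_box_9_16` is a hypothetical helper, `center_x` is assumed to come from a face detector or motion tracker, and rounding the width down to an even number is an assumption made because many video encoders require even dimensions.

```python
def crop_box_9_16(frame_width, frame_height, center_x):
    """Return (left, top, width, height) of a full-height, roughly 9:16 crop."""
    # Width of a 9:16 window using the full frame height, rounded down to an
    # even number, so the ratio is approximate rather than exact.
    crop_width = (frame_height * 9 // 16) // 2 * 2
    # A frame already narrower than 9:16 is kept at its full width.
    crop_width = min(crop_width, frame_width)
    # Center the window on the subject, then clamp it inside the frame.
    left = int(round(center_x - crop_width / 2))
    left = max(0, min(left, frame_width - crop_width))
    return left, 0, crop_width, frame_height
```

For a 1920x1080 frame with the subject at the horizontal center, `crop_box_9_16(1920, 1080, 960)` yields a 606x1080 window starting at x = 657; a subject near either edge pins the window to that edge instead of letting it leave the frame.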
Automation for Batch Processing
Skip interactive prompts for hands-free operation.
Concurrent Processing
Run multiple instances simultaneously (each gets a unique session ID).
Troubleshooting Common Issues
What’s Next?
Installation Guide
Detailed setup instructions for all platforms including Docker
Configuration
Customize subtitle styling, AI prompts, and video quality settings
