Quickstart

Prerequisites

Before you begin, ensure you have:

Python 3.10+ installed
An OpenAI API key (create one in your OpenAI account settings)
FFmpeg and ImageMagick installed on your system

For detailed installation instructions for all platforms, see the Installation Guide.
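
A quick check can confirm that the required tools are on your PATH before you continue. The sketch below is only an illustration: the helper name missing_tools is hypothetical, and it assumes ImageMagick exposes the convert command, as ImageMagick 6 does.

```shell
# Print the name of every command that is not found on PATH.
# missing_tools is a hypothetical helper, not part of the project.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools this guide relies on; "convert" is the ImageMagick 6 command name.
missing=$(missing_tools python3 ffmpeg convert)
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

This only checks that the commands exist; it does not verify that Python is version 3.10 or newer.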

Quick Setup

Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator

Install System Dependencies

sudo apt install -y ffmpeg libavdevice-dev libavfilter-dev \
  libopus-dev libvpx-dev pkg-config libsrtp2-dev imagemagick

# Fix ImageMagick security policy (required for subtitles)
sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml

Create Python Virtual Environment

python3.10 -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\Activate.ps1

Install Python Dependencies

pip install -r requirements.txt

If you don’t have an NVIDIA GPU, use requirements-cpu.txt instead, and install the CPU build of PyTorch first:

pip install torch --index-url https://download.pytorch.org/whl/cpu
pip install -r requirements-cpu.txt
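
One way to choose between the two requirements files is to probe for the NVIDIA driver utility. This is a rough heuristic, not something the project ships: the helper name pick_requirements is hypothetical, and finding nvidia-smi suggests, but does not guarantee, a working CUDA setup.

```shell
# Echo the requirements file to install, based on whether a probe command
# (nvidia-smi by default) exists on PATH. Hypothetical helper.
pick_requirements() {
  probe=${1:-nvidia-smi}
  if command -v "$probe" >/dev/null 2>&1; then
    echo "requirements.txt"
  else
    echo "requirements-cpu.txt"
  fi
}

echo "Suggested file: $(pick_requirements)"
```

If the suggestion is requirements-cpu.txt, install the CPU build of PyTorch before running pip install -r on that file.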

Configure OpenAI API Key

Create a .env file in the project root:

OPENAI_API=your_openai_api_key_here
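
A missing or empty key is an easy mistake to catch before the first run. The following sketch checks that the .env file defines a non-empty OPENAI_API value; has_openai_key is a hypothetical helper, and it cannot tell a real key from the placeholder text.

```shell
# Return success if the given env file defines a non-empty OPENAI_API value.
# has_openai_key is a hypothetical helper; the variable name is from this guide.
has_openai_key() {
  env_file=${1:-.env}
  [ -f "$env_file" ] && grep -Eq '^OPENAI_API=.+' "$env_file"
}

if has_openai_key .env; then
  echo ".env looks configured"
else
  echo ".env is missing or has no OPENAI_API value"
fi
```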

Generate Your First Short

Interactive Mode

Run the tool and enter a YouTube URL when prompted:

./run.sh

You’ll see:

Session ID: a1b2c3d4
Enter YouTube video URL or local video file path:

Paste a YouTube URL (e.g., https://youtu.be/dQw4w9WgXcQ) and press Enter.

Command-Line Mode

Pass the YouTube URL directly as an argument:

./run.sh "https://youtu.be/dQw4w9WgXcQ"

Using Local Video Files

Process a local video file instead:

./run.sh "/path/to/your/video.mp4"

The Processing Workflow

Resolution Selection

You’ll see available video streams:

Available video streams:
  0. Resolution: 1080p, Size: 45.2 MB, Type: Adaptive
  1. Resolution: 720p, Size: 28.1 MB, Type: Adaptive
  2. Resolution: 480p, Size: 15.3 MB, Type: Adaptive

Select resolution number (0-2) or wait 5s for auto-select...
Auto-selecting highest quality in 5 seconds...

Enter a number to select immediately
Wait 5 seconds to auto-select highest quality

Transcription & AI Analysis

The tool will:

Extract and transcribe audio (~30s for 5-minute video with GPU)
Analyze transcript with GPT-4o-mini to find engaging segments

Analyzing transcription to find best highlight...

Review & Approve Selection

The AI will present its selection:

============================================================
SELECTED SEGMENT DETAILS:
Time: 68s - 187s (119s duration)
============================================================

Options:
  [Enter/y] Approve and continue
  [r] Regenerate selection
  [n] Cancel

Auto-approving in 15 seconds if no input...

Your options:

Press Enter or y to approve
Press r to regenerate (can repeat multiple times)
Press n to cancel
Wait 15 seconds to auto-approve

Video Processing

The tool then runs four processing steps, printing each one as it starts:

Step 1/4: Extracting clip from original video...
Step 2/4: Cropping to vertical format (9:16)...
Step 3/4: Adding subtitles to video...
Step 4/4: Adding audio to final video...
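
To make steps 1 and 2 more concrete, the sketch below computes a 9:16 crop width for a given source height and prints an FFmpeg command for the 68s to 187s segment from the example above. It is a simplified stand-in, not the tool's implementation: it relies on the crop filter's default centered position, whereas the tool picks the crop region with face or motion tracking. The helper name crop_width and the file names are assumptions.

```shell
# Width of a 9:16 crop for a given source height, rounded down to an even
# number (common encoder settings require even dimensions). Hypothetical helper.
crop_width() {
  w=$(( $1 * 9 / 16 ))
  echo $(( w - w % 2 ))
}

height=1080
width=$(crop_width "$height")

# Print (rather than run) a command that cuts 119s starting at 68s and
# center-crops it; input.mp4 and clip.mp4 are placeholder names.
echo "ffmpeg -ss 68 -t 119 -i input.mp4 -vf crop=${width}:${height} -c:a copy clip.mp4"
```

For a 1080p source the computed width is 606 pixels, slightly under the exact 607.5 that 9:16 would give.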

Expected Output

When processing completes, you’ll see:

============================================================
✓ SUCCESS: my-awesome-video_a1b2c3d4_short.mp4 is ready!
============================================================

Cleaned up temporary files for session a1b2c3d4

Your vertical short will be saved in the project directory with:

9:16 aspect ratio (perfect for TikTok/Reels/Shorts)
Stylized subtitles with Franklin Gothic font
Smart cropping (face-centered or motion-tracked)
Original audio from selected segment
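
If you want to confirm the aspect ratio of a finished file, ffprobe (installed with FFmpeg) can report the video dimensions. The sketch below is a hedged example: is_9_16 and check_short are hypothetical helpers, the 2-pixel tolerance is an assumption to allow for rounding, and check_short assumes ffprobe prints a single "width,height" line for the first video stream.

```shell
# Return success if width:height is 9:16, allowing the width to be up to
# 2 pixels off because of rounding. Hypothetical helper.
is_9_16() {
  diff=$(( $1 * 16 - $2 * 9 ))
  [ "${diff#-}" -le 32 ]
}

# Read "width,height" of the first video stream with ffprobe and check it.
check_short() {
  dims=$(ffprobe -v error -select_streams v:0 \
    -show_entries stream=width,height -of csv=p=0 "$1") || return 1
  is_9_16 "${dims%%,*}" "${dims##*,}"
}

# Example with typical vertical dimensions:
is_9_16 1080 1920 && echo "1080x1920 is 9:16"
```

To check a real output, call check_short with the file name printed in the success message, for example check_short my-awesome-video_a1b2c3d4_short.mp4.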

Automation for Batch Processing

Skip interactive prompts for hands-free operation:

./run.sh --auto-approve "https://youtu.be/VIDEO_ID"

Process multiple videos from a file:

# Create urls.txt with one URL per line
echo "https://youtu.be/VIDEO1" > urls.txt
echo "https://youtu.be/VIDEO2" >> urls.txt
echo "https://youtu.be/VIDEO3" >> urls.txt

# Process all videos sequentially
xargs -a urls.txt -I{} ./run.sh --auto-approve {}

The --auto-approve flag automatically:

Selects the highest video quality
Approves the AI highlight selection after the 15s timeout

This makes it well suited to unattended runs such as overnight batch processing.
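
The xargs one-liner gives no summary of which videos failed and treats every non-blank line as a URL. A minimal sketch of an alternative loop follows. batch_run is a hypothetical helper, and it assumes run.sh exits with a non-zero status on failure, which this guide does not confirm.

```shell
# Run a command once per URL listed in a file, skipping blank lines and
# lines starting with "#". Returns success only if every run succeeded.
# Usage (not executed here): batch_run urls.txt ./run.sh --auto-approve
batch_run() {
  list=$1
  shift
  failed=0
  # Read the list on file descriptor 3 so the command keeps its own stdin.
  while IFS= read -r url <&3 || [ -n "$url" ]; do
    case $url in
      ''|'#'*) continue ;;
    esac
    if ! "$@" "$url"; then
      echo "FAILED: $url" >&2
      failed=$((failed + 1))
    fi
  done 3< "$list"
  [ "$failed" -eq 0 ]
}
```

Reading the list on a separate file descriptor keeps the loop from feeding the remaining URLs to the command's standard input.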

Concurrent Processing

Run multiple instances simultaneously (each gets a unique session ID):

./run.sh "https://youtu.be/VIDEO1" &
./run.sh "https://youtu.be/VIDEO2" &
./run.sh "/path/to/video3.mp4" &

Each instance operates independently with separate temporary files.
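
When launching jobs in the background from a script, it helps to wait for all of them and count failures. The sketch below shows one way to do that. run_concurrently and the RUNNER variable are hypothetical names. It also assumes that --auto-approve is passed so background jobs never wait for keyboard input, and that run.sh exits with a non-zero status on failure.

```shell
# Launch one background job per source, then wait and count failures.
# RUNNER is the command to run; it defaults to ./run.sh from this guide.
# Usage (not executed here):
#   run_concurrently "https://youtu.be/VIDEO1" "/path/to/video3.mp4"
run_concurrently() {
  runner=${RUNNER:-./run.sh}
  pids=""
  for src in "$@"; do
    "$runner" --auto-approve "$src" </dev/null &
    pids="$pids $!"
  done
  failures=0
  for pid in $pids; do
    wait "$pid" || failures=$((failures + 1))
  done
  echo "$failures job(s) failed"
  [ "$failures" -eq 0 ]
}
```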

Troubleshooting Common Issues

No subtitles appearing?

Ensure the ImageMagick policy allows file operations:

grep 'pattern="@\*"' /etc/ImageMagick-6/policy.xml
# Should show: rights="read|write"

If not, run:

sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
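
The policy check can also be wrapped in a small pass/fail test, which is handy in setup scripts. This is a sketch under assumptions: policy_allows_at is a hypothetical helper, and the default path is the ImageMagick 6 location used in this guide, which may differ on other versions or distributions.

```shell
# Return success if the policy file grants read|write for the "@*" pattern.
# policy_allows_at is a hypothetical helper; the default path is the one
# used in this guide.
policy_allows_at() {
  policy=${1:-/etc/ImageMagick-6/policy.xml}
  grep -q 'rights="read|write" pattern="@\*"' "$policy" 2>/dev/null
}

if policy_allows_at; then
  echo "ImageMagick policy allows @ file access"
else
  echo "Policy still blocks @ files (or the policy file was not found)"
fi
```

If you prefer to keep a copy of the original policy, running the sed command with -i.bak instead of -i writes a backup next to the edited file.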

CUDA/GPU errors?

The run.sh script automatically configures CUDA library paths. If you still encounter issues:

export LD_LIBRARY_PATH="$(find "$(pwd)/venv/lib/python3.10/site-packages/nvidia" -name "lib" -type d | paste -sd ":" -)"

Or switch to CPU-only mode (see Installation Guide).

What’s Next?

Installation Guide

Detailed setup instructions for all platforms including Docker

Configuration

Customize subtitle styling, AI prompts, and video quality settings
