
Prerequisites

Before installing, ensure you have:
  • Python 3.10 or higher
  • OpenAI API key (from platform.openai.com/api-keys)
  • FFmpeg with development headers
  • ImageMagick for subtitle rendering
  • NVIDIA GPU with CUDA support (optional but recommended for 5-10x faster transcription)
If you don't have a compatible NVIDIA GPU, see the CPU-Only Installation section below.

Ubuntu/Debian Installation

1. Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator
2. Install System Dependencies

sudo apt install -y ffmpeg libavdevice-dev libavfilter-dev \
  libopus-dev libvpx-dev pkg-config libsrtp2-dev imagemagick
These packages provide:
  • ffmpeg: Video/audio processing
  • libavdevice-dev, libavfilter-dev: FFmpeg development libraries
  • libopus-dev, libvpx-dev: Audio/video codec support
  • pkg-config: Build configuration tool
  • libsrtp2-dev: Secure RTP protocol support
  • imagemagick: Subtitle rendering
3. Fix ImageMagick Security Policy

ImageMagick has a restrictive security policy by default that prevents subtitle rendering:
sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
This step is required on Linux for subtitles to work. Without it, subtitle generation will fail silently.
4. Create Virtual Environment

python3.10 -m venv venv
source venv/bin/activate
5. Install Python Dependencies

pip install -r requirements.txt
This installs key dependencies:
  • faster-whisper (1.0.1): GPU-accelerated speech transcription
  • torch (2.7.1): PyTorch with CUDA support
  • langchain-openai (0.3.0): GPT-4o-mini integration
  • moviepy (1.0.3): Video editing and manipulation
  • opencv-python (4.8.1.78): Face detection and cropping
  • pytubefix (9.1.1): YouTube video downloading
6. Configure Environment Variables

Create a .env file in the project root:
OPENAI_API=your_openai_api_key_here
Replace your_openai_api_key_here with your actual OpenAI API key from platform.openai.com/api-keys.
7. Verify Installation

Test that GPU acceleration is working:
python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
Should output: CUDA available: True
If it shows False, you may need to install CUDA drivers or use CPU-only mode.

macOS Installation

1. Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator
2. Install Homebrew (if not already installed)

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
3. Install System Dependencies

brew install ffmpeg imagemagick
4. Create Virtual Environment

python3.10 -m venv venv
source venv/bin/activate
5. Install Python Dependencies

pip install -r requirements.txt
macOS does not support CUDA, so transcription will run on CPU. For faster processing, consider using a cloud GPU instance or the AI Clipping API.
6. Configure Environment Variables

Create a .env file:
echo "OPENAI_API=your_openai_api_key_here" > .env

Windows Installation

1. Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator
2. Install FFmpeg

# Run PowerShell as Administrator
choco install ffmpeg -y
3. Install ImageMagick

choco install imagemagick -y
After installation, configure the security policy:
  1. Open C:\Program Files\ImageMagick-7.x.x-Q16-HDRI\config\policy.xml
  2. Find: <policy domain="path" rights="none" pattern="@*"/>
  3. Change to: <policy domain="path" rights="read|write" pattern="@*"/>
  4. Save the file
4. Create Virtual Environment

python -m venv venv
.\venv\Scripts\Activate.ps1
If you get an execution policy error, run:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
5. Install Python Dependencies

pip install -r requirements.txt
6. Configure Environment Variables

Create a .env file in the project root:
echo "OPENAI_API=your_openai_api_key_here" > .env
Note: in Windows PowerShell 5.x, > writes UTF-16 encoded files, which some .env parsers cannot read. If the key isn't picked up, create the file in a text editor and save it as UTF-8.
7. Run the Tool

On Windows, run Python directly (instead of using run.sh):
python main.py "https://youtu.be/VIDEO_ID"
Or for interactive mode:
python main.py

CPU-Only Installation

If you don’t have an NVIDIA GPU, you can run the tool in CPU-only mode. Transcription will be significantly slower (5-10x), but all features remain functional.

Ubuntu/Debian (CPU)

1. Install System Dependencies

sudo apt install -y ffmpeg libavdevice-dev libavfilter-dev \
  libopus-dev libvpx-dev pkg-config libsrtp2-dev imagemagick

sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
2. Create Virtual Environment

python3.10 -m venv venv
source venv/bin/activate
3. Install CPU PyTorch First

pip install torch --index-url https://download.pytorch.org/whl/cpu
Important: Install CPU PyTorch before installing other dependencies to avoid downloading CUDA packages.
4. Install Other Dependencies

pip install -r requirements-cpu.txt
If requirements-cpu.txt doesn’t exist, use requirements.txt but skip CUDA-related packages.
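If you need to derive a CPU-only list yourself, one approach is to filter out the CUDA-related pins before installing. This is a sketch, not project code; it assumes the CUDA-heavy packages sit on their own lines and start with prefixes like torch or nvidia- (adjust for your file):

```python
def strip_cuda_packages(lines):
    """Drop requirement lines whose package pulls in CUDA wheels.
    The prefixes below are an assumption; adjust for your file."""
    skip_prefixes = ("torch", "nvidia-")
    return [line for line in lines
            if not line.strip().lower().startswith(skip_prefixes)]

# Example with a few pins from this guide:
pins = ["torch==2.7.1", "moviepy==1.0.3", "opencv-python==4.8.1.78"]
print(strip_cuda_packages(pins))
# → ['moviepy==1.0.3', 'opencv-python==4.8.1.78']
```

Write the filtered list to a file and pip install -r it, then install CPU PyTorch separately as shown above.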
5. Verify CPU Mode

python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
Should output: CUDA available: False

Windows (CPU)

1. Install System Dependencies

choco install ffmpeg imagemagick -y
Configure ImageMagick policy as described in the Windows Installation section.
2. Create Virtual Environment

python -m venv venv
.\venv\Scripts\Activate.ps1
3. Install CPU PyTorch

pip install torch --index-url https://download.pytorch.org/whl/cpu
4. Install Other Dependencies

pip install -r requirements-cpu.txt

macOS (CPU)

macOS installation is CPU-only by default. Follow the standard macOS installation instructions.
Performance Note: CPU transcription of a 5-minute video may take 2-5 minutes compared to ~30 seconds with GPU acceleration.

Docker Installation

Docker provides a containerized environment with all dependencies pre-configured, including GPU support.

Prerequisites

1. Clone the Repository

git clone https://github.com/SamurAIGPT/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator
2. Configure Environment Variables

Create a .env file:
OPENAI_API=your_openai_api_key_here
3. Build and Run Container

docker-compose up --build
The container configuration:
  • Base image: nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
  • GPU support: Enabled via NVIDIA runtime
  • Mounts: .env file, ./videos (input), ./output (output)
  • Interactive mode: Enabled for URL input
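For reference, a docker-compose.yml matching the bullets above might look like the following. This is a sketch inferred from the description; the repository's actual compose file is authoritative:

```yaml
services:
  youtube-shorts-generator:
    build: .
    env_file: .env
    volumes:
      - ./videos:/app/videos
      - ./output:/app/output
    stdin_open: true   # interactive mode for URL input
    tty: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```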
4. Process Videos

Interactive mode:
docker-compose run youtube-shorts-generator ./run.sh
With YouTube URL:
docker-compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"
With local file:
# Place video in ./videos directory
docker-compose run youtube-shorts-generator ./run.sh "/app/videos/video.mp4"

Manual Docker Build

1. Build Image

docker build -t ai-shorts-generator .
2. Run Container

docker run --gpus all \
  -v $(pwd)/.env:/app/.env \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it ai-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"
Flags explained:
  • --gpus all: Enable GPU acceleration
  • -v $(pwd)/.env:/app/.env: Mount environment variables
  • -v $(pwd)/videos:/app/videos: Mount input videos
  • -v $(pwd)/output:/app/output: Mount output directory
  • -it: Interactive terminal for prompts

CPU-Only Docker

Remove GPU-specific configuration:
docker run \
  -v $(pwd)/.env:/app/.env \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it ai-shorts-generator ./run.sh

Environment Variables

The tool requires the following environment variable:
| Variable | Required | Description | Example |
| --- | --- | --- | --- |
| OPENAI_API | Yes | OpenAI API key for GPT-4o-mini | sk-proj-... |
Configuration file: .env in project root
OPENAI_API=sk-proj-1234567890abcdefghijklmnopqrstuvwxyz
Security: Never commit your .env file to version control. Add it to .gitignore.
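The .env format itself is just KEY=value lines. If you want to sanity-check the file without running the full tool, a minimal stdlib parser looks like this (a sketch; the project presumably uses a dotenv-style loader):

```python
def parse_dotenv(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

print(parse_dotenv("# API credentials\nOPENAI_API=sk-proj-test\n"))
# → {'OPENAI_API': 'sk-proj-test'}
```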

Verifying Your Installation

Test your installation with a short video:
./run.sh "https://youtu.be/dQw4w9WgXcQ"
Successful installation will:
  1. Download the video
  2. Extract and transcribe audio
  3. Analyze transcript with GPT-4o-mini
  4. Present highlight selection for approval
  5. Process and output vertical short
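Before running the full pipeline, you can also confirm the external tools are reachable with a small preflight script. This is a standard-library sketch; note that ImageMagick 7 installs a magick binary, while older installs expose convert instead:

```python
import shutil

def preflight(tools=("ffmpeg", "magick", "python3")):
    """Report whether each required external tool is on PATH."""
    return {tool: shutil.which(tool) is not None for tool in tools}

for tool, found in preflight().items():
    print(f"{tool}: {'OK' if found else 'MISSING'}")
```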

Troubleshooting

CUDA/GPU Issues

Problem: torch.cuda.is_available() returns False
Solutions:
  1. Verify NVIDIA drivers are installed:
    nvidia-smi
    
  2. Check CUDA library paths:
    export LD_LIBRARY_PATH=$(find $(pwd)/venv/lib/python3.10/site-packages/nvidia -name "lib" -type d | paste -sd ":" -)
    
    The run.sh script handles this automatically.
  3. Reinstall PyTorch with CUDA:
    pip uninstall torch
    pip install torch --index-url https://download.pytorch.org/whl/cu121
    

ImageMagick Subtitle Issues

Problem: No subtitles appear in the output video
Solution: Check the ImageMagick policy:
grep 'pattern="@\*"' /etc/ImageMagick-6/policy.xml
Should show: rights="read|write"
If not, apply the fix:
sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml

Face Detection Issues

Problem: Cropping doesn't center on faces
Causes:
  • Video needs visible faces in first 30 frames
  • Low-resolution videos have less reliable detection
  • For screen recordings, motion tracking applies automatically
Solution: Adjust face detection sensitivity in Components/FaceCrop.py:detectMultiScale:
minNeighbors=8  # Higher = fewer false positives
minSize=(30, 30)  # Minimum face size in pixels
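To see how the crop geometry works, here is a hedged sketch of the centering math (an illustration, not the project's actual implementation): given a detected face box, the vertical crop window is centered on the face and clamped so it stays inside the frame:

```python
def crop_left_edge(face_x, face_w, frame_w, crop_w):
    """Left edge of a crop window of width crop_w, centered on the
    detected face and clamped inside the frame bounds."""
    face_center = face_x + face_w // 2
    left = face_center - crop_w // 2
    return max(0, min(left, frame_w - crop_w))

# A 608px-wide 9:16 crop from a 1080p frame (608 ≈ 1080 * 9 / 16):
print(crop_left_edge(face_x=900, face_w=120, frame_w=1920, crop_w=608))
# → 656
```

If the face sits near an edge, the clamp keeps the window in frame instead of centering exactly, which is why off-center faces can still appear slightly off-center in the output.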

OpenAI API Issues

Problem: ERROR: Failed to get highlight from LLM
Causes:
  • Invalid or missing API key
  • Rate limiting
  • Network connectivity issues
  • Insufficient API credits
Solutions:
  1. Verify API key in .env file
  2. Check API usage at platform.openai.com/usage
  3. Test your API key (export OPENAI_API in your shell first, or substitute the key directly):
    curl https://api.openai.com/v1/models \
      -H "Authorization: Bearer $OPENAI_API"
    

FFmpeg Not Found

Problem: ffmpeg: command not found
Solutions:
  • Ubuntu/Debian: sudo apt install ffmpeg
  • macOS: brew install ffmpeg
  • Windows: Add FFmpeg to system PATH

Python Version Issues

Problem: Module compatibility errors
Solution: Ensure Python 3.10+ is installed:
python --version  # Should show 3.10.x or higher
If your version is older, install Python 3.10 or newer for your platform before retrying.

Next Steps

Quickstart Guide

Generate your first short in under 5 minutes

Usage Examples

Learn CLI commands and automation techniques

Configuration

Customize subtitle styling, AI prompts, and video settings

API Reference

Explore the codebase and component architecture
