
Docker Installation

The project includes Docker and Docker Compose configurations for containerized execution with NVIDIA GPU support.
Docker setup requires NVIDIA Container Toolkit for GPU-accelerated Whisper transcription. CPU-only Docker support is available but significantly slower.

Prerequisites

1. Install Docker

Ensure Docker is installed and running:
docker --version
If not installed, follow the official Docker installation guide.
2. Install Docker Compose

Docker Compose v2 is included with Docker Desktop. For Linux:
docker compose version
If needed, install from Docker Compose docs.
3. Install NVIDIA Container Toolkit (GPU only)

For CUDA-accelerated transcription:
Ubuntu/Debian
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Verify GPU access:
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
4. Create .env file

Create a .env file in the project root with your OpenAI API key:
.env
OPENAI_API=your_openai_api_key_here
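A quick sanity check after creating the file helps catch typos before the container starts; a sketch assuming the variable name OPENAI_API used throughout these docs:

```shell
# Write the key (replace the placeholder with your real key)
printf 'OPENAI_API=%s\n' "your_openai_api_key_here" > .env

# Confirm the variable is present before starting the container
grep -q '^OPENAI_API=' .env && echo ".env looks good"
```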

Dockerfile Configuration

The project uses an NVIDIA CUDA base image for GPU support.

Base Image

Dockerfile:1-2
FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
Provides:
  • CUDA 12.1.0 runtime
  • cuDNN 8 for deep learning
  • Ubuntu 22.04 base system

System Dependencies

Dockerfile:10-24
RUN apt-get update && apt-get install -y \
    python3.10 \
    python3.10-venv \
    python3-pip \
    ffmpeg \
    libavdevice-dev \
    libavfilter-dev \
    libopus-dev \
    libvpx-dev \
    pkg-config \
    libsrtp2-dev \
    imagemagick \
    git \
    wget \
    && rm -rf /var/lib/apt/lists/*
These packages support video processing, audio extraction, and format conversion, plus ImageMagick for subtitle rendering.

ImageMagick Policy Fix

Critical for subtitle rendering:
Dockerfile:27
RUN sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
Without this fix, ImageMagick will refuse to write temporary files, causing subtitle generation to fail.
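The effect of that sed rule can be previewed outside the container by running the same substitution on a sample policy line (the sample file name is illustrative; `-i` without a suffix assumes GNU sed):

```shell
# A sample of the restrictive rule shipped in ImageMagick 6's policy.xml
echo '<policy domain="path" rights="none" pattern="@*"/>' > sample-policy.xml

# The same substitution the Dockerfile applies (GNU sed syntax)
sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' sample-policy.xml

cat sample-policy.xml   # the rule now grants read|write on the @* pattern
```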

CUDA Library Path

Dockerfile:45
ENV LD_LIBRARY_PATH=/usr/local/lib/python3.10/dist-packages/nvidia/cudnn/lib:/usr/local/lib/python3.10/dist-packages/nvidia/cublas/lib:$LD_LIBRARY_PATH
Ensures Whisper can find NVIDIA CUDA libraries for GPU acceleration.

Docker Compose Configuration

The docker-compose.yml file defines the service with GPU support and volume mounts.

GPU Configuration

docker-compose.yml:10-16
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
  • driver: nvidia specifies NVIDIA GPU access
  • count: 1 allocates one GPU (change for multi-GPU setups)
  • capabilities: [gpu] enables GPU compute access

Environment Variables

docker-compose.yml:19-21
environment:
  - NVIDIA_VISIBLE_DEVICES=all
  - NVIDIA_DRIVER_CAPABILITIES=compute,utility
NVIDIA_VISIBLE_DEVICES controls which GPUs are visible to the container:
  • all: All GPUs available
  • 0: Only GPU 0
  • 0,1: GPUs 0 and 1
  • none: No GPU access (CPU-only)
NVIDIA_DRIVER_CAPABILITIES defines which driver capabilities are enabled:
  • compute: CUDA compute operations
  • utility: nvidia-smi and monitoring tools
  • graphics: Graphics rendering (not needed here)
  • video: Video encode/decode (not needed here)
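As an illustration, restricting the container to GPU 0 only could be done with a compose override (docker-compose.override.yml is a hypothetical file name here, not part of the project):

```yaml
# docker-compose.override.yml (illustrative)
services:
  youtube-shorts-generator:
    environment:
      - NVIDIA_VISIBLE_DEVICES=0               # expose only GPU 0
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
```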

Volume Mounts

docker-compose.yml:28-31
volumes:
  - ./videos:/app/videos          # Input videos
  - ./output:/app/output          # Output directory
  - ./.env:/app/.env:ro           # OpenAI API key (read-only)
Purpose: YouTube downloads and local video input
Host path: ./videos (create if it doesn’t exist)
Container path: /app/videos
Usage:
# Place local videos here
cp ~/Downloads/my-video.mp4 ./videos/

# Container downloads go here
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/ID"
Volumes persist data between container runs. Downloaded videos remain in ./videos/ for reuse.

Interactive Mode

docker-compose.yml:34-35
stdin_open: true
tty: true
Enables interactive input for:
  • YouTube URL prompts
  • Resolution selection
  • Approval workflow
Equivalent to docker run -it.

Building the Image

Build the Docker image before first use:
docker compose build
Build process:
  1. Downloads NVIDIA CUDA base image (~2GB)
  2. Installs system dependencies
  3. Fixes ImageMagick policy
  4. Installs Python packages from requirements.txt
  5. Copies application code
  6. Sets up CUDA library paths
Build time: 5-10 minutes (depending on network speed)
The image is ~6-8GB due to CUDA runtime and dependencies. Ensure sufficient disk space.
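Given the image size, it can be worth checking free disk space before the first build; a small sketch (the 10 GB threshold is an assumption derived from the sizes above):

```shell
# Free space (in KB) on the filesystem holding the current directory
avail_kb=$(df -Pk . | awk 'NR==2 {print $4}')

# Ask for roughly 10 GB before building (threshold is an assumption)
if [ "$avail_kb" -ge $((10 * 1024 * 1024)) ]; then
  echo "enough free space to build"
else
  echo "low disk space: only ${avail_kb} KB available"
fi
```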

Running with Docker Compose

Interactive Mode

Run with prompts for URL and approval:
docker compose up
You’ll see:
Session ID: 3f8a9b12
Enter YouTube video URL or local video file path:
Use docker compose up for interactive mode, not docker compose up -d (detached mode won’t show prompts).

Command-Line Mode

Process a specific video without interaction:
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"

Auto-Approve Mode

Fully automated processing:
docker compose run youtube-shorts-generator ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"

Local File Processing

Process videos from the mounted ./videos directory:
# Copy video to mounted directory
cp ~/my-video.mp4 ./videos/

# Process inside container
docker compose run youtube-shorts-generator ./run.sh "/app/videos/my-video.mp4"

Running with Docker CLI

Alternative to Docker Compose for more control:

Basic Run

docker run --rm \
  --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
Flag                             Purpose
--rm                             Remove container after exit
--gpus all                       Enable all GPUs
-v $(pwd)/.env:/app/.env:ro      Mount API key (read-only)
-v $(pwd)/videos:/app/videos     Mount videos directory
-v $(pwd)/output:/app/output     Mount output directory
-it                              Interactive mode with TTY
ai-youtube-shorts-generator      Image name

With Command-Line Arguments

docker run --rm \
  --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  ai-youtube-shorts-generator \
  ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"

CPU-Only Mode

Run without GPU (significantly slower transcription):
docker run --rm \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
CPU-only transcription can take 10-20x longer than GPU-accelerated processing.

Batch Processing with Docker

Sequential Processing

while IFS= read -r url; do
  docker compose run youtube-shorts-generator ./run.sh --auto-approve "$url"
done < urls.txt
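If urls.txt may contain comments or blank lines, it can help to clean the list first (a sketch; the filtering step is an addition, not part of the project scripts):

```shell
# Example list: one URL per line, comments and blanks allowed
printf '# batch of 2024-01-01\n\nhttps://youtu.be/VIDEO_ID_1\nhttps://youtu.be/VIDEO_ID_2\n' > urls.txt

# Strip blank lines and # comments into a cleaned list
grep -Ev '^[[:space:]]*(#|$)' urls.txt > urls.clean.txt

# Feed the cleaned list to the loop shown above:
# while IFS= read -r url; do
#   docker compose run youtube-shorts-generator ./run.sh --auto-approve "$url"
# done < urls.clean.txt
```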

Parallel Processing

cat urls.txt | xargs -P 3 -I{} \
  docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"
Running multiple Docker containers in parallel may cause GPU memory issues. Limit parallelism based on available VRAM:
  • 8GB GPU: 2-3 containers max
  • 16GB GPU: 4-5 containers max
  • 24GB+ GPU: 6+ containers
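Those limits can be folded into a rough rule of thumb. A sketch assuming ~3.5 GB of VRAM per container (the per-container budget is an estimate, not a measured figure):

```shell
# Rough ceiling on parallel containers for a given amount of VRAM (in MB),
# assuming ~3.5 GB per container (an estimate, not a measured figure)
max_parallel() {
  vram_mb=$1
  echo $(( vram_mb / 3500 ))
}

# On the host, total VRAM can be queried with:
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
max_parallel 8192    # 8 GB card  → 2
max_parallel 24576   # 24 GB card → 7
```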

Troubleshooting

GPU Not Detected

Symptom:
Could not load dynamic library 'libcudnn.so.8'
Solutions:
1. Verify NVIDIA drivers

nvidia-smi
Ensure drivers are installed on host.
2. Check Docker GPU access

docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
Verify NVIDIA Container Toolkit works.
3. Restart Docker daemon

sudo systemctl restart docker
4. Check docker-compose.yml GPU config

Ensure GPU reservation is correctly configured:
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]

ImageMagick Policy Error

Symptom:
ImageMagick security policy blocks '@' pattern
Solution: Rebuild image to apply the policy fix:
docker compose build --no-cache
The Dockerfile includes the fix on line 27:
RUN sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml

Volume Permission Issues

Symptom:
Permission denied: '/app/output/video_short.mp4'
Solution: Ensure host directories have correct permissions:
mkdir -p videos output
chmod 777 videos output
Or run container with user mapping:
docker-compose.yml
user: "${UID}:${GID}"
Then (via env, since bash treats UID as a read-only variable):
env UID="$(id -u)" GID="$(id -g)" docker compose run youtube-shorts-generator
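To confirm the mapping worked, verify that the output directory is owned by your user (GNU stat syntax; on macOS the equivalent is stat -f %u):

```shell
mkdir -p output

# The owner UID of output/ should match your own UID after a mapped run
[ "$(stat -c '%u' output)" -eq "$(id -u)" ] && echo "ownership OK"
```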

Out of Memory (OOM)

Symptom:
CUDA out of memory
Solutions:
Run fewer concurrent containers:
# Instead of -P 5
cat urls.txt | xargs -P 2 -I{} docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"

Container Exits Immediately

Symptom:
docker compose up
Exited with code 1
Check logs:
docker compose logs
Common issues:
  • Missing .env file → Create .env with OPENAI_API=your_key
  • Invalid API key → Verify key at https://platform.openai.com/api-keys
  • Missing volumes → Ensure videos/ and output/ directories exist
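Those checks can be scripted as a quick preflight before docker compose up (a sketch; the checks mirror the list above, and the API key itself is only validated at runtime):

```shell
# Preflight sketch: catch the common causes of an immediate exit
ok=1
[ -f .env ] || { echo "missing .env (needs OPENAI_API=your_key)"; ok=0; }
grep -q '^OPENAI_API=' .env 2>/dev/null || { echo "OPENAI_API not set in .env"; ok=0; }
[ -d videos ] || { echo "missing videos/ directory"; ok=0; }
[ -d output ] || { echo "missing output/ directory"; ok=0; }

if [ "$ok" -eq 1 ]; then echo "preflight passed"; else echo "fix the issues above before docker compose up"; fi
```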

Performance Optimization

Docker Build Cache

Speed up rebuilds by leveraging layer caching:
# requirements.txt copied separately for caching
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Application code copied last (changes frequently)
COPY . .
Changing Python code won’t invalidate the pip install layer.

Shared Volume for Downloads

Reuse downloaded videos across runs:
# Download once
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"

# Videos persist in ./videos/
ls -lh ./videos/

# Process again without re-downloading
docker compose run youtube-shorts-generator ./run.sh "/app/videos/video_file.mp4"

Pre-built Image

Build once, run many times:
# Build and tag
docker build -t ai-shorts:v1.0 .

# Run without rebuilding
docker run --rm --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  ai-shorts:v1.0 \
  ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"
For production deployments, consider pushing the image to Docker Hub or a private registry for faster distribution.
