
Docker Installation

The project includes Docker and Docker Compose configurations for containerized execution with NVIDIA GPU support.
Docker setup requires NVIDIA Container Toolkit for GPU-accelerated Whisper transcription. CPU-only Docker support is available but significantly slower.

Prerequisites

1. Install Docker

Ensure Docker is installed and running:
docker --version
If not installed, follow the official Docker installation guide.
2. Install Docker Compose

Docker Compose v2 is included with Docker Desktop. For Linux:
docker compose version
If needed, install from Docker Compose docs.
3. Install NVIDIA Container Toolkit (GPU only)

For CUDA-accelerated transcription:
Ubuntu/Debian
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Verify GPU access:
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
4. Create .env file

Create a .env file in the project root with your OpenAI API key:
.env
OPENAI_API=your_openai_api_key_here
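A quick sanity check after creating the file helps catch typos before the container starts; a sketch assuming the variable name OPENAI_API used throughout these docs:

```shell
# Write the key (replace the placeholder with your real key)
printf 'OPENAI_API=%s\n' "your_openai_api_key_here" > .env

# Confirm the variable is present before starting the container
grep -q '^OPENAI_API=' .env && echo ".env looks good"
```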

Dockerfile Configuration

The project uses an NVIDIA CUDA base image for GPU support.

Base Image

Dockerfile:1-2
FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
Provides:
  • CUDA 12.1.0 runtime
  • cuDNN 8 for deep learning
  • Ubuntu 22.04 base system

System Dependencies

Dockerfile:10-24
RUN apt-get update && apt-get install -y \
    python3.10 \
    python3.10-venv \
    python3-pip \
    ffmpeg \
    libavdevice-dev \
    libavfilter-dev \
    libopus-dev \
    libvpx-dev \
    pkg-config \
    libsrtp2-dev \
    imagemagick \
    git \
    wget \
    && rm -rf /var/lib/apt/lists/*
These packages support video processing, audio extraction, and format conversion, plus ImageMagick for subtitle rendering.

ImageMagick Policy Fix

Critical for subtitle rendering:
Dockerfile:27
RUN sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
Without this fix, ImageMagick will refuse to write temporary files, causing subtitle generation to fail.
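The effect of that sed rule can be previewed outside the container by running the same substitution on a sample policy line (the sample file name is illustrative; `-i` without a suffix assumes GNU sed):

```shell
# A sample of the restrictive rule shipped in ImageMagick 6's policy.xml
echo '<policy domain="path" rights="none" pattern="@*"/>' > sample-policy.xml

# The same substitution the Dockerfile applies (GNU sed syntax)
sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' sample-policy.xml

cat sample-policy.xml   # the rule now grants read|write on the @* pattern
```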

CUDA Library Path

Dockerfile:45
ENV LD_LIBRARY_PATH=/usr/local/lib/python3.10/dist-packages/nvidia/cudnn/lib:/usr/local/lib/python3.10/dist-packages/nvidia/cublas/lib:$LD_LIBRARY_PATH
Ensures Whisper can find NVIDIA CUDA libraries for GPU acceleration.

Docker Compose Configuration

The docker-compose.yml file defines the service with GPU support and volume mounts.

GPU Configuration

docker-compose.yml:10-16
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
  • driver: nvidia specifies NVIDIA GPU access
  • count: 1 allocates one GPU (change for multi-GPU setups)
  • capabilities: [gpu] enables GPU compute access

Environment Variables

docker-compose.yml:19-21
environment:
  - NVIDIA_VISIBLE_DEVICES=all
  - NVIDIA_DRIVER_CAPABILITIES=compute,utility
NVIDIA_VISIBLE_DEVICES controls which GPUs are visible to the container:
  • all: All GPUs available
  • 0: Only GPU 0
  • 0,1: GPUs 0 and 1
  • none: No GPU access (CPU-only)
NVIDIA_DRIVER_CAPABILITIES defines which driver capabilities are enabled:
  • compute: CUDA compute operations
  • utility: nvidia-smi and monitoring tools
  • graphics: Graphics rendering (not needed here)
  • video: Video encode/decode (not needed here)
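As an illustration, restricting the container to GPU 0 only could be done with a compose override (docker-compose.override.yml is a hypothetical file name here, not part of the project):

```yaml
# docker-compose.override.yml (illustrative)
services:
  youtube-shorts-generator:
    environment:
      - NVIDIA_VISIBLE_DEVICES=0               # expose only GPU 0
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
```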

Volume Mounts

docker-compose.yml:28-31
volumes:
  - ./videos:/app/videos          # Input videos
  - ./output:/app/output          # Output directory
  - ./.env:/app/.env:ro           # OpenAI API key (read-only)
Purpose: YouTube downloads and local video input
Host path: ./videos (create if it doesn’t exist)
Container path: /app/videos
Usage:
# Place local videos here
cp ~/Downloads/my-video.mp4 ./videos/

# Container downloads go here
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/ID"
Volumes persist data between container runs. Downloaded videos remain in ./videos/ for reuse.

Interactive Mode

docker-compose.yml:34-35
stdin_open: true
tty: true
Enables interactive input for:
  • YouTube URL prompts
  • Resolution selection
  • Approval workflow
Equivalent to docker run -it.

Building the Image

Build the Docker image before first use:
docker compose build
Build process:
  1. Downloads NVIDIA CUDA base image (~2GB)
  2. Installs system dependencies
  3. Fixes ImageMagick policy
  4. Installs Python packages from requirements.txt
  5. Copies application code
  6. Sets up CUDA library paths
Build time: 5-10 minutes (depending on network speed)
The image is ~6-8GB due to CUDA runtime and dependencies. Ensure sufficient disk space.
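Given the image size, it can be worth checking free disk space before the first build; a small sketch (the 10 GB threshold is an assumption derived from the sizes above):

```shell
# Free space (in KB) on the filesystem holding the current directory
avail_kb=$(df -Pk . | awk 'NR==2 {print $4}')

# Ask for roughly 10 GB before building (threshold is an assumption)
if [ "$avail_kb" -ge $((10 * 1024 * 1024)) ]; then
  echo "enough free space to build"
else
  echo "low disk space: only ${avail_kb} KB available"
fi
```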

Running with Docker Compose

Interactive Mode

Run with prompts for URL and approval:
docker compose up
You’ll see:
Session ID: 3f8a9b12
Enter YouTube video URL or local video file path:
Use docker compose up for interactive mode, not docker compose up -d (detached mode won’t show prompts).

Command-Line Mode

Process a specific video without interaction:
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"

Auto-Approve Mode

Fully automated processing:
docker compose run youtube-shorts-generator ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"

Local File Processing

Process videos from the mounted ./videos directory:
# Copy video to mounted directory
cp ~/my-video.mp4 ./videos/

# Process inside container
docker compose run youtube-shorts-generator ./run.sh "/app/videos/my-video.mp4"

Running with Docker CLI

Alternative to Docker Compose for more control:

Basic Run

docker run --rm \
  --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
Flag                             Purpose
--rm                             Remove container after exit
--gpus all                       Enable all GPUs
-v $(pwd)/.env:/app/.env:ro      Mount API key (read-only)
-v $(pwd)/videos:/app/videos     Mount videos directory
-v $(pwd)/output:/app/output     Mount output directory
-it                              Interactive mode with TTY
ai-youtube-shorts-generator      Image name

With Command-Line Arguments

docker run --rm \
  --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  ai-youtube-shorts-generator \
  ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"

CPU-Only Mode

Run without GPU (significantly slower transcription):
docker run --rm \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
CPU-only transcription can take 10-20x longer than GPU-accelerated processing.

Batch Processing with Docker

Sequential Processing

while IFS= read -r url; do
  docker compose run youtube-shorts-generator ./run.sh --auto-approve "$url"
done < urls.txt
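If urls.txt may contain comments or blank lines, it can help to clean the list first (a sketch; the filtering step is an addition, not part of the project scripts):

```shell
# Example list: one URL per line, comments and blanks allowed
printf '# batch of 2024-01-01\n\nhttps://youtu.be/VIDEO_ID_1\nhttps://youtu.be/VIDEO_ID_2\n' > urls.txt

# Strip blank lines and # comments into a cleaned list
grep -Ev '^[[:space:]]*(#|$)' urls.txt > urls.clean.txt

# Feed the cleaned list to the loop shown above:
# while IFS= read -r url; do
#   docker compose run youtube-shorts-generator ./run.sh --auto-approve "$url"
# done < urls.clean.txt
```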

Parallel Processing

cat urls.txt | xargs -P 3 -I{} \
  docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"
Running multiple Docker containers in parallel may cause GPU memory issues. Limit parallelism based on available VRAM:
  • 8GB GPU: 2-3 containers max
  • 16GB GPU: 4-5 containers max
  • 24GB+ GPU: 6+ containers
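Those limits can be folded into a rough rule of thumb. A sketch assuming ~3.5 GB of VRAM per container (the per-container budget is an estimate, not a measured figure):

```shell
# Rough ceiling on parallel containers for a given amount of VRAM (in MB),
# assuming ~3.5 GB per container (an estimate, not a measured figure)
max_parallel() {
  vram_mb=$1
  echo $(( vram_mb / 3500 ))
}

# On the host, total VRAM can be queried with:
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
max_parallel 8192    # 8 GB card  → 2
max_parallel 24576   # 24 GB card → 7
```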

Troubleshooting

GPU Not Detected

Symptom:
Could not load dynamic library 'libcudnn.so.8'
Solutions:
1. Verify NVIDIA drivers

nvidia-smi
Ensure drivers are installed on host.
2. Check Docker GPU access

docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
Verify NVIDIA Container Toolkit works.
3. Restart Docker daemon

sudo systemctl restart docker
4. Check docker-compose.yml GPU config

Ensure GPU reservation is correctly configured:
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]

ImageMagick Policy Error

Symptom:
ImageMagick security policy blocks '@' pattern
Solution: Rebuild image to apply the policy fix:
docker compose build --no-cache
The Dockerfile includes the fix on line 27:
RUN sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml

Volume Permission Issues

Symptom:
Permission denied: '/app/output/video_short.mp4'
Solution: Ensure host directories have correct permissions:
mkdir -p videos output
chmod 777 videos output
Or run container with user mapping:
docker-compose.yml
user: "${UID}:${GID}"
Then (via env, since bash treats UID as a read-only variable):
env UID="$(id -u)" GID="$(id -g)" docker compose run youtube-shorts-generator
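To confirm the mapping worked, verify that the output directory is owned by your user (GNU stat syntax; on macOS the equivalent is stat -f %u):

```shell
mkdir -p output

# The owner UID of output/ should match your own UID after a mapped run
[ "$(stat -c '%u' output)" -eq "$(id -u)" ] && echo "ownership OK"
```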

Out of Memory (OOM)

Symptom:
CUDA out of memory
Solutions:
Run fewer concurrent containers:
# Instead of -P 5
cat urls.txt | xargs -P 2 -I{} docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"

Container Exits Immediately

Symptom:
docker compose up
Exited with code 1
Check logs:
docker compose logs
Common issues:
  • Missing .env file → Create .env with OPENAI_API=your_key
  • Invalid API key → Verify key at https://platform.openai.com/api-keys
  • Missing volumes → Ensure videos/ and output/ directories exist
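Those checks can be scripted as a quick preflight before docker compose up (a sketch; the checks mirror the list above, and the API key itself is only validated at runtime):

```shell
# Preflight sketch: catch the common causes of an immediate exit
ok=1
[ -f .env ] || { echo "missing .env (needs OPENAI_API=your_key)"; ok=0; }
grep -q '^OPENAI_API=' .env 2>/dev/null || { echo "OPENAI_API not set in .env"; ok=0; }
[ -d videos ] || { echo "missing videos/ directory"; ok=0; }
[ -d output ] || { echo "missing output/ directory"; ok=0; }

if [ "$ok" -eq 1 ]; then echo "preflight passed"; else echo "fix the issues above before docker compose up"; fi
```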

Performance Optimization

Docker Build Cache

Speed up rebuilds by leveraging layer caching:
# requirements.txt copied separately for caching
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Application code copied last (changes frequently)
COPY . .
Changing Python code won’t invalidate the pip install layer.

Shared Volume for Downloads

Reuse downloaded videos across runs:
# Download once
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"

# Videos persist in ./videos/
ls -lh ./videos/

# Process again without re-downloading
docker compose run youtube-shorts-generator ./run.sh "/app/videos/video_file.mp4"

Pre-built Image

Build once, run many times:
# Build and tag
docker build -t ai-shorts:v1.0 .

# Run without rebuilding
docker run --rm --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  ai-shorts:v1.0 \
  ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"
For production deployments, consider pushing the image to Docker Hub or a private registry for faster distribution.
