Docker Installation
The project includes Docker and Docker Compose configurations for containerized execution with NVIDIA GPU support.
Docker setup requires NVIDIA Container Toolkit for GPU-accelerated Whisper transcription. CPU-only Docker support is available but significantly slower.
Prerequisites
Install Docker Compose
Docker Compose v2 is included with Docker Desktop. On Linux, install the Compose plugin if needed by following the official Docker Compose documentation.
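A quick way to confirm the Compose v2 plugin is available (a minimal check; the package name in the hint assumes Debian/Ubuntu):

```shell
# Check for the Docker Compose v2 plugin; print a hint if it is missing
if docker compose version >/dev/null 2>&1; then
  echo "Docker Compose v2 available"
else
  echo "Docker Compose v2 missing; install the docker-compose-plugin package"
fi
```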
Install NVIDIA Container Toolkit (GPU only)
For CUDA-accelerated transcription:

```shell
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
Verify GPU access:

```shell
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```
Create .env file
Create a .env file in the project root with your OpenAI API key:

```
OPENAI_API=your_openai_api_key_here
```
Dockerfile Configuration
The project uses an NVIDIA CUDA base image for GPU support.
Base Image
```dockerfile
FROM nvidia/cuda:12.1.0-cudnn8-runtime-ubuntu22.04
```

Provides:
- CUDA 12.1.0 runtime
- cuDNN 8 for deep learning
- Ubuntu 22.04 base system
System Dependencies
```dockerfile
RUN apt-get update && apt-get install -y \
    python3.10 \
    python3.10-venv \
    python3-pip \
    ffmpeg \
    libavdevice-dev \
    libavfilter-dev \
    libopus-dev \
    libvpx-dev \
    pkg-config \
    libsrtp2-dev \
    imagemagick \
    git \
    wget \
    && rm -rf /var/lib/apt/lists/*
```
- FFmpeg: video processing, audio extraction, and format conversion.
- ImageMagick: subtitle rendering and text overlay generation.
- Audio libraries: libopus-dev, libvpx-dev, libsrtp2-dev for audio codec support.
- Python 3.10: required Python version with virtual environment support.
ImageMagick Policy Fix
Critical for subtitle rendering:
```dockerfile
RUN sed -i 's/rights="none" pattern="@ \* "/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
```
Without this fix, ImageMagick will refuse to write temporary files, causing subtitle generation to fail.
CUDA Library Path
```dockerfile
ENV LD_LIBRARY_PATH=/usr/local/lib/python3.10/dist-packages/nvidia/cudnn/lib:/usr/local/lib/python3.10/dist-packages/nvidia/cublas/lib:$LD_LIBRARY_PATH
```
Ensures Whisper can find NVIDIA CUDA libraries for GPU acceleration.
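To confirm the libraries actually resolve inside the container, a minimal check can attempt to load them directly (a sketch; the specific .so names are assumptions matching cuDNN 8 and the CUDA 12 cuBLAS shipped in this base image):

```shell
# Run inside the container: try to dlopen the CUDA libraries Whisper needs
python3 - <<'EOF'
import ctypes

for lib in ("libcudnn.so.8", "libcublas.so.12"):
    try:
        ctypes.CDLL(lib)
        print(lib, "found")
    except OSError:
        print(lib, "NOT found - check LD_LIBRARY_PATH")
EOF
```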
Docker Compose Configuration
The docker-compose.yml file defines the service with GPU support and volume mounts.
GPU Configuration
```yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
```
- GPU driver: `driver: nvidia` specifies NVIDIA GPU access
- GPU count: `count: 1` allocates one GPU (change for multi-GPU setups)
- Capabilities: `capabilities: [gpu]` enables GPU compute access
Environment Variables
```yaml
environment:
  - NVIDIA_VISIBLE_DEVICES=all
  - NVIDIA_DRIVER_CAPABILITIES=compute,utility
```
NVIDIA_VISIBLE_DEVICES
Controls which GPUs are visible to the container:
- all: all GPUs available
- 0: only GPU 0
- 0,1: GPUs 0 and 1
- none: no GPU access (CPU-only)
NVIDIA_DRIVER_CAPABILITIES
Defines GPU capabilities:
- compute: CUDA compute operations
- utility: nvidia-smi and monitoring tools
- graphics: graphics rendering (not needed here)
- video: video encode/decode (not needed here)
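For example, to expose only GPU 0 to the service, the environment block can be adjusted like this (a sketch of a docker-compose.yml override; the service name is assumed from the commands used elsewhere in this guide):

```yaml
# Sketch: expose only GPU 0 instead of all GPUs
services:
  youtube-shorts-generator:
    environment:
      - NVIDIA_VISIBLE_DEVICES=0
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
```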
Volume Mounts
```yaml
volumes:
  - ./videos:/app/videos  # Input videos
  - ./output:/app/output  # Output directory
  - ./.env:/app/.env:ro   # OpenAI API key (read-only)
```
./videos:/app/videos
- Purpose: YouTube downloads and local video input
- Host path: ./videos (create it if it doesn't exist)
- Container path: /app/videos
- Usage:

```shell
# Place local videos here
cp ~/Downloads/my-video.mp4 ./videos/

# Container downloads go here
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/ID"
```

./output:/app/output
- Purpose: final short video outputs
- Host path: ./output (created automatically)
- Container path: /app/output
- Output naming: {title}_{session-id}_short.mp4

./.env:/app/.env:ro
- Purpose: OpenAI API key configuration
- Host path: ./.env
- Container path: /app/.env
- Read-only: the :ro flag prevents the container from modifying the file
- Alternative: use env_file (already configured on lines 24-25 of docker-compose.yml)
Volumes persist data between container runs. Downloaded videos remain in ./videos/ for reuse.
Interactive Mode
```yaml
stdin_open: true
tty: true
```
Enables interactive input for:
YouTube URL prompts
Resolution selection
Approval workflow
Equivalent to docker run -it.
Building the Image
Build the Docker image before first use by running `docker compose build`.
Build process:
Downloads NVIDIA CUDA base image (~2GB)
Installs system dependencies
Fixes ImageMagick policy
Installs Python packages from requirements.txt
Copies application code
Sets up CUDA library paths
Build time: 5-10 minutes (depending on network speed)
The image is ~6-8GB due to CUDA runtime and dependencies. Ensure sufficient disk space.
Running with Docker Compose
Interactive Mode
Run `docker compose up` to be prompted for a URL and approval. You'll see:

```
Session ID: 3f8a9b12
Enter YouTube video URL or local video file path:
```
Use docker compose up for interactive mode, not docker compose up -d (detached mode won’t show prompts).
Command-Line Mode
Process a specific video without interaction:
```shell
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"
```
Auto-Approve Mode
Fully automated processing:
```shell
docker compose run youtube-shorts-generator ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"
```
Local File Processing
Process videos from the mounted ./videos directory:
```shell
# Copy video to mounted directory
cp ~/my-video.mp4 ./videos/

# Process inside container
docker compose run youtube-shorts-generator ./run.sh "/app/videos/my-video.mp4"
```
Running with Docker CLI
Alternative to Docker Compose for more control:
Basic Run
```shell
docker run --rm \
  --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
```

| Flag | Purpose |
|------|---------|
| `--rm` | Remove container after exit |
| `--gpus all` | Enable all GPUs |
| `-v $(pwd)/.env:/app/.env:ro` | Mount API key (read-only) |
| `-v $(pwd)/videos:/app/videos` | Mount videos directory |
| `-v $(pwd)/output:/app/output` | Mount output directory |
| `-it` | Interactive mode with TTY |
| `ai-youtube-shorts-generator` | Image name |
With Command-Line Arguments
```shell
docker run --rm \
  --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  ai-youtube-shorts-generator \
  ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"
```
CPU-Only Mode
Run without GPU (significantly slower transcription):
```shell
docker run --rm \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
```
CPU-only transcription can take 10-20x longer than GPU-accelerated processing.
Batch Processing with Docker
Sequential Processing
```shell
while IFS= read -r url; do
  docker compose run youtube-shorts-generator ./run.sh --auto-approve "$url"
done < urls.txt
```
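If urls.txt may contain blank lines or #-comments, a slightly more defensive variant can filter them out first (a sketch; the echo stands in for the docker compose run command above):

```shell
# Filter a URL list file: skip blank lines and #-comment lines
process_urls() {
  grep -Ev '^[[:space:]]*(#|$)' "$1" | while IFS= read -r url; do
    # Replace this echo with:
    # docker compose run youtube-shorts-generator ./run.sh --auto-approve "$url"
    echo "processing: $url"
  done
}
```

Usage: `process_urls urls.txt`.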
Parallel Processing
```shell
cat urls.txt | xargs -P 3 -I {} \
  docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"
```
Running multiple Docker containers in parallel may cause GPU memory issues. Limit parallelism based on available VRAM:
- 8GB GPU: 2-3 containers max
- 16GB GPU: 4-5 containers max
- 24GB+ GPU: 6+ containers
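One way to pick the `xargs -P` value automatically is to scale it from reported VRAM. The helper below assumes roughly one container per 4 GiB, which matches the lower bounds of the table above; the ratio is an assumption, not a measured figure:

```shell
# Suggest an xargs -P parallelism level from total VRAM in MiB
# (assumption: ~1 container per 4 GiB, minimum of 1)
suggest_parallelism() {
  vram_mib=$1
  n=$(( vram_mib / 4096 ))
  [ "$n" -lt 1 ] && n=1
  echo "$n"
}

# Query the first GPU's VRAM, then feed the result to xargs, e.g.:
#   vram=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n1)
#   cat urls.txt | xargs -P "$(suggest_parallelism "$vram")" -I {} ...
suggest_parallelism 8192    # prints 2
suggest_parallelism 24576   # prints 6
```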
Troubleshooting
GPU Not Detected
Symptom:
```
Could not load dynamic library 'libcudnn.so.8'
```
Solutions:
Verify NVIDIA drivers
Ensure drivers are installed on the host.

Check Docker GPU access

```shell
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```

This verifies the NVIDIA Container Toolkit works.

Restart the Docker daemon

```shell
sudo systemctl restart docker
```

Check the docker-compose.yml GPU config
Ensure the GPU reservation is correctly configured:

```yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          count: 1
          capabilities: [gpu]
```
ImageMagick Policy Error
Symptom:
```
ImageMagick security policy blocks '@' pattern
```
Solution:
Rebuild image to apply the policy fix:
```shell
docker compose build --no-cache
```
The Dockerfile includes the fix on line 27:
```dockerfile
RUN sed -i 's/rights="none" pattern="@ \* "/rights="read|write" pattern="@*"/' /etc/ImageMagick-6/policy.xml
```
Volume Permission Issues
Symptom:
```
Permission denied: '/app/output/video_short.mp4'
```
Solution:
Ensure host directories have correct permissions:
```shell
mkdir -p videos output
chmod 777 videos output
```
Or run the container with a user mapping in docker-compose.yml, then:

```shell
UID=$(id -u) GID=$(id -g) docker compose run youtube-shorts-generator
```
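The user mapping itself is not shown in this guide; a minimal sketch of what it could look like in docker-compose.yml (the UID/GID variable names match the command above, but this fragment is an assumption, not the project's actual configuration):

```yaml
# Sketch: run the service as the invoking host user
# so output files are owned by you rather than root
services:
  youtube-shorts-generator:
    user: "${UID}:${GID}"
```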
Out of Memory (OOM)
Symptom: the GPU runs out of memory during transcription or rendering.
Solutions:
Reduce parallelism
Run fewer concurrent containers:

```shell
# Instead of -P 5
cat urls.txt | xargs -P 2 -I {} docker compose run youtube-shorts-generator ./run.sh --auto-approve "{}"
```

Lower video resolution
Select a lower resolution during YouTube download (480p instead of 1080p).

Increase GPU memory
Use a GPU with more VRAM or reduce other GPU workloads.

CPU-only mode
Remove GPU access for CPU-based transcription:

```shell
docker run --rm \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  -it \
  ai-youtube-shorts-generator
```
Container Exits with an Error
Symptom:

```
docker compose up
Exited with code 1
```

Check the container logs, e.g. with `docker compose logs`.
Common issues:
- Missing .env file → create .env with OPENAI_API=your_key
- Invalid API key → verify the key at https://platform.openai.com/api-keys
- Missing volumes → ensure the videos/ and output/ directories exist
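These checks can be bundled into a small preflight helper run before docker compose up (a sketch; it only validates the host-side prerequisites listed above, nothing inside the container):

```shell
# Preflight: verify host-side prerequisites before starting the container
preflight() {
  status=0
  # .env must exist and contain the OPENAI_API key
  if [ ! -f .env ]; then
    echo "missing .env - create it with OPENAI_API=your_key"
    status=1
  elif ! grep -q '^OPENAI_API=' .env; then
    echo ".env exists but has no OPENAI_API entry"
    status=1
  fi
  # Mounted directories must exist
  for dir in videos output; do
    [ -d "$dir" ] || { echo "missing $dir/ - run: mkdir -p $dir"; status=1; }
  done
  [ "$status" -eq 0 ] && echo "preflight OK"
  return "$status"
}
```

Usage: `preflight && docker compose up`.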
Docker Build Cache
Speed up rebuilds by leveraging layer caching:
```dockerfile
# requirements.txt copied separately for caching
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Application code copied last (changes frequently)
COPY . .
```
Changing Python code won’t invalidate the pip install layer.
Shared Volume for Downloads
Reuse downloaded videos across runs:
```shell
# Download once
docker compose run youtube-shorts-generator ./run.sh "https://youtu.be/VIDEO_ID"

# Videos persist in ./videos/
ls -lh ./videos/

# Process again without re-downloading
docker compose run youtube-shorts-generator ./run.sh "/app/videos/video_file.mp4"
```
Pre-built Image
Build once, run many times:
```shell
# Build and tag
docker build -t ai-shorts:v1.0 .

# Run without rebuilding
docker run --rm --gpus all \
  -v $(pwd)/.env:/app/.env:ro \
  -v $(pwd)/videos:/app/videos \
  -v $(pwd)/output:/app/output \
  ai-shorts:v1.0 \
  ./run.sh --auto-approve "https://youtu.be/VIDEO_ID"
```
For production deployments, consider pushing the image to Docker Hub or a private registry for faster distribution.