Skip to main content

Overview

The subtitle system automatically generates styled captions from transcription segments and burns them directly into the video. Subtitles are synchronized with audio timing and positioned for optimal mobile viewing.

How It Works

The add_subtitles_to_video function in Components/Subtitles.py:4 processes transcription segments and overlays them on the video:
1

Transcription Filtering

Filters transcription segments to match the video timeframe (important for cropped clips).
2

Text Clip Creation

Creates styled TextClip objects for each transcription segment with timing information.
3

Video Composition

Composites all text clips onto the base video using MoviePy.
4

Rendering

Renders the final video with burned-in subtitles and audio.

Function Signature

Components/Subtitles.py
def add_subtitles_to_video(input_video, output_video, transcriptions, video_start_time=0):
    """
    Add subtitles to video based on transcription segments.
    
    Args:
        input_video: Path to input video file
        output_video: Path to output video file
        transcriptions: List of [text, start, end] from transcribeAudio
        video_start_time: Start time offset if video was cropped (default: 0)
    """
input_video
string
required
Path to the input video file (typically the cropped video).
output_video
string
required
Path where the final video with subtitles will be saved.
transcriptions
list
required
List of [text, start, end] tuples from transcribeAudio function. Each tuple contains the text content and start/end timestamps in seconds.
video_start_time
float
default:"0"
Start time offset in seconds. Used when the video has been cropped to adjust subtitle timing.

Transcription Filtering

Subtitles are filtered to match the video duration:
Components/Subtitles.py:17-28
relevant_transcriptions = []
for text, start, end in transcriptions:
    # Adjust times relative to video start
    adjusted_start = start - video_start_time
    adjusted_end = end - video_start_time
    
    # Only include if within video duration
    if adjusted_end > 0 and adjusted_start < video_duration:
        adjusted_start = max(0, adjusted_start)
        adjusted_end = min(video_duration, adjusted_end)
        relevant_transcriptions.append([text.strip(), adjusted_start, adjusted_end])
Time Adjustment: When a 2-minute clip is extracted from a longer video, video_start_time ensures subtitle timing matches the cropped segment.

Subtitle Styling

Font Configuration

Subtitles use the Franklin Gothic font family:
Components/Subtitles.py:56
font='Franklin-Gothic'
To use a different font, change the font parameter to any system font. List available fonts with:
convert -list font | grep -i "font:"

Dynamic Font Sizing

Font size scales proportionally to video height:
Components/Subtitles.py:39-41
# Scale font size proportionally to video height (~6.5% of height)
# 1080p → 70px, 720p → 47px
dynamic_fontsize = int(video.h * 0.065)
Video HeightFont SizeCalculation
1080p70px1080 × 0.065
720p47px720 × 0.065
480p31px480 × 0.065

Color and Stroke

The subtitle style uses bright blue text with black outline:
Components/Subtitles.py:50-56
txt_clip = TextClip(
    text,
    fontsize=dynamic_fontsize,
    color='#2699ff',
    stroke_color='black',
    stroke_width=2,
    font='Franklin-Gothic',
    method='caption',
    size=(video.w - 100, None)  # Leave 50px margin on each side
)
color
string
default:"#2699ff"
Text fill color in hex format. #2699ff is a bright blue that stands out on most backgrounds.
stroke_color
string
default:"black"
Outline color around the text. Black provides maximum contrast.
stroke_width
int
default:"2"
Outline thickness in pixels. 2px provides good readability without being too bold.
method
string
default:"caption"
Text rendering method. caption enables word wrapping for long text.

Text Wrapping

Subtitles automatically wrap to fit the video width:
Components/Subtitles.py:58
size=(video.w - 100, None)  # Leave 50px margin on each side
The 100-pixel reduction (50px margin on each side) prevents subtitle text from touching the screen edges on mobile devices.

Positioning

Subtitles are positioned at the bottom center of the video:
Components/Subtitles.py:61-62
# Position at bottom center
txt_clip = txt_clip.set_position(('center', video.h - txt_clip.h - 100))
horizontal
string
default:"center"
Horizontal alignment. center ensures text is centered regardless of width.
vertical
int
default:"video.h - txt_clip.h - 100"
Vertical position in pixels. Positioned 100px from the bottom of the video.
The 100px bottom margin prevents subtitles from being obscured by mobile UI elements or platform watermarks.

Timing Synchronization

Each subtitle clip is precisely timed to match the transcription:
Components/Subtitles.py:63-64
txt_clip = txt_clip.set_start(start)
txt_clip = txt_clip.set_duration(end - start)
set_start
float
Start time in seconds when the subtitle should appear.
set_duration
float
Duration in seconds how long the subtitle should remain visible.

Example Timing

# Transcription segment: ["Hello world", 5.2, 7.8]
txt_clip.set_start(5.2)      # Appears at 5.2 seconds
txt_clip.set_duration(2.6)   # Visible for 2.6 seconds (7.8 - 5.2)

Video Composition

All subtitle clips are composited onto the base video:
Components/Subtitles.py:69-70
print(f"Adding {len(text_clips)} subtitle segments to video...")
final_video = CompositeVideoClip([video] + text_clips)
Layering Order: The base video is the first layer, with all text clips rendered on top in order.

Output Encoding

The final video is encoded with high-quality settings:
Components/Subtitles.py:73-80
final_video.write_videofile(
    output_video,
    codec='libx264',
    audio_codec='aac',
    fps=video.fps,
    preset='medium',
    bitrate='3000k'
)
codec
string
default:"libx264"
H.264 video codec for wide compatibility across platforms.
audio_codec
string
default:"aac"
AAC audio codec (industry standard for mobile platforms).
fps
float
default:"video.fps"
Matches source video frame rate to prevent timing issues.
preset
string
default:"medium"
Encoding speed preset. Options: ultrafast, fast, medium, slow, veryslow. Slower presets provide better compression.
bitrate
string
default:"3000k"
Target video bitrate. 3000k (3 Mbps) provides high quality for 1080p vertical video.

Error Handling

No Transcriptions

If no transcriptions match the video timeframe:
Components/Subtitles.py:30-34
if not relevant_transcriptions:
    print("No transcriptions found for this video segment")
    video.write_videofile(output_video, codec='libx264', audio_codec='aac')
    video.close()
    return
The video is still output but without subtitles.

Empty Text Handling

Components/Subtitles.py:45-47
text = text.strip()
if not text:
    continue
Empty transcription segments are automatically skipped.

Customizing Subtitle Style

Edit Components/Subtitles.py:52-56:
txt_clip = TextClip(
    text,
    fontsize=80,  # Fixed size instead of dynamic
    font='Arial-Bold',  # Different font
    # ... other parameters
)
Edit Components/Subtitles.py:53-55:
color='white',           # White text
stroke_color='#FF0000',  # Red outline
stroke_width=3,          # Thicker outline
Edit Components/Subtitles.py:62:
# Top-centered subtitles
txt_clip = txt_clip.set_position(('center', 100))

# Bottom-left subtitles
txt_clip = txt_clip.set_position((50, video.h - txt_clip.h - 50))

# Custom position (x, y)
txt_clip = txt_clip.set_position((100, 500))
Edit Components/Subtitles.py:58,62:
# Wider margins (150px total)
size=(video.w - 150, None)

# More bottom spacing (200px from bottom)
txt_clip = txt_clip.set_position(('center', video.h - txt_clip.h - 200))

Performance

  • Processing Speed: ~0.5-2× realtime (depends on video length and subtitle count)
  • Memory Usage: ~500MB-2GB (depends on video resolution and duration)
  • Rendering Time:
    • 1-minute 1080p video: ~30-60 seconds
    • 2-minute 720p video: ~20-40 seconds
ImageMagick Dependency: Subtitle rendering requires ImageMagick. On Linux, you may need to adjust the security policy:
sudo sed -i 's/rights="none" pattern="@\*"/rights="read|write" pattern="@*"/' \
  /etc/ImageMagick-6/policy.xml

Output Example

Adding 23 subtitle segments to video...
Moviepy - Building video final_video_with_subtitles.mp4
Moviepy - Writing video final_video_with_subtitles.mp4
Moviepy - Done !
✓ Subtitles added successfully -> final_video_with_subtitles.mp4

Build docs developers (and LLMs) love