The Face Crop component converts horizontal videos to vertical 9:16 format with intelligent face-centered or motion-tracked cropping.

Functions

crop_to_vertical

Crops a video to vertical 9:16 aspect ratio with intelligent face detection or motion tracking.
crop_to_vertical(input_video_path: str, output_video_path: str) -> None
input_video_path (str, required): Path to the input video file to crop
output_video_path (str, required): Path where the cropped vertical video will be saved

Cropping Modes

The function automatically detects the best cropping strategy:
When: A face is detected in the first 30 frames
Behavior:
  • Analyzes first 30 frames to detect faces
  • Uses median face position for stability
  • Applies 60px right offset to prevent cutoff
  • Uses static crop position (no tracking)
  • Best for: Talking head videos, interviews, vlogs
# Face detected - using face-centered crop
# Median of the sampled face x-positions (despite the "avg" in the name)
avg_face_x = int(sorted(face_positions)[len(face_positions) // 2])
avg_face_x += 60  # Offset to prevent right-side cutoff
# Clamp so the crop window stays inside the frame
x_start = max(0, min(avg_face_x - vertical_width // 2,
                     original_width - vertical_width))
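
To make the clamping concrete, here is a minimal standalone sketch of the same arithmetic. The function name `face_crop_x_start` and the sample numbers are illustrative assumptions, not part of the component; only the median, the 60px offset, and the clamp come from the snippet above.

```python
def face_crop_x_start(face_positions, original_width, vertical_width, offset=60):
    """Return the left edge of a static crop window (hypothetical helper).

    face_positions: x-coordinates of the face sampled from the first frames.
    """
    # Median via sorting, as in the snippet above (upper median for even counts)
    median_x = int(sorted(face_positions)[len(face_positions) // 2])
    center_x = median_x + offset
    # Center the window on the face, then clamp it to the frame bounds
    return max(0, min(center_x - vertical_width // 2,
                      original_width - vertical_width))


# 1920px-wide source, 608px-wide crop window
print(face_crop_x_start([890, 900, 910], 1920, 608))  # face mid-frame
print(face_crop_x_start([1800], 1920, 608))           # face near right edge
print(face_crop_x_start([100], 1920, 608))            # face near left edge
```

In the second and third calls the clamp takes over, so the window stops at the frame edge instead of centering on the face.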

Output Specifications

  • Aspect Ratio: 9:16 (vertical/portrait)
  • Height: Preserves original video height
  • Width: Calculated as height * 9/16
  • Dimensions: Always even numbers (codec requirement)
  • Codec: XVID (Windows) or mp4v (Linux/Mac)
  • FPS: Preserves original frame rate
from Components.FaceCrop import crop_to_vertical

# Crop video to vertical format
input_video = "horizontal_video.mp4"
output_video = "vertical_video.mp4"

crop_to_vertical(input_video, output_video)

# Output:
# Detecting face position for static crop...
# ✓ Face detected. Using face-centered crop at x=450
# Output dimensions: 608x1080
# Processed 100/3000 frames
# Processed 200/3000 frames
# ...
# Cropping complete. Processed 3000 frames -> vertical_video.mp4
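
The output dimensions can be worked out by hand. The sketch below is an assumption about the rounding: the page only says that dimensions are always even, not how an odd width is adjusted. Rounding to the nearest even number reproduces the 608x1080 shown in the sample output for a 1080-pixel-high source. The function name `vertical_dimensions` is hypothetical.

```python
def vertical_dimensions(original_height):
    """Return (width, height) for a 9:16 crop with even dimensions.

    Assumption: odd values are adjusted to an even number; the actual
    rounding rule used by crop_to_vertical is not documented here.
    """
    # Drop one row if the source height is odd
    height = original_height - (original_height % 2)
    # height * 9/16, rounded to the nearest even number
    width = 2 * round(height * 9 / 32)
    return width, height


print(vertical_dimensions(1080))  # 1080 * 9/16 = 607.5, adjusted to an even width
```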

Features

Face Detection

  • Uses OpenCV’s Haar Cascade classifier
  • Samples first 30 frames for stable detection
  • Selects largest face if multiple detected
  • Uses median position to avoid outliers
  • 60px right offset prevents edge cutoff
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)
# gray is a single-channel (grayscale) version of the current frame
faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,
    minNeighbors=8,
    minSize=(30, 30)
)
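
The selection steps listed above (largest face per frame, then the median across frames) can be sketched without OpenCV. This is an illustration under assumptions: the helper names are hypothetical, detections are `(x, y, w, h)` boxes as returned by `detectMultiScale`, and the stored position is taken to be the horizontal center of the box, which the page does not confirm.

```python
def largest_face_center_x(faces):
    """Horizontal center of the largest (x, y, w, h) box, or None if empty."""
    if len(faces) == 0:
        return None
    x, _y, w, _h = max(faces, key=lambda box: box[2] * box[3])
    return x + w // 2


def stable_face_x(detections_per_frame):
    """Median face position over the sampled frames, or None if no face."""
    centers = []
    for faces in detections_per_frame:
        center = largest_face_center_x(faces)
        if center is not None:
            centers.append(center)
    if not centers:
        return None
    # Same median-by-sorting idiom as the cropping snippet
    return sorted(centers)[len(centers) // 2]


sampled = [
    [(400, 100, 100, 100)],                    # one face
    [],                                        # no detection in this frame
    [(410, 100, 100, 100), (50, 50, 30, 30)],  # small false positive ignored
    [(900, 100, 100, 100)],                    # outlier frame
]
print(stable_face_x(sampled))
```

The outlier at x=950 does not move the result, which is the point of using the median rather than the mean.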

Motion Tracking

  • Optical flow using Farneback algorithm
  • Updates once per second (every fps frames)
  • Focuses on significant motion (threshold: 2.0)
  • Smooth tracking with 90/10 interpolation (90% previous position, 10% new target)
  • Prevents abrupt camera movements
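import cv2
import numpy as np

# Positional arguments after None: pyr_scale=0.5, levels=3, winsize=15,
# iterations=3, poly_n=5, poly_sigma=1.2, flags=0
flow = cv2.calcOpticalFlowFarneback(
    prev_gray, curr_gray, None,
    0.5, 3, 15, 3, 5, 1.2, 0
)
# Per-pixel motion magnitude from the x and y flow components
magnitude = np.sqrt(flow[..., 0]**2 + flow[..., 1]**2)

# Smooth tracking: keep 90% of the previous position, move 10% toward the target
smoothed_x = int(0.90 * smoothed_x + 0.10 * target_x)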
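
To see what the 90/10 interpolation does over time, the following sketch applies the same update repeatedly to a fixed target. The function name `smooth_track` is hypothetical. One deliberate difference, labeled as an assumption: this version keeps the running position as a float and rounds only the reported value, whereas the snippet above truncates with `int()` on every update.

```python
def smooth_track(start_x, targets, keep=0.90):
    """Exponentially smooth a sequence of target positions.

    keep is the weight of the previous position; (1 - keep) goes to the target.
    """
    position = float(start_x)
    positions = []
    for target_x in targets:
        position = keep * position + (1.0 - keep) * target_x
        positions.append(round(position))
    return positions


# The crop starts at x=100 and the detected motion jumps to x=200
track = smooth_track(100, [200] * 50)
print(track[:5])   # first few updates close 10% of the remaining gap each time
print(track[-1])   # after 50 updates the window has nearly reached the target
```

Each update closes a tenth of the remaining distance, so a sudden jump in the target becomes a gradual pan.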

Scaling and Letterboxing

For screen recordings:
  • Scales to show 67% of original width
  • Adds letterboxing if needed to maintain aspect ratio
  • Uses LANCZOS4 interpolation for quality
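
The page does not spell out the letterboxing geometry, so the following is only one plausible reading: a region covering 67% of the source width is scaled to fill the output width, and any leftover height is split into bars above and below. The function name `letterbox_layout` and every formula in it are assumptions for illustration.

```python
def letterbox_layout(src_w, src_h, out_w, out_h, visible_fraction=0.67):
    """Compute scaled height and bar sizes for a letterboxed layout (sketch)."""
    visible_w = src_w * visible_fraction
    # Scale so the visible region exactly fills the output width
    scale = out_w / visible_w
    scaled_h = min(out_h, round(src_h * scale))
    # Split the remaining height between top and bottom bars
    pad_total = out_h - scaled_h
    pad_top = pad_total // 2
    return {
        "scale": scale,
        "scaled_height": scaled_h,
        "pad_top": pad_top,
        "pad_bottom": pad_total - pad_top,
    }


layout = letterbox_layout(1920, 1080, 608, 1080)
print(layout["scaled_height"], layout["pad_top"], layout["pad_bottom"])
```

In OpenCV, the resize itself would typically go through `cv2.resize` with `interpolation=cv2.INTER_LANCZOS4`, matching the interpolation named in the list above.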
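Note: crop_to_vertical() sets a global Fps variable that combine_videos() reads to keep the frame rate consistent, so combine_videos() is intended to run after crop_to_vertical() in the same process.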

combine_videos

Combines audio from one video with visuals from another, used to add original audio to cropped videos.
combine_videos(video_with_audio: str, video_without_audio: str, 
               output_filename: str) -> None
video_with_audio (str, required): Path to video file containing the audio track to use
video_without_audio (str, required): Path to video file containing the visual content to use
output_filename (str, required): Path where the combined video will be saved

Output Specifications

  • Video Codec: libx264 (H.264)
  • Audio Codec: AAC
  • Preset: medium (balanced speed/quality)
  • Bitrate: 3000k
  • FPS: Uses global Fps variable from crop_to_vertical()
from Components.FaceCrop import crop_to_vertical, combine_videos

# Step 1: Crop to vertical (removes audio)
crop_to_vertical("original.mp4", "cropped_no_audio.mp4")

# Step 2: Add original audio back
combine_videos(
    video_with_audio="original.mp4",
    video_without_audio="cropped_no_audio.mp4",
    output_filename="final_with_audio.mp4"
)

# Output:
# Combined video saved successfully as final_with_audio.mp4
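
Because the two steps always run in the same order, they can be wrapped in one helper that also cleans up the intermediate silent file. This is a sketch, not part of the component: `crop_with_audio` is a hypothetical name, and the two functions are passed in as arguments so the wrapper carries no import assumptions.

```python
import os
import tempfile


def crop_with_audio(source_path, output_path, crop_fn, combine_fn):
    """Crop to vertical, restore the original audio, remove the temp file.

    crop_fn and combine_fn are expected to have the signatures of
    crop_to_vertical and combine_videos respectively.
    """
    # Reserve a temporary path for the cropped, audio-less video
    fd, silent_path = tempfile.mkstemp(suffix=".mp4")
    os.close(fd)
    try:
        crop_fn(source_path, silent_path)
        combine_fn(source_path, silent_path, output_path)
    finally:
        # Clean up even if either step raised
        if os.path.exists(silent_path):
            os.remove(silent_path)
```

With the real functions this would be called as `crop_with_audio("original.mp4", "final_with_audio.mp4", crop_to_vertical, combine_videos)`.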

Error Handling

try:
    crop_to_vertical(input_path, output_path)
except Exception as e:
    print(f"Cropping failed: {e}")

try:
    combine_videos(audio_source, video_source, output)
except Exception as e:
    print(f"Error combining video and audio: {str(e)}")
Requires OpenCV (cv2), NumPy, and MoviePy. The cropping process can be CPU/GPU intensive for long videos.

Progress Tracking

Both functions provide progress updates:
# crop_to_vertical progress
if frame_count % 100 == 0:
    print(f"Processed {frame_count}/{total_frames} frames")

# combine_videos progress
# MoviePy shows built-in progress bar during write_videofile()

Performance Tips

  1. Face Detection: Only samples first 30 frames for speed
  2. Motion Tracking: Updates once per second (not every frame)
  3. Smooth Tracking: Uses 90/10 interpolation to reduce jitter
  4. Codec Selection: Platform-specific codec for compatibility
  5. Dimension Alignment: Ensures even dimensions for codec compatibility

Platform Compatibility

import platform

import cv2

# Pick the FourCC code for cv2.VideoWriter based on the operating system
if platform.system() == 'Windows':
    fourcc = cv2.VideoWriter_fourcc(*'XVID')
else:
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
  • Windows: Uses XVID codec
  • Linux/Mac: Uses mp4v codec
