
Overview

Multi-stream mode allows you to monitor multiple RTSP cameras simultaneously. Each stream:
  • Runs in its own dedicated thread
  • Has independent person detection
  • Saves to a separate output directory
  • Can be viewed together in a grid display
Multi-stream mode is designed for 2-16 cameras. For larger deployments, consider running multiple instances with different camera groups.

Basic Usage

Two Methods for Specifying Streams

Method 1 (--rtsp-list): pass URLs directly as command arguments:
python main.py --rtsp-list \
  "rtsp://camera1.local/stream" \
  "rtsp://camera2.local/stream" \
  "rtsp://camera3.local/stream" \
  --save video --display
Use quotes around each URL, especially if it contains special characters. Method 2 (--rtsp-file) reads the URLs from a text file; see RTSP URL File Format below.

Command Options

Option        Type     Description
--rtsp-list   list     One or more RTSP stream URLs separated by spaces
--rtsp-file   string   Path to a text file containing RTSP URLs (one per line)
--save        choice   Required. Save mode: image for snapshots or video for clips
--display     flag     Enable the grid display window showing all streams

RTSP URL File Format

Create a text file with one RTSP URL per line:
rtsp_streams.txt
# Office cameras
rtsp://192.168.1.100/stream
rtsp://192.168.1.101/stream

# Warehouse cameras
rtsp://192.168.1.200/stream
rtsp://192.168.1.201/stream
rtsp://192.168.1.202/stream

# This camera is offline, skip it
# rtsp://192.168.1.203/stream

# Parking lot cameras
rtsp://192.168.1.150/stream
rtsp://192.168.1.151/stream
File parsing (main.py:119-124):
try:
    with open(args.rtsp_file, "r") as f:
        rtsp_urls = [line.strip() for line in f if line.strip()
                     and not line.startswith("#")]
    print(f"Loaded {len(rtsp_urls)} RTSP streams from {args.rtsp_file}")
except FileNotFoundError:
    print(f"Error: File {args.rtsp_file} not found")
File format rules:
  • One URL per line
  • Blank lines are ignored
  • Lines starting with # are comments
  • Leading/trailing whitespace is trimmed

Grid Display

When --display is enabled, all streams appear in a single composited window.

Enable Grid Display

python main.py --rtsp-list \
  "rtsp://cam1.local" \
  "rtsp://cam2.local" \
  "rtsp://cam3.local" \
  "rtsp://cam4.local" \
  --save image --display

Grid Layout

Streams are automatically arranged in a grid:
Streams   Grid Layout   Example
1         1×1           Single full window
2-4       2×2           Four quadrants
5-9       3×3           Nine tiles
10-16     4×4           Sixteen tiles
Grid example (2 streams in a 2×2 layout; unused tiles remain empty):
+-------------+-------------+
|             |             |
|  Stream 1   |  Stream 2   |
|             |             |
+-------------+-------------+
|             |             |
|   (empty)   |   (empty)   |
|             |             |
+-------------+-------------+
Grid features:
  • Each stream shows person count and entry counter
  • Green bounding boxes around detected persons
  • Confidence scores displayed
  • Streams update independently
  • Press ‘q’ to quit
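The layout rule in the table above (the smallest square grid that fits all streams) can be sketched as follows; grid_dims is a hypothetical helper, not part of the project's API:

```python
import math

def grid_dims(n_streams: int) -> tuple:
    """Smallest square grid (1x1, 2x2, 3x3, 4x4) that fits all stream tiles."""
    side = math.ceil(math.sqrt(n_streams))
    return (side, side)

# 1 -> (1, 1), 3 -> (2, 2), 7 -> (3, 3), 12 -> (4, 4)
```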

Without Display (Headless)

Omit --display to run without GUI (headless servers):
python main.py --rtsp-file streams.txt --save video
Behavior:
  • No display window
  • Lower resource usage
  • Still processes all streams
  • Saves output files normally
  • Logs to console

Threading Architecture

Each stream processes independently in its own thread.

Thread Creation

From multi_stream_manager.py:71-80:
# Launch one worker thread per stream
threads: List[threading.Thread] = []
for stream_id, rtsp_url in stream_list:
    t = threading.Thread(
        target=self._processor.process_single_stream,
        args=(stream_id, rtsp_url, frame_skip, save_mode, display_manager),
        daemon=True,
    )
    t.start()
    threads.append(t)
    print(f"Started thread for stream {stream_id}: {rtsp_url}")
Key characteristics:

Daemon Threads

Threads run as daemon threads, so they terminate automatically when the main program exits.

Independent Processing

Each stream has its own connection, detection loop, and reconnection logic.

Shared Detector

All threads share a single PersonDetector instance with thread-safe inference.

Separate Outputs

Each stream saves to its own stream_<id>/ directory.

Thread-Safe Detection

The PersonDetector uses a lock to ensure only one thread runs inference at a time. From person_detector.py:228-244:
def detect_persons(self, frame: cv2.typing.MatLike) -> Tuple[bool, int, List[Tuple[int, int, int, int, float]]]:
    """
    Detect persons in frame using available method.
    Thread-safe: acquires an internal lock so that only one thread
    runs inference at a time (OpenCV DNN / HOG are not thread-safe).
    Returns: (has_person: bool, person_count: int, boxes: list)
    """
    with self._inference_lock:
        if self.net is not None:
            boxes = self.detect_persons_yolo(frame)
        else:
            boxes = self.detect_persons_hog(frame)
    
    has_person = len(boxes) > 0
    person_count = len(boxes)
    
    return has_person, person_count, boxes
Why thread-safety matters:
  • OpenCV’s DNN module is not thread-safe
  • Multiple threads calling net.forward() simultaneously causes crashes
  • Lock ensures sequential inference
  • Other operations (frame reading, saving) remain parallel
Performance impact: With many streams, inference becomes a bottleneck. GPU acceleration helps significantly.

Thread Monitoring

The main thread waits for all worker threads. From multi_stream_manager.py:82-94:
try:
    if display:
        print("All streams shown in a single grid window. Press 'q' or Ctrl+C to stop...")
    else:
        print("All streams started. Press Ctrl+C to stop all streams...")
    
    for t in threads:
        while t.is_alive():
            t.join(timeout=0.5)
            if display and display_manager is not None and not display_manager.is_running:
                break
        if display and display_manager is not None and not display_manager.is_running:
            break
Termination triggers:
  1. User presses ‘q’ (closes display window)
  2. User presses Ctrl+C
  3. All streams disconnect and exhaust retries

Output Organization

Multi-stream mode creates a sub-directory for each stream.

Directory Structure

output/
├── stream_1/
│   ├── person_entry_1_20260309_143022_1741528222.jpg
│   ├── person_entry_2_20260309_144510_1741529110.jpg
│   └── person_clip_1_20260309_143022_1741528222.mp4
├── stream_2/
│   ├── person_entry_1_20260309_143155_1741528315.jpg
│   └── person_entry_2_20260309_145230_1741529550.jpg
├── stream_3/
│   └── person_clip_1_20260309_143500_1741528500.mp4
└── stream_4/
    ├── person_entry_1_20260309_143022_1741528222.jpg
    └── person_entry_2_20260309_143420_1741528460.jpg

Stream ID Assignment

Stream IDs are assigned based on input method:
Auto-numbered starting from 1:
python main.py --rtsp-list \
  "rtsp://cam1.local" \
  "rtsp://cam2.local" \
  "rtsp://cam3.local" \
  --save image
# URLs become stream_1, stream_2, stream_3 in order
From multi_stream_manager.py:56-59:
if isinstance(rtsp_urls, dict):
    stream_list = list(rtsp_urls.items())
else:
    stream_list = list(enumerate(rtsp_urls, 1))  # Start from 1
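The enumerate(..., 1) call above is what produces the 1-based IDs:

```python
urls = ["rtsp://cam1.local", "rtsp://cam2.local", "rtsp://cam3.local"]

# Pair each URL with its 1-based stream ID, exactly as the manager does
stream_list = list(enumerate(urls, 1))
print(stream_list)
# [(1, 'rtsp://cam1.local'), (2, 'rtsp://cam2.local'), (3, 'rtsp://cam3.local')]
```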

Directory Creation

From stream_processor.py:55-58:
if save_mode is not None:
    person_dir = f"{self.output_dir}/stream_{stream_id}"
    os.makedirs(person_dir, exist_ok=True)
    print(f"[Stream {stream_id}] Created directory: {person_dir}")
Directories are created automatically when each stream starts processing, provided a save mode is enabled.

Performance Considerations

CPU/GPU Bottlenecks

Inference bottleneck:
  • Thread-safe lock means only one detection at a time
  • With 8 streams and 100ms inference time:
    • Each stream gets detection every 800ms minimum
    • Plus frame_skip delays
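The arithmetic above can be made explicit; min_detection_interval_ms is a hypothetical helper for illustration, not project code:

```python
def min_detection_interval_ms(n_streams: int, inference_ms: float) -> float:
    """Worst-case gap between detections on one stream when inference
    is serialized by the shared lock (frame_skip delays not included)."""
    return n_streams * inference_ms

# 8 streams x 100 ms inference -> 800 ms minimum per stream
```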
Solutions:
  1. Enable GPU acceleration. CUDA-enabled OpenCV reduces inference from ~100ms to ~10ms. See the GPU Acceleration Guide.
  2. Increase frame_skip to process fewer frames per stream:
python main.py --rtsp-file streams.txt --save video \
  --frame-skip 30  # 1 fps instead of 2 fps
  3. Adjust thresholds. Higher thresholds mean fewer detections and less saving overhead:
--confidence 0.65 --area-threshold 2000
  4. Disable the display when running many streams. Display rendering adds overhead:
python main.py --rtsp-file streams.txt --save video
# No --display flag

Memory Usage

Per-stream overhead:
  • Frame buffer: ~6 MB (1920×1080 RGB)
  • Video writer buffer: ~10-20 MB
  • Network buffers: ~5 MB
  • Total per stream: ~20-30 MB
Example:
  • 16 streams: ~400 MB
  • Plus model weights: ~250 MB (YOLOv4)
  • Total: ~650 MB baseline
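Using the rough per-stream and model figures above, a baseline estimate can be sketched as follows (baseline_memory_mb is a hypothetical helper):

```python
def baseline_memory_mb(n_streams: int,
                       per_stream_mb: float = 25,   # frame + writer + network buffers
                       model_mb: float = 250) -> float:
    """Rough baseline RAM estimate from the per-stream figures above."""
    return n_streams * per_stream_mb + model_mb

# 16 streams: 16 * 25 + 250 = 650 MB
```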
Monitor memory with:
watch -n 1 'ps aux | grep python'

Network Bandwidth

Bandwidth calculation:
Resolution   Bitrate (typical)   Streams   Total Bandwidth
1920×1080    4 Mbps              4         16 Mbps
1920×1080    4 Mbps              8         32 Mbps
1280×720     2 Mbps              8         16 Mbps
1280×720     2 Mbps              16        32 Mbps
Ensure your network can handle aggregate bandwidth, especially on WiFi or shared switches.
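The table values follow from a simple multiplication; aggregate_mbps is a hypothetical helper for illustration:

```python
def aggregate_mbps(n_streams: int, bitrate_mbps: float) -> float:
    """Total ingest bandwidth for identical streams."""
    return n_streams * bitrate_mbps

# 8 x 4 Mbps (1080p) -> 32 Mbps; 16 x 2 Mbps (720p) -> 32 Mbps
```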

Scaling Guidelines

Streams   CPU (no GPU)   GPU (CUDA)   RAM         Network     Recommendation
1-4       OK             Excellent    Low         Low         Any hardware
5-8       Slow           Good         Medium      Medium      GPU recommended
9-16      Very Slow      OK           High        High        GPU required
17+       Unusable       Slow         Very High   Very High   Multiple instances

Console Output

Understanding multi-stream console output:
Loading person detection model...
CUDA available, using GPU for inference
Model loaded: YOLOv4
Confidence threshold: 0.5
Person area threshold: 1000 pixels

Config loaded from: config.cfg
  model_dir   = model
  output_dir  = output

Loaded 4 RTSP streams from streams.txt
Starting person detection on 4 stream(s)...
Created main directory: output

Started thread for stream 1: rtsp://192.168.1.100/stream
Started thread for stream 2: rtsp://192.168.1.101/stream
Started thread for stream 3: rtsp://192.168.1.102/stream
Started thread for stream 4: rtsp://192.168.1.103/stream

All streams shown in a single grid window. Press 'q' or Ctrl+C to stop...

[Stream 1] Connecting to: rtsp://192.168.1.100/stream
[Stream 2] Connecting to: rtsp://192.168.1.101/stream
[Stream 3] Connecting to: rtsp://192.168.1.102/stream
[Stream 4] Connecting to: rtsp://192.168.1.103/stream

[Stream 1] Connected successfully! Processing frames...
[Stream 1] Created directory: output/stream_1
[Stream 2] Connected successfully! Processing frames...
[Stream 2] Created directory: output/stream_2
[Stream 3] Connected successfully! Processing frames...
[Stream 3] Created directory: output/stream_3
[Stream 4] Connected successfully! Processing frames...
[Stream 4] Created directory: output/stream_4

[2026-03-09 14:30:22] [Stream 1] Frame 15: No persons
[2026-03-09 14:30:22] [Stream 2] Frame 15: 1 person(s) detected
[Stream 2]   Person entered frame! Entry #1
[Stream 2]   Detected 1 person(s) with boxes: [(450, 120, 180, 420)]
[Stream 2]   Started recording clip: output/stream_2/person_clip_1_20260309_143022_1741528222.mp4

[2026-03-09 14:30:23] [Stream 3] Frame 15: No persons
[2026-03-09 14:30:23] [Stream 4] Frame 15: 2 person(s) detected
[Stream 4]   Person entered frame! Entry #1
[Stream 4]   Detected 2 person(s) with boxes: [(200, 100, 150, 380), (800, 150, 160, 400)]
[Stream 4]   Saved snapshot: output/stream_4/person_entry_1_20260309_143023_1741528223.jpg

^C
Stopping all streams...

[Stream 1] Stopping detection...
[Stream 1] Processed 120 frames, captured 0 person clip(s)
[Stream 2] Stopping detection...
[Stream 2] Saved in-progress clip: output/stream_2/person_clip_1_20260309_143022_1741528222.mp4
[Stream 2] Processed 125 frames, captured 1 person clip(s)
[Stream 3] Stopping detection...
[Stream 3] Processed 118 frames, captured 0 person clip(s)
[Stream 4] Stopping detection...
[Stream 4] Processed 122 frames, captured 1 person snapshot(s)
Key indicators:
  • [Stream N] prefix identifies which stream each message is from
  • Thread start messages confirm all streams launched
  • Connection messages show parallel connection attempts
  • Detection events include stream ID and bounding box coordinates
  • Final summary shows per-stream statistics

Real-World Examples

Example 1: Retail Store (4 Cameras)

Setup:
  • Front entrance
  • Back entrance
  • Checkout area
  • Stock room
cameras.txt
# Front entrance
rtsp://192.168.1.10/stream
# Back entrance
rtsp://192.168.1.11/stream
# Checkout
rtsp://192.168.1.12/stream
# Stock room
rtsp://192.168.1.13/stream
python main.py --rtsp-file cameras.txt \
  --save image \
  --confidence 0.6 \
  --area-threshold 2000 \
  --display
Result:
  • Grid display shows all 4 cameras
  • Snapshot saved when person enters each area
  • Higher thresholds reduce false positives

Example 2: Warehouse (12 Cameras)

Setup:
  • Loading docks (4)
  • Main aisles (6)
  • Offices (2)
python main.py --rtsp-file warehouse_cams.txt \
  --save video \
  --frame-skip 30 \
  --confidence 0.55
# No --display for performance
Optimization:
  • No display (12 streams = too many for useful grid)
  • Higher frame_skip (30 = 1 fps) for performance
  • Video mode captures full activity
  • Run on server with GPU

Example 3: Office Building (8 Cameras)

Setup:
  • Lobby
  • Elevator banks (3)
  • Conference rooms (2)
  • Server room
  • Parking garage
python main.py --rtsp-file office.txt \
  --save video \
  --confidence 0.5 \
  --frame-skip 15 \
  --display
Configuration:
  • Balanced settings
  • Video clips for security review
  • Grid display for monitoring
  • Standard detection frequency

Troubleshooting

Issue: One bad URL causes problems
Solution: Threads are independent. A failing stream won't crash the others, but it will keep retrying:
[Stream 3] Failed to read frame (attempt 1/5), reconnecting...
[Stream 3] Failed to read frame (attempt 2/5), reconnecting...
[Stream 3] Max reconnect attempts reached. Giving up.
Comment out the bad URL in your file:
# rtsp://broken-camera.local/stream
Issue: Some tiles are frozen or black
Possible causes:
  • Stream connection issue
  • Thread crashed
  • Very slow inference
Debug: Check the console for [Stream N] messages. A stream that produces no messages is the one with issues.
Issue: Long delays between detections
Cause: The thread-safe inference lock is the bottleneck.
Solutions:
  1. Enable GPU acceleration:
    # Reduces inference from ~100ms to ~10ms
    
    See GPU Acceleration
  2. Increase frame_skip:
    --frame-skip 30  # Reduce detection frequency
    
  3. Reduce the number of streams: split across multiple instances
Issue: Out of memory
Error:
MemoryError: Unable to allocate array
Solutions:
  1. Reduce number of streams
  2. Disable display: --display adds overhead
  3. Check available RAM:
    free -h
    
  4. Use lower resolution streams (configure at camera)
Issue: 'q' key doesn't stop processing
Cause: The display window must have keyboard focus.
Solution:
  • Click on the grid window first
  • Then press ‘q’
  • Or use Ctrl+C in terminal

Best Practices

Use URL Files

Easier to manage, edit, and version control than command-line lists.

Test Streams First

Verify each RTSP URL works with VLC before adding to multi-stream setup.

Start Small, Scale Up

Test with 2-4 streams first, then add more as you tune performance.

Monitor System Resources

Use htop or similar to watch CPU, RAM, and network usage.

Enable GPU for >4 Streams

GPU acceleration is essential for processing many streams efficiently.

Label Your Streams

Use comments in URL file to document which camera is which.

Next Steps

GPU Acceleration

Essential for multi-stream performance

Configuration Tuning

Optimize settings for your camera setup

Single Stream Guide

Understand single-stream processing

Model Setup

Configure detection models

Build docs developers (and LLMs) love