
Overview

Media dumping extracts all WhatsApp media files from the device’s storage, preserving the directory structure for proper organization and HTML export compatibility. This includes images, videos, voice messages, audio files, documents, and other attachments.

Media Extraction Workflow

Media extraction is integrated into the backup dumping process:
1. User Confirmation: The user chooses whether to include media during the backup dump.
2. Media Folder Discovery: The tool locates the WhatsApp Media directory for each user/package.
3. File Counting: The tool counts the total files for progress tracking.
4. Recursive Copy: The tool pulls all files with progress indication.
5. Structure Preservation: The original directory hierarchy is maintained.

Triggering Media Extraction

During backup dumping in main.py:299:
mode = ui.ask("Include Media? (y/n)", default="n").lower()
include_media = mode == 'y'

if include_media:
    ui.print_info("Scanning for Media folders...")
    processed_media = set()
    for item in items_to_dump:
        key = f"{item['user']}_{item['pkg']}"
        if key in processed_media: continue
        
        media_path = self.device_manager.find_media(
            self.selected_device,
            item['user'],
            item['pkg']
        )
        if media_path:
            ui.print_info(f"Dumping Media for {item['pkg']}...")
            local_dir = os.path.join(
                "backups",
                self.selected_device,
                f"user_{item['user']}",
                item['type'],
                "Media"
            )
            
            if not self.device_manager.dump_media_with_progress(
                self.selected_device,
                media_path,
                local_dir
            ):
                ui.print_warning(f"Media dump incomplete for {item['pkg']}")
        processed_media.add(key)
Media dumping can take 30 minutes or more for users with extensive media libraries (10,000+ files).

WhatsApp Media Folder Structure

WhatsApp organizes media into category-specific subdirectories:

Standard Media Paths

WhatsApp Messenger:
/storage/emulated/{user_id}/WhatsApp/Media/
├── WhatsApp Images/
├── WhatsApp Video/
├── WhatsApp Audio/
├── WhatsApp Voice Notes/
├── WhatsApp Documents/
├── WhatsApp Animated Gifs/
├── WhatsApp Stickers/
└── WhatsApp Profile Photos/
WhatsApp Business:
/storage/emulated/{user_id}/WhatsApp Business/Media/
└── (same structure as above)

Android 11+ Media Paths

/storage/emulated/{user_id}/Android/media/com.whatsapp/WhatsApp/Media/
/storage/emulated/{user_id}/Android/media/com.whatsapp.w4b/WhatsApp Business/Media/
Android 11 introduced scoped storage, moving app media to the Android/media/ directory.

Media Discovery

The find_media() method in core/device_manager.py:215 locates media folders:
def find_media(self, device_id: str, user_id: str, package: str) -> str:
    base = f"/storage/emulated/{user_id}/"
    paths = []
    if package == 'com.whatsapp':
        paths = [
            base + "WhatsApp/Media",
            base + "Android/media/com.whatsapp/WhatsApp/Media"
        ]
    elif package == 'com.whatsapp.w4b':
        paths = [
            base + "WhatsApp Business/Media",
            base + "Android/media/com.whatsapp.w4b/WhatsApp Business/Media"
        ]
        
    for path in paths:
        code, stdout, stderr = self.run_command(
            ['-s', device_id, 'shell', 'ls', '-d', f'"{path}"']
        )
        if code == 0 and "No such file" not in stdout and "No such file" not in stderr:
            return path
    return ""
The tool checks the legacy path first, then the Android 11+ path, and returns the first location that exists. An empty string means no media folder was found for that package.
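The lookup order can be exercised without a device. The sketch below is not the tool's code: `candidate_media_paths` and `first_existing` are hypothetical helpers, and the `exists` callback stands in for the `ls -d` check that find_media() runs over ADB.

```python
from typing import Callable, List

# Hypothetical mapping from package name to its top-level folder name.
_FOLDERS = {
    "com.whatsapp": "WhatsApp",
    "com.whatsapp.w4b": "WhatsApp Business",
}

def candidate_media_paths(user_id: str, package: str) -> List[str]:
    """Return the legacy path first, then the Android 11+ path."""
    folder = _FOLDERS.get(package)
    if folder is None:
        return []  # unknown package: nothing to probe
    base = f"/storage/emulated/{user_id}/"
    return [
        base + f"{folder}/Media",
        base + f"Android/media/{package}/{folder}/Media",
    ]

def first_existing(paths: List[str], exists: Callable[[str], bool]) -> str:
    """Return the first path the callback accepts, or "" if none match."""
    for path in paths:
        if exists(path):
            return path
    return ""
```

Passing the existence check as a callback keeps the ordering logic testable; in a real run it would wrap the ADB call.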

dump_media_with_progress() Implementation

The core extraction method in core/device_manager.py:254:

File Counting

def dump_media_with_progress(self, device_id: str, remote_path: str, local_dir: str):
    from tqdm import tqdm
    print_info("Counting files...", verbose=True)
    
    full_cmd = f'{self.adb_executable} -s {device_id} shell "find \'{remote_path}\' -type f | wc -l"'
    
    try:
        process = subprocess.run(
            full_cmd,
            shell=True,
            capture_output=True,
            text=True
        )
        total_files = int(process.stdout.strip()) if process.returncode == 0 else 0
    except:
        total_files = 0
    
    print_info(f"Total files: {total_files}")
Using find -type f | wc -l provides an accurate file count for progress tracking. If the command fails or its output cannot be parsed as an integer, total_files falls back to 0.
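The fallback behaviour can be isolated into a small parser. This is a sketch, not the tool's implementation: `parse_file_count` is a hypothetical helper, and it catches only `ValueError` where the snippet above uses a bare `except`.

```python
def parse_file_count(stdout: str) -> int:
    """Parse the output of `find ... | wc -l`; return 0 if it is not a number."""
    # Keep only non-empty lines so stray blank lines or CRLF endings do not matter.
    lines = [ln.strip() for ln in stdout.splitlines() if ln.strip()]
    if not lines:
        return 0
    try:
        # The count is expected on the last line of output.
        return max(int(lines[-1]), 0)
    except ValueError:
        # Anything that is not an integer (for example an error message) counts as 0.
        return 0
```

A count of 0 only affects the progress display; the pull itself still runs.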

Recursive Pull with Progress Bar

if not os.path.exists(local_dir):
    os.makedirs(local_dir)

cmd_pull = [self.adb_executable, '-s', device_id, 'pull', f"{remote_path}/.", local_dir]
try:
    process = subprocess.Popen(
        cmd_pull,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        encoding='utf-8',
        errors='replace'
    )
    pbar = tqdm(total=total_files, unit="file", desc="Pulling Media")
    for line in process.stdout:
        line = line.strip()
        if line.startswith("[") or "->" in line or ("/" in line and ":" not in line):
            pbar.update(1)
    process.wait()
    pbar.close()
    return process.returncode == 0
except Exception as e:
    print_error(f"Media dump failed: {e}")
    return False
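The progress bar advances once for each ADB output line that looks like a file transfer: a line that starts with "[", contains "->", or contains "/" but no ":". Because this is a heuristic match on output text, the displayed progress is approximate.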
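The matching rule from the loop above can be pulled into a function and tried against sample lines. The function names are hypothetical, and the sample strings in the usage note are made up for illustration; they are not captured ADB output.

```python
from typing import Iterable

def looks_like_transfer_line(line: str) -> bool:
    """Apply the same three-way test the pull loop uses on each output line."""
    line = line.strip()
    return (
        line.startswith("[")
        or "->" in line
        or ("/" in line and ":" not in line)
    )

def count_transfer_lines(lines: Iterable[str]) -> int:
    """Count how many lines would advance the progress bar."""
    return sum(1 for line in lines if looks_like_transfer_line(line))
```

For example, a line such as `"[ 50%] /storage/emulated/0/WhatsApp/Media/a.jpg"` matches, while `"adb: error: device offline"` does not, because it has no leading "[", no "->", and no "/".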

Media Types Extracted

Images

JPEG, PNG, WebP photos from chats and status

Videos

MP4, 3GP video messages and status videos

Audio

MP3, AAC, OGG audio files shared in chats

Voice Notes

Opus-encoded voice messages

Documents

PDF, DOC, XLS, ZIP and other file types

GIFs

Animated GIF files

Stickers

WebP sticker files

Profile Photos

Contact and group profile pictures

Directory Structure Preservation

The tool maintains WhatsApp’s original structure:
backups/{device_serial}/user_{user_id}/{type}/Media/
├── WhatsApp Images/
│   ├── IMG-20240301-WA0001.jpg
│   ├── IMG-20240301-WA0002.jpg
│   └── Sent/
│       └── IMG-20240301-WA0003.jpg
├── WhatsApp Video/
│   ├── VID-20240301-WA0001.mp4
│   └── Sent/
├── WhatsApp Voice Notes/
│   └── PTT-20240301-WA0001.opus
├── WhatsApp Audio/
│   └── AUD-20240301-WA0001.mp3
├── WhatsApp Documents/
│   └── DOC-20240301-WA0001.pdf
└── WhatsApp Profile Photos/
    └── ...
Preserving the structure ensures media links in HTML exports work correctly.

Media Linking to Chat Exports

HTML chat exports reference media using relative paths:
backups/{device}/user_0/messenger/
├── exports/
│   └── exported_chats.html  → References ../Media/...
└── Media/
    └── WhatsApp Images/
        └── IMG-20240301-WA0001.jpg

HTML Media Embed Code

From core/viewer.py:666:
if media_type.startswith('image/'):
    media_html = f'<div class="media-container"><a href="{rel_path}" target="_blank"><img src="{rel_path}" alt="Image" loading="lazy"></a></div>'
elif media_type.startswith('video/'):
    media_html = f'<div class="media-container"><video controls><source src="{rel_path}" type="{media_type}"></video></div>'
elif media_type.startswith('audio/'):
    media_html = f'<div class="media-container"><audio controls><source src="{rel_path}" type="{media_type}"></audio></div>'
The relative path ../Media/WhatsApp Images/... allows the HTML file to locate media regardless of where the backup folder is moved, as long as the exports and Media folders stay side by side.
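How such a relative path can be derived is sketched below. `media_rel_path` is a hypothetical helper, not a function from core/viewer.py, and the percent-encoding step in the usage note is an optional assumption rather than something the viewer is known to do.

```python
import posixpath
from urllib.parse import quote

def media_rel_path(export_dir: str, media_file: str) -> str:
    """Return the path of media_file relative to the folder holding the HTML export."""
    # posixpath keeps forward slashes, which is what an HTML src/href expects.
    return posixpath.relpath(media_file, start=export_dir)
```

Usage, with the layout shown above:

```python
rel = media_rel_path(
    "backups/dev/user_0/messenger/exports",
    "backups/dev/user_0/messenger/Media/WhatsApp Images/IMG-20240301-WA0001.jpg",
)
# rel == "../Media/WhatsApp Images/IMG-20240301-WA0001.jpg"
encoded = quote(rel)  # spaces become %20 for stricter HTML consumers
```

The result depends only on how the two folders sit relative to each other, which is why moving the whole backup tree does not break the links.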

Storage Requirements

Media extraction can require substantial disk space:

  • Light User: 500 MB - 2 GB (few photos and videos)
  • Average User: 2 GB - 10 GB (regular media sharing)
  • Heavy User: 10 GB - 50 GB+ (extensive video and photo collection)
Ensure sufficient free disk space before initiating media extraction. The tool does not check available space beforehand.
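Since the tool does not perform this check, a manual pre-check is one option. The sketch below uses the standard library's `shutil.disk_usage`; `has_enough_space` and its 10% safety margin are assumptions for illustration, and how `required_bytes` is obtained (for example from a size estimate of the remote folder) is left open.

```python
import shutil

def has_enough_space(target_dir: str, required_bytes: int,
                     safety_margin: float = 0.10) -> bool:
    """Return True if target_dir's filesystem has room for required_bytes plus a margin."""
    # disk_usage needs an existing path; it reports total, used and free bytes.
    free = shutil.disk_usage(target_dir).free
    return free >= required_bytes * (1 + safety_margin)
```

Usage:

```python
ten_gb = 10 * 1024**3
if not has_enough_space(".", ten_gb):
    print("Not enough free space for an estimated 10 GB media dump")
```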

Deduplication Handling

The tool prevents duplicate media extraction per user/package:
processed_media = set()
for item in items_to_dump:
    key = f"{item['user']}_{item['pkg']}"
    if key in processed_media:
        continue
    
    # Extract media...
    
    processed_media.add(key)
This ensures media is only downloaded once even when multiple database backups are selected from the same user/package.
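The effect is easy to see with sample data. The sketch below is a standalone variant, not the code in main.py: `unique_media_targets` is a hypothetical helper, the item dictionaries are invented, and it uses a tuple key instead of the formatted string.

```python
from typing import Dict, List, Tuple

def unique_media_targets(items: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return each (user, package) pair once, in first-seen order."""
    seen = set()
    targets = []
    for item in items:
        key = (item["user"], item["pkg"])
        if key in seen:
            continue  # media for this user/package was already scheduled
        seen.add(key)
        targets.append(key)
    return targets
```

Two database backups from user 0 of `com.whatsapp` therefore produce a single media pull, while the same package under a different user is treated as a separate target.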

Performance Considerations

Transfer Speed Factors

  • USB 2.0: ~10-30 MB/s (slower)
  • USB 3.0: ~50-100 MB/s (recommended)
  • File Size: Many small files are slower than fewer large files
  • Device Performance: Older devices may have slower storage access
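These figures translate into a rough lower bound on transfer time. The helper below is a back-of-the-envelope sketch with a hypothetical name; it assumes sustained throughput and ignores per-file overhead, so libraries with many small files will take longer than it suggests.

```python
def estimate_transfer_minutes(size_gb: float, speed_mb_per_s: float) -> float:
    """Estimate minutes to copy size_gb at a sustained speed in MB/s."""
    if speed_mb_per_s <= 0:
        raise ValueError("speed_mb_per_s must be positive")
    size_mb = size_gb * 1024  # 1 GB taken as 1024 MB
    return size_mb / speed_mb_per_s / 60
```

For example, 10 GB at 10 MB/s is 10240 MB / 10 MB/s = 1024 s, or about 17 minutes; at 100 MB/s the same library takes under 2 minutes.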

Optimization Tips

1. Use USB 3.0: Connect via a USB 3.0 port and cable for maximum speed.
2. Disable MTP: Close file explorer windows that are accessing the device.
3. Skip Media Initially: Extract databases first, then media separately if needed.
4. Avoid Interruptions: Ensure the device stays connected and the screen doesn't lock.

Incomplete Extraction Handling

The method returns a boolean indicating success:
if not self.device_manager.dump_media_with_progress(
    self.selected_device,
    media_path,
    local_dir
):
    ui.print_warning(f"Media dump incomplete for {item['pkg']}")
Incomplete extraction may occur due to:
  • Device disconnection
  • Insufficient permissions for certain files
  • Storage space exhaustion
  • User cancellation (Ctrl+C)
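After a warning, one way to gauge how much is missing is to compare the number of local files with the count taken before the pull. This is a sketch with hypothetical helper names; the tool itself only reports the boolean result.

```python
import os

def count_local_files(local_dir: str) -> int:
    """Count regular files under local_dir, recursing into subdirectories."""
    # os.walk yields nothing for a missing directory, so the result is 0 in that case.
    return sum(len(files) for _root, _dirs, files in os.walk(local_dir))

def missing_file_count(expected_total: int, local_dir: str) -> int:
    """Return how many files appear to be missing locally (never negative)."""
    return max(expected_total - count_local_files(local_dir), 0)
```

A non-zero result suggests re-running the dump for that package once the underlying cause has been addressed.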

Permissions and Access

Required Permissions

Read Storage

ADB must have access to storage paths

USB Debugging

Must remain enabled during extraction

Screen Unlocked

Some devices require screen to stay unlocked

No Password Prompt

File access shouldn’t require additional authorization

Android 11+ Considerations

# Tool checks both paths automatically
paths = [
    base + "WhatsApp/Media",  # Legacy
    base + "Android/media/com.whatsapp/WhatsApp/Media"  # Android 11+
]
On Android 11+, WhatsApp must have “All files access” permission for the legacy path to be readable.

Media File Naming Convention

WhatsApp uses standardized naming:
  • Images: IMG-YYYYMMDD-WA####.jpg
  • Videos: VID-YYYYMMDD-WA####.mp4
  • Audio: AUD-YYYYMMDD-WA####.mp3
  • Voice: PTT-YYYYMMDD-WA####.opus
  • Documents: DOC-YYYYMMDD-WA####.{ext}
Where:
  • YYYYMMDD: Date received/sent
  • WA####: Sequence number assigned by WhatsApp
  • {ext}: Original file extension
Files are automatically organized by type through subdirectories, making manual sorting unnecessary.
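The convention lends itself to a regular expression. The parser below is an illustrative sketch: the names are hypothetical, it covers only the five prefixes listed above, and it accepts sequence numbers longer than four digits as a precaution.

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional

# Prefix, 8-digit date, "WA" + sequence digits, then the extension.
_WA_NAME = re.compile(r"^(IMG|VID|AUD|PTT|DOC)-(\d{8})-WA(\d{4,})\.(\w+)$")

class WaMediaName(NamedTuple):
    kind: str       # IMG, VID, AUD, PTT or DOC
    day: date       # date parsed from YYYYMMDD
    sequence: int   # number following "WA"
    ext: str        # lower-cased file extension

def parse_media_name(filename: str) -> Optional[WaMediaName]:
    """Parse a WhatsApp-style media filename; return None if it does not match."""
    match = _WA_NAME.match(filename)
    if match is None:
        return None
    kind, ymd, seq, ext = match.groups()
    try:
        day = datetime.strptime(ymd, "%Y%m%d").date()
    except ValueError:
        return None  # eight digits that are not a real calendar date
    return WaMediaName(kind, day, int(seq), ext.lower())
```

Files that do not follow the pattern, such as documents that keep their original names, simply return `None`.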

Troubleshooting

No media folder found:
  • Ensure WhatsApp has been used to send/receive media
  • Check that storage permissions are granted to WhatsApp
  • Media may be stored on an SD card (not covered by the tool, which only checks /storage/emulated/{user_id}/)
  • Try manual backup creation in WhatsApp to populate folders

Permission denied:
  • Grant “Files and media” permission to WhatsApp in Android Settings
  • On Android 11+, grant “All files access” in Special app access
  • Some manufacturers restrict ADB access to certain paths
  • Root access may be required on heavily locked devices

Slow or interrupted transfer:
  • Check USB cable quality (use official or high-quality cable)
  • Switch to a different USB port (preferably USB 3.0)
  • Disable USB selective suspend in Windows power settings
  • Restart ADB server: adb kill-server && adb start-server
  • Consider extracting smaller batches or specific folders only

Media not displaying in HTML export:
  • Verify Media folder exists alongside exports folder
  • Check that database references match actual filenames
  • Some media may have been deleted from device but references remain
  • Ensure HTML file and Media folder maintain relative structure

Manual Selective Extraction

For advanced users who need only specific media types:
# Extract only images
adb -s {device_serial} pull /storage/emulated/0/WhatsApp/Media/WhatsApp\ Images ./local_folder/

# Extract only videos
adb -s {device_serial} pull /storage/emulated/0/WhatsApp/Media/WhatsApp\ Video ./local_folder/

# Extract documents only
adb -s {device_serial} pull /storage/emulated/0/WhatsApp/Media/WhatsApp\ Documents ./local_folder/
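Manual extraction requires proper escaping of spaces and may not preserve the tool’s expected directory structure. The commands above use the legacy path for user 0; on Android 11+ devices, substitute the corresponding Android/media/com.whatsapp/WhatsApp/Media/ path.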
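The escaping concern goes away if the command is built as an argument list and run without a shell. The sketch below only constructs the list; `build_pull_command` is a hypothetical helper, and the legacy path layout and default user 0 are assumptions carried over from the commands above.

```python
from typing import List

def build_pull_command(serial: str, category: str, dest: str,
                       user_id: str = "0", adb: str = "adb") -> List[str]:
    """Build an `adb pull` argument list for one WhatsApp media category."""
    # The remote path is one list element, so spaces need no backslash escaping.
    remote = f"/storage/emulated/{user_id}/WhatsApp/Media/{category}"
    return [adb, "-s", serial, "pull", remote, dest]
```

Usage:

```python
import subprocess

cmd = build_pull_command("{device_serial}", "WhatsApp Images", "./local_folder/")
# subprocess.run(cmd, check=True)  # uncomment to run against a connected device
```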

Next Steps

Chat Export

Export chats with embedded media links

Database Viewer

View which media files are referenced in messages
