What are DAT Files?
DAT files are XML or text-based databases that catalog ROM collections. They contain metadata about every ROM in a set, including:
- Game names and descriptions
- File hashes (CRC32, MD5, SHA-1) for verification
- File sizes to validate completeness
- Region information (USA, Europe, Japan, etc.)
- Version details and release dates
- Relationships between ROMs (clones, parents, BIOS files)
Why Use DAT Files?
Verification
Verify your ROM dumps are accurate and complete by comparing hashes
Organization
Automatically organize thousands of ROMs into a logical structure
Completeness
Identify missing ROMs from your collection
Updates
Track new releases and updated versions of ROMs
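Verification comes down to recomputing a file's size and hashes and comparing them with the values recorded in the DAT. The sketch below shows one way to do that with the Python standard library. The function names and the dictionary layout of `entry` are illustrative assumptions, not part of Datoso.

```python
import hashlib
import zlib
from pathlib import Path


def hash_file(path: Path, chunk_size: int = 1 << 20) -> dict:
    """Return size, CRC32, MD5 and SHA-1 of a file, reading it in chunks."""
    crc = 0
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    size = 0
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            crc = zlib.crc32(chunk, crc)  # running CRC32 across chunks
            md5.update(chunk)
            sha1.update(chunk)
            size += len(chunk)
    return {
        "size": size,
        "crc": f"{crc:08x}",
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
    }


def matches_entry(path: Path, entry: dict) -> bool:
    """Compare a file with a DAT entry (hypothetical dict layout)."""
    if path.stat().st_size != int(entry["size"]):
        return False  # cheap size check before any hashing
    actual = hash_file(path)
    # Only compare the hash types the entry actually provides.
    return all(
        actual[key] == entry[key].lower()
        for key in ("crc", "md5", "sha1")
        if key in entry
    )
```

Checking the size first avoids hashing large files that cannot match.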
DAT File Formats
Datoso supports three main DAT file formats:
XML Format
The most common format, used by Redump, No-Intro, and many others.
- Human-readable XML structure
- Comprehensive metadata support
- Supports multiple ROMs per game
- Compatible with ROMVault and ClrMamePro
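As an illustration of the XML layout, the sketch below reads a small DAT with `xml.etree.ElementTree`. The sample document is made up for this example and assumes the common `datafile` / `game` / `rom` element layout. The `read_dat` helper is hypothetical and is not Datoso's parser.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the ROM file name is illustrative.
SAMPLE_DAT = """<?xml version="1.0"?>
<datafile>
  <header>
    <name>Sony - PlayStation 2</name>
  </header>
  <game name="Gran Turismo 4 (USA)">
    <description>Gran Turismo 4 (USA)</description>
    <rom name="Gran Turismo 4 (USA).iso" size="4699979776" crc="a4e8b3e2"/>
  </game>
</datafile>
"""


def read_dat(xml_text: str) -> list[dict]:
    """Return one dict per game, each with its list of ROM attributes."""
    root = ET.fromstring(xml_text)
    games = []
    for game in root.iter("game"):
        games.append({
            "name": game.get("name"),
            "description": game.findtext("description", default=""),
            # A game may carry several <rom> children.
            "roms": [dict(rom.attrib) for rom in game.iter("rom")],
        })
    return games


games = read_dat(SAMPLE_DAT)
```

Attribute values come back as strings, so a caller would convert `size` to an integer before comparing it with a file on disk.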
ClrMamePro Format
Text-based format used by older tools and some arcade sets.
- Compact text format
- Key-value pairs in parentheses
- Multiple ROMs per game supported
- Used extensively in arcade preservation
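To make the "key-value pairs in parentheses" idea concrete, here is a minimal sketch that parses a single `rom ( ... )` entry. The sample line is invented, and `parse_rom_line` is a hypothetical helper. A real parser would also need to handle nested `game ( ... )` blocks and multi-line entries.

```python
import re
import shlex

# Invented example of a single-line rom entry.
ROM_LINE = 'rom ( name "Example Game (USA).bin" size 1024 crc a4e8b3e2 )'


def parse_rom_line(line: str) -> dict:
    """Split one `rom ( key value ... )` entry into a dict of strings."""
    match = re.fullmatch(r"\s*rom\s*\(\s*(.*?)\s*\)\s*", line)
    if match is None:
        raise ValueError(f"not a rom entry: {line!r}")
    # shlex keeps quoted values (which may contain spaces) together.
    tokens = shlex.split(match.group(1))
    # Tokens alternate key, value, key, value, ...
    return dict(zip(tokens[0::2], tokens[1::2]))


rom = parse_rom_line(ROM_LINE)
```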
DOSCenter Format
A variant of the ClrMamePro format used by Total DOS Collection.
- Similar to ClrMamePro with colon separators
- Optimized for DOS software
- Single-file game entries common
ROM Metadata Fields
Understanding the metadata in DAT files:
name
The unique identifier for the game or ROM file. Often includes region and version information.
Example: Gran Turismo 4 (USA)
description
Human-readable name of the game; may be more detailed than the name field.
Example: Gran Turismo 4 (USA) (NTSC)
crc / crc32
32-bit Cyclic Redundancy Check value for quick verification.
Example: a4e8b3e2
md5
128-bit MD5 hash for file verification.
Example: d41d8cd98f00b204e9800998ecf8427e
sha1
160-bit SHA-1 hash, the strongest of the three checks.
Example: da39a3ee5e6b4b0d3255bfef95601890afd80709
size
File size in bytes, used for quick validation before hashing.
Example: 4699979776 (about 4.38 GiB)
mia (optional)
“Missing In Action” flag indicating the ROM is known but unavailable.
Example: mia="yes"
How Datoso Organizes ROMs
Datoso transforms flat DAT collections into a hierarchical folder structure optimized for emulators.
Default Folder Structure
Organization Rules
DAT files are organized by:
- Company: Manufacturer or platform owner (Nintendo, Sony, Sega, etc.)
- System: Specific platform or console (PlayStation 2, Game Boy Advance, etc.)
- Modifier: Additional categorization (Aftermarket, Translated, BIOS, etc.)
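The three levels above can be pictured as nested path components. The sketch below assumes they nest in the order listed (company, then system, then modifier) and that missing levels are skipped. `build_dat_path` is a hypothetical helper for illustration, not Datoso's path generation code.

```python
from pathlib import PurePosixPath
from typing import Optional


def build_dat_path(
    base: str,
    company: Optional[str],
    system: str,
    modifier: Optional[str] = None,
) -> PurePosixPath:
    """Join the non-empty levels under a base folder (assumed ordering)."""
    parts = [part for part in (company, system, modifier) if part]
    return PurePosixPath(base, *parts)


bios_path = build_dat_path("dats", "Sony", "PlayStation 2", "BIOS")
plain_path = build_dat_path("dats", "Sony", "PlayStation 2")
```

`PurePosixPath` is used here only so the result looks the same on every operating system.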
Path Generation Logic
From src/datoso/repositories/dat_file.py:104:
System Detection
Datoso uses seed-specific rules to automatically detect systems from DAT file names.
Detection Process
- Parse DAT filename or header: Extract name like “Sony - PlayStation 2”
- Apply regex patterns: Match against known system patterns
- Lookup in systems.json: Find system metadata and overrides
- Apply seed rules: Use seed-specific detection logic
- Generate folder path: Create organized directory structure
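The first three steps can be sketched as a pattern match followed by a table lookup. Everything named here is an assumption made for illustration: `KNOWN_SYSTEMS` stands in for the contents of systems.json, whose real schema is not shown on this page, and `NAME_PATTERN` is an invented regex rather than one of Datoso's seed rules.

```python
import re
from typing import Optional

# Hypothetical stand-in for the systems.json lookup table.
KNOWN_SYSTEMS = {
    "Sony - PlayStation 2": {"company": "Sony", "system": "PlayStation 2"},
}

# Invented pattern: a system name with an optional "(Modifier)" suffix.
NAME_PATTERN = re.compile(
    r"^(?P<system>[^()]+?)(?:\s*\((?P<suffix>[^()]*)\))?\s*$"
)


def detect_system(dat_name: str) -> Optional[dict]:
    """Return company, system and modifier for a DAT name, or None."""
    match = NAME_PATTERN.match(dat_name)
    if match is None:
        return None
    entry = KNOWN_SYSTEMS.get(match.group("system").strip())
    if entry is None:
        return None  # name parsed, but the system is not in the table
    return {**entry, "modifier": match.group("suffix")}


detected = detect_system("Sony - PlayStation 2 (BIOS)")
```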
System Types
Systems are classified by type in src/datoso/systems.json:
- Console: Home gaming consoles (PS2, Xbox, Switch)
- Handheld: Portable gaming devices (Game Boy, PSP, 3DS)
- Computer: Home computers (Commodore 64, Amiga, PC)
- Arcade: Arcade machines (MAME, FBNeo)
- Other: Miscellaneous systems (V.Tech, Pinball, etc.)
System Overrides
The database supports overriding system properties:
Modifiers and Suffixes
Modifiers categorize special ROM variants. They are grouped into Common Modifiers, Enhancement Modifiers, and Source Types. Common modifiers include:
- Aftermarket - Unlicensed: Homebrew and unlicensed games
- Applications: Non-game software
- Audio: Soundtracks and audio CDs
- BIOS: System BIOS files
- Bonus Discs: Pack-in and promotional discs
- Coverdiscs: Magazine coverdiscs
- Demos: Demo and trial versions
- Educational: Educational software
- Multimedia: Multimedia applications
- Preproduction: Beta and prototype versions
- Promotional: Promotional releases
- Video: Video content
DAT Processing Workflow
When Datoso processes a DAT file:
Deduplication
Datoso can deduplicate ROMs between parent and child DATs to save space and reduce redundancy.
Parent-Child Relationships
Define a parent DAT for deduplication:
How It Works
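Conceptually, parent-child deduplication removes from the child DAT every ROM that already appears in the parent. The sketch below is one plausible way to express that idea and is not taken from Datoso's source. It assumes games are plain dicts with a `roms` list and that every ROM carries a `sha1` value.

```python
def dedupe_child(parent_games: list[dict], child_games: list[dict]) -> list[dict]:
    """Drop child ROMs whose SHA-1 already exists in the parent DAT."""
    parent_hashes = {
        rom["sha1"] for game in parent_games for rom in game["roms"]
    }
    result = []
    for game in child_games:
        kept = [rom for rom in game["roms"] if rom["sha1"] not in parent_hashes]
        if kept:  # a game with no remaining ROMs is dropped entirely
            result.append({**game, "roms": kept})
    return result


parent = [{"name": "Game A", "roms": [{"name": "a.bin", "sha1": "aa"}]}]
child = [
    {"name": "Game A (Demo)", "roms": [{"name": "a.bin", "sha1": "aa"}]},
    {"name": "Game B (Demo)", "roms": [{"name": "b.bin", "sha1": "bb"}]},
]
deduped = dedupe_child(parent, child)
```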
From src/datoso/repositories/dat_file.py:266:
AutoMerge
AutoMerge automatically deduplicates within a single DAT:
MIA (Missing In Action) ROMs
Some ROMs are documented but unavailable for preservation. Datoso can mark these as MIA.
Enabling MIA Processing
MIA Database
MIA records are stored in the database:
How MIA Marking Works
From src/datoso/repositories/dat_file.py:230:
ROMs that match an MIA record are written with mia="yes" in the output DAT file.
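As a rough illustration of the marking step, the sketch below sets `mia="yes"` on every `rom` element of an XML DAT whose SHA-1 appears in a set of known MIA hashes. Matching on SHA-1 and the `mark_mia` helper are assumptions for this example. How Datoso matches MIA records is not shown on this page.

```python
import xml.etree.ElementTree as ET


def mark_mia(xml_text: str, mia_sha1s: set[str]) -> str:
    """Return the DAT with mia="yes" set on ROMs listed in mia_sha1s.

    mia_sha1s is assumed to contain lowercase hex digests.
    """
    root = ET.fromstring(xml_text)
    for rom in root.iter("rom"):
        if rom.get("sha1", "").lower() in mia_sha1s:
            rom.set("mia", "yes")
    return ET.tostring(root, encoding="unicode")


# Invented two-ROM DAT; the hashes are placeholders.
SAMPLE = (
    "<datafile>"
    '<game name="Game A"><rom name="a.bin" sha1="AA11"/></game>'
    '<game name="Game B"><rom name="b.bin" sha1="bb22"/></game>'
    "</datafile>"
)
marked = mark_mia(SAMPLE, {"aa11"})
```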
DAT File Metadata in Database
Every processed DAT is stored in TinyDB with metadata:
Querying DAT Metadata
Modifying DAT Properties
Static Paths
Override automatic path generation with a static path:
When static_path is set, Datoso uses it instead of the generated path.
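The override rule amounts to a simple fallback. In this sketch the DAT record is assumed to be a plain dict with an optional `static_path` key, and `resolve_dat_path` is a hypothetical helper, not a Datoso function.

```python
def resolve_dat_path(record: dict, generated_path: str) -> str:
    """Prefer the record's static_path; fall back to the generated path."""
    static_path = record.get("static_path")
    # An empty or missing static_path means "use automatic generation".
    return static_path if static_path else generated_path


overridden = resolve_dat_path({"static_path": "custom/PS2"}, "Sony/PlayStation 2")
automatic = resolve_dat_path({}, "Sony/PlayStation 2")
```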
Importing Existing DATs
Import DAT files you already have organized. The import:
- Scans DatPath for all .dat files recursively
- Attempts to detect which seed each DAT belongs to
- Parses each DAT file to extract metadata
- Stores metadata in the database with current file path
- Allows future updates without re-downloading
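The recursive scan in the first step can be approximated with `pathlib`. This is a sketch of the general technique only. `find_dat_files` is a hypothetical helper, and the case-insensitive match on the `.dat` extension is an assumption.

```python
from pathlib import Path


def find_dat_files(dat_path: Path) -> list[Path]:
    """Return every .dat file under dat_path, searched recursively."""
    return sorted(
        path
        for path in dat_path.rglob("*")
        if path.is_file() and path.suffix.lower() == ".dat"
    )
```

Each path returned would then be parsed and stored alongside its current location.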
Import Configuration
Working with ROM Collections
Verification Workflow
- Download DATs: Fetch the latest DATs from sources
- Process DATs: Organize into folder structure
- Use ROM Manager: Import DATs into ROMVault or ClrMamePro
- Verify ROMs: Use the ROM manager to check your collection against DATs
- Update: Re-fetch and process periodically to get updates
Folder Organization Tips
Best Practices
Regular Updates
Update DATs monthly or quarterly to stay current:
Backup Before Processing
Keep backups of your organized DATs before processing updates:
Use Filters
Process only what you need to save time:
Enable Deduplication
Save space by deduplicating demo and bonus disc DATs:
Next Steps
- Configuration - Configure DAT paths and processing options
- Commands Overview - Complete command-line reference
- Understanding Datoso - Deep dive into architecture