
ZZAR Mod Package Format (.zzar)

ZZAR mod packages are ZIP archives with a standardized structure:
mod_package.zzar
├── metadata.json           # Required: Mod metadata and file mappings
├── thumbnail.png           # Optional: Mod preview image
└── wem_files/              # Directory containing WEM audio files
    ├── 134133939.wem
    ├── 86631895.wem
    └── ...

metadata.json Schema

{
  "format_version": "1.0",
  "name": "Mod Name",
  "author": "Author Name",
  "version": "1.0.0",
  "description": "Mod description",
  "created_date": "2024-03-15T10:30:00",
  "zzar_version": "1.0.0",
  "thumbnail": "thumbnail.png",
  "replacements": {
    "PCK_filename.pck": {
      "file_id": {
        "wem_file": "wem_files/134133939.wem",
        "sound_name": "Sound description",
        "lang_id": 0,
        "bnk_id": 2882561007,
        "file_type": "wem"
      }
    }
  }
}
| Field | Type | Required | Description |
|---|---|---|---|
| format_version | string | Yes | ZZAR package format version (currently "1.0") |
| name | string | Yes | Mod display name |
| author | string | Yes | Mod creator name |
| version | string | Yes | Mod version (semantic versioning recommended) |
| description | string | No | Detailed mod description |
| created_date | string | No | ISO 8601 timestamp |
| zzar_version | string | No | ZZAR version used to create the package |
| thumbnail | string | No | Relative path to thumbnail image |
| replacements | object | Yes | File replacement mappings (see below) |
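As a rough illustration of how these fields fit together, the sketch below assembles a minimal package in memory with Python's standard `zipfile` module. The helper name `build_zzar`, the example mod details, and the placeholder WEM payload are assumptions made for illustration; this is not ZZAR's own packaging code.

```python
import io
import json
import zipfile


def build_zzar(name, author, version, replacements, wem_payloads):
    """Return the bytes of a .zzar archive (hypothetical helper).

    replacements: dict shaped like the "replacements" object in the schema.
    wem_payloads: dict mapping archive paths to raw WEM bytes.
    """
    metadata = {
        "format_version": "1.0",
        "name": name,
        "author": author,
        "version": version,
        "replacements": replacements,
    }
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # metadata.json sits at the archive root, encoded as UTF-8.
        zf.writestr("metadata.json", json.dumps(metadata, indent=2))
        for path, payload in wem_payloads.items():
            zf.writestr(path, payload)
    return buf.getvalue()


package = build_zzar(
    "Example Mod",
    "Someone",
    "1.0.0",
    {
        "Streamed_Music.pck": {
            "86631895": {
                "wem_file": "wem_files/86631895.wem",
                "sound_name": "Background Music",
                "lang_id": 0,
                "bnk_id": None,  # serialized as JSON null
                "file_type": "wem",
            }
        }
    },
    {"wem_files/86631895.wem": b"RIFF placeholder, not real audio"},
)
```

Only the required fields are written here; optional ones such as `description` or `thumbnail` could be added to the `metadata` dict in the same way.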

Replacements Object Structure

"replacements": {
  "SoundBank_SFX_1.pck": {           // PCK filename
    "134133939": {                    // WEM file ID (string)
      "wem_file": "wem_files/134133939.wem",
      "sound_name": "Character Voice Line",
      "lang_id": 1,                   // 0=SFX, 1=English, 2=Chinese, 3=Japanese, 4=Korean
      "bnk_id": 2882561007,           // BNK ID if file is inside a sound bank
      "file_type": "wem"              // File type (always "wem" currently)
    }
  },
  "Streamed_Music.pck": {
    "86631895": {
      "wem_file": "wem_files/86631895.wem",
      "sound_name": "Background Music",
      "lang_id": 0,
      "bnk_id": null,                 // null = streamed WEM (not in BNK)
      "file_type": "wem"
    }
  }
}
bnk_id field: When null or omitted, the WEM is a standalone streamed file. When set to a numeric ID, the WEM is embedded inside that BNK file.
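A small sketch of how a reader of metadata.json might apply this rule, treating a missing key and an explicit null the same way. The function name `split_by_storage` is hypothetical.

```python
def split_by_storage(replacements):
    """Separate streamed WEMs from WEMs embedded in a BNK (hypothetical helper)."""
    streamed, embedded = [], []
    for pck_name, files in replacements.items():
        for file_id, entry in files.items():
            # dict.get returns None both for JSON null and for an omitted key.
            bnk_id = entry.get("bnk_id")
            if bnk_id is None:
                streamed.append((pck_name, file_id))
            else:
                embedded.append((pck_name, file_id, bnk_id))
    return streamed, embedded


example = {
    "SoundBank_SFX_1.pck": {
        "134133939": {"wem_file": "wem_files/134133939.wem", "bnk_id": 2882561007}
    },
    "Streamed_Music.pck": {
        "86631895": {"wem_file": "wem_files/86631895.wem"}  # bnk_id omitted
    },
}
streamed, embedded = split_by_storage(example)
```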

Validation

ZZAR validates mod packages during installation:
# mod_package_manager.py:82-123 (simplified excerpt; zzar_path is the package being installed)
import json
import zipfile

# 1. Check if file is a valid ZIP
if not zipfile.is_zipfile(zzar_path):
    raise InvalidModPackageError("Not a valid ZIP file")

with zipfile.ZipFile(zzar_path) as zf:
    archive_names = set(zf.namelist())

    # 2. Verify metadata.json exists, then load it
    if 'metadata.json' not in archive_names:
        raise InvalidModPackageError("Missing metadata.json")
    metadata = json.loads(zf.read('metadata.json').decode('utf-8'))

# 3. Validate required fields
required_fields = ['name', 'author', 'version', 'replacements']
for field in required_fields:
    if field not in metadata:
        raise InvalidModPackageError(f"Missing required field: {field}")

# 4. Verify referenced WEM files exist in archive
for pck_name, files in metadata['replacements'].items():
    for file_id, entry in files.items():
        wem_file = entry['wem_file']
        if wem_file not in archive_names:
            raise InvalidModPackageError(f"Referenced file not found: {wem_file}")

PCK (Package) Format

PCK files are Audiokinetic’s proprietary package format for bundling Wwise audio assets.

Binary Structure

struct PCK_Header {
    char magic[4];           // "AKPK"
    uint32_t header_size;    // Total size of header in bytes
    uint32_t version;        // Format version (typically 1)
    uint32_t sec1_size;      // Language strings section size
    uint32_t sec2_size;      // Banks section size (BNK files)
    uint32_t sec3_size;      // Sounds section size (WEM files)
    uint32_t sec4_size;      // External files section size (optional)
};

struct Language_Entry {
    uint32_t string_offset;  // Offset to language name (UTF-16 LE)
    uint32_t lang_id;        // Language identifier
};

struct File_Entry {
    uint32_t file_id;        // File identifier (or uint64_t for externals)
    uint32_t blocksize;      // Offset multiplier
    uint32_t size;           // File size in bytes
    uint32_t offset_block;   // Offset = offset_block * blocksize
    uint32_t lang_id;        // Language ID
};
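The header layout above maps directly onto a Python `struct` format string. The following is a minimal sketch, assuming the seven fields are packed back to back in little-endian order as listed; `parse_pck_header` and the sample bytes are illustrative and not taken from ZZAR.

```python
import struct

# magic (4 bytes) followed by six uint32 fields, little-endian: 28 bytes total.
PCK_HEADER = struct.Struct("<4s6I")


def parse_pck_header(data):
    """Decode the fixed-size PCK header from the start of `data`."""
    magic, header_size, version, sec1, sec2, sec3, sec4 = PCK_HEADER.unpack_from(data, 0)
    if magic != b"AKPK":
        raise ValueError(f"not a PCK file: magic={magic!r}")
    return {
        "header_size": header_size,
        "version": version,
        "sec1_size": sec1,  # language strings
        "sec2_size": sec2,  # banks table
        "sec3_size": sec3,  # sounds table
        "sec4_size": sec4,  # externals table
    }


# Synthetic header for demonstration; the section sizes are made up.
sample_header = struct.pack("<4s6I", b"AKPK", 100, 1, 20, 4, 44, 4)
parsed_header = parse_pck_header(sample_header)
```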

Section Layout

[PCK Header (28 bytes)]
[Section 1: Language Strings]
  - Language count (4 bytes)
  - Language entries (8 bytes each)
  - Language name strings (UTF-16 LE, null-terminated)
[Section 2: Banks Table]
  - File count (4 bytes)
  - File entries (20 bytes each, 32-bit IDs)
[Section 3: Sounds Table]
  - File count (4 bytes)
  - File entries (20 bytes each, 32-bit IDs)
[Section 4: Externals Table] (optional)
  - File count (4 bytes)
  - File entries (24 bytes each, 64-bit IDs)
[Audio Data]
  - All audio files stored sequentially
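Each table in this layout is a count followed by fixed-size entries, so one routine can read all three. This sketch assumes the externals entry has the same field order as File_Entry with the ID widened to 64 bits, which matches the 24-byte size above but is not confirmed by the source; `parse_file_table` is a hypothetical name.

```python
import struct


def parse_file_table(section, wide_ids=False):
    """Parse one table: a uint32 count, then 20-byte (or 24-byte) entries."""
    # Assumption: externals use the same field order with a uint64 file_id.
    entry = struct.Struct("<QIIII" if wide_ids else "<5I")
    (count,) = struct.unpack_from("<I", section, 0)
    entries = []
    for i in range(count):
        file_id, blocksize, size, offset_block, lang_id = entry.unpack_from(
            section, 4 + i * entry.size
        )
        entries.append(
            {
                "file_id": file_id,
                "blocksize": blocksize,
                "size": size,
                "offset_block": offset_block,
                "lang_id": lang_id,
            }
        )
    return entries


# Synthetic sounds table with two entries (values are made up).
sample_table = struct.pack("<I", 2)
sample_table += struct.pack("<5I", 134133939, 2048, 45280, 1000, 1)
sample_table += struct.pack("<5I", 86631895, 0, 512, 4096, 0)
sounds = parse_file_table(sample_table)
```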
PCK uses a block-based offset calculation:
# pck_extractor.py:68-71
if blocksize != 0:
    offset = offset_block * blocksize
else:
    offset = offset_block  # Direct offset
This allows efficient storage of large files by aligning them to block boundaries.
Example:
  • blocksize = 2048 (2 KB blocks)
  • offset_block = 1000
  • Actual offset = 1000 * 2048 = 2,048,000 bytes
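The same arithmetic as a small helper, together with a slice that pulls one file out of an in-memory PCK. The names `resolve_offset` and `extract_file` are hypothetical; only the offset rule itself comes from the excerpt above.

```python
def resolve_offset(offset_block, blocksize):
    """Return the absolute byte offset for a file entry."""
    # A blocksize of 0 means offset_block already holds the byte offset.
    return offset_block * blocksize if blocksize != 0 else offset_block


def extract_file(pck_bytes, entry):
    """Slice one file out of a PCK given a parsed entry dict."""
    start = resolve_offset(entry["offset_block"], entry["blocksize"])
    return pck_bytes[start : start + entry["size"]]


# Tiny synthetic buffer: 3 payload bytes stored at block 2 with 4-byte blocks.
buffer = bytes(8) + b"WEM" + bytes(5)
payload = extract_file(buffer, {"offset_block": 2, "blocksize": 4, "size": 3})
```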

PCK Types in ZZZ

Zenless Zone Zero uses two types of PCK files:
| Type | Location | Contents |
|---|---|---|
| SoundBank PCKs | GeneratedSoundBanks/ | BNK files containing short audio clips (SFX, voices) |
| Streamed PCKs | Streamed/ | Individual WEM files for music and long audio |

BNK (SoundBank) Format

BNK files are Wwise sound banks containing multiple audio files and metadata.

Binary Structure

struct BNK_Chunk {
    char tag[4];             // Chunk identifier (e.g., "BKHD")
    uint32_t size;           // Chunk data size
    uint8_t data[size];      // Chunk data
};

struct DIDX_Entry {
    uint32_t wem_id;         // WEM file identifier
    uint32_t offset;         // Offset into DATA section
    uint32_t size;           // WEM file size
};
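Because every chunk starts with a 4-byte tag and a 4-byte size, a BNK can be walked without knowing each chunk's contents. The sketch below assumes chunks follow one another with no gaps, and builds a tiny synthetic bank to exercise it; `iter_chunks` and `parse_didx` are illustrative names, not ZZAR functions.

```python
import struct


def iter_chunks(bnk):
    """Yield (tag, payload) for each chunk in a BNK byte string."""
    pos = 0
    while pos + 8 <= len(bnk):
        tag, size = struct.unpack_from("<4sI", bnk, pos)
        yield tag.decode("ascii"), bnk[pos + 8 : pos + 8 + size]
        pos += 8 + size  # assumption: chunks are stored back to back


def parse_didx(payload):
    """Decode DIDX into a list of (wem_id, offset, size) tuples."""
    return [
        struct.unpack_from("<3I", payload, i * 12)
        for i in range(len(payload) // 12)
    ]


def _chunk(tag, body):
    return tag + struct.pack("<I", len(body)) + body


# Synthetic bank: header, one index entry, four bytes of fake audio.
sample_bank = (
    _chunk(b"BKHD", struct.pack("<2I", 1, 2882561007))
    + _chunk(b"DIDX", struct.pack("<3I", 134133939, 0, 4))
    + _chunk(b"DATA", b"\x01\x02\x03\x04")
)
chunks = dict(iter_chunks(sample_bank))
index = parse_didx(chunks["DIDX"])
```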

Chunk Types

BKHD (Bank Header)
Contains metadata about the sound bank:
struct BKHD_Data {
    uint32_t version;        // Bank format version
    uint32_t bank_id;        // Sound bank identifier
    // Additional metadata...
};
ZZAR preserves the BKHD chunk unchanged when modifying BNK files.
DIDX (Data Index)
Index of all WEM files in the DATA section:
[Number of WEM files: 3]

[WEM ID: 134133939] [Offset: 0]      [Size: 45280]
[WEM ID: 523189445] [Offset: 45280]  [Size: 89344]
[WEM ID: 889234567] [Offset: 134624] [Size: 112640]
Each entry is exactly 12 bytes. ZZAR recalculates this section when WEM files are replaced.
DATA
Contains all WEM audio files stored sequentially:
[WEM 134133939 data: 45280 bytes]
[Padding: 0 bytes]   ← Already aligned (45280 is a multiple of 16)
[WEM 523189445 data: 89344 bytes]
[Padding: 0 bytes]   ← Already aligned
[WEM 889234567 data: 112640 bytes]
[No padding]         ← Last file doesn't need padding
Each WEM starts on a 16-byte boundary. A file whose size is not a multiple of 16 is followed by padding up to the next boundary; for example, a 45270-byte WEM would be followed by 10 bytes of padding.
HIRC (Hierarchy)
Defines relationships between sound objects, events, and audio files. ZZAR preserves this chunk unchanged.
struct HIRC_Data {
    uint32_t num_objects;    // Number of hierarchy objects
    // Variable-length object definitions
};

WEM Replacement Process

# bnk_handler.py:203-221
# 1. Replace WEM data in memory
self.data['DATA'].wem_data[wem_id] = WEM(new_wem_bytes)

# 2. Recalculate DATA section with proper alignment
data_section.start_pos = 8 + len(BKHD) + 8 + len(DIDX) + 8
new_wem_offsets = data_section.setdata()  # Returns new offsets

# 3. Update DIDX with new offsets and sizes
self.data['DIDX'].setdata(new_wem_offsets)

# 4. Serialize to bytes
result = BKHD.getdata() + DIDX.getdata() + DATA.getdata() + HIRC.getdata()
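Steps 2 and 3 can be illustrated with a self-contained sketch that rebuilds the DATA payload and the matching DIDX entries from a set of WEM blobs. It assumes alignment is measured from the start of the DATA payload and that padding bytes are zero; neither detail is confirmed by the excerpt above, and `rebuild_data_and_didx` is a hypothetical name.

```python
import struct

ALIGN = 16  # each WEM starts on a 16-byte boundary


def rebuild_data_and_didx(wems):
    """Return (didx_payload, data_payload) for an ordered {wem_id: bytes} dict."""
    data = bytearray()
    didx = bytearray()
    items = list(wems.items())
    for i, (wem_id, blob) in enumerate(items):
        # Record where this WEM starts before appending it.
        didx += struct.pack("<3I", wem_id, len(data), len(blob))
        data += blob
        if i < len(items) - 1:
            # Pad up to the next boundary; the last file gets no padding.
            data += b"\x00" * (-len(data) % ALIGN)
    return bytes(didx), bytes(data)


# Small made-up blobs so the padding is easy to follow by hand.
didx_payload, data_payload = rebuild_data_and_didx(
    {1: b"a" * 10, 2: b"b" * 32, 3: b"c" * 7}
)
entries = [struct.unpack_from("<3I", didx_payload, i * 12) for i in range(3)]
```

With these sizes the first blob is followed by 6 bytes of padding, the second ends exactly on a boundary, and the third is left unpadded.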

WEM (Wwise Encoded Media) Format

WEM is Audiokinetic’s proprietary audio format based on various codecs:
  • Vorbis: Most common, compressed audio (variable bitrate)
  • ADPCM: Lower quality, minimal CPU usage
  • PCM: Uncompressed, highest quality
  • Opus: High efficiency, modern codec

WEM File Structure

WEM files are RIFF containers:
struct RIFF_Header {
    char riff[4];            // "RIFF"
    uint32_t size;           // File size - 8
    char wave[4];            // "WAVE"
};

struct FMT_Chunk {
    char fmt[4];             // "fmt "
    uint32_t size;           // Chunk size (usually 16 or 18)
    uint16_t format;         // Audio format (1=PCM, 0xFFFF=Vorbis)
    uint16_t channels;       // Number of channels
    uint32_t sample_rate;    // Sample rate in Hz
    uint32_t byte_rate;      // Bytes per second
    uint16_t block_align;    // Bytes per sample frame
    uint16_t bits_per_sample;// Bits per sample
};

// Vorbis-specific chunks
struct VORB_Chunk {
    // Vorbis setup and codec information
};

struct DATA_Chunk {
    char data[4];            // "data"
    uint32_t size;           // Audio data size
    uint8_t audio_data[size];// Encoded audio data
};
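To show how these structures are read in practice, the sketch below scans the RIFF chunks of a WEM and decodes the first 16 bytes of the `fmt ` chunk. It assumes standard RIFF rules (chunks padded to an even length) and is exercised on a synthetic PCM header rather than a real game file; `read_wem_format` is a hypothetical name.

```python
import struct


def read_wem_format(wem):
    """Return the basic fmt fields of a RIFF/WAVE byte string."""
    riff, _size, wave = struct.unpack_from("<4sI4s", wem, 0)
    if riff != b"RIFF" or wave != b"WAVE":
        raise ValueError("not a RIFF/WAVE container")
    pos = 12
    while pos + 8 <= len(wem):
        tag, size = struct.unpack_from("<4sI", wem, pos)
        if tag == b"fmt ":
            fmt, channels, rate, byte_rate, align, bits = struct.unpack_from(
                "<HHIIHH", wem, pos + 8
            )
            return {
                "format": fmt,
                "channels": channels,
                "sample_rate": rate,
                "byte_rate": byte_rate,
                "block_align": align,
                "bits_per_sample": bits,
            }
        pos += 8 + size + (size & 1)  # assumption: chunks are word-aligned
    raise ValueError("no fmt chunk found")


# Synthetic 16-bit stereo PCM header with four bytes of silence.
fmt_body = struct.pack("<HHIIHH", 1, 2, 48000, 192000, 4, 16)
audio = b"\x00" * 4
body = (
    b"WAVE"
    + b"fmt " + struct.pack("<I", len(fmt_body)) + fmt_body
    + b"data" + struct.pack("<I", len(audio)) + audio
)
sample_wem = b"RIFF" + struct.pack("<I", len(body)) + body
info = read_wem_format(sample_wem)
```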

Conversion Pipeline

ZZAR uses external tools to work with WEM files:
1. Decoding WEM to WAV

# audio_converter.py:86-150
vgmstream-cli -o output.wav input.wem
# OR
ffmpeg -i input.wem -acodec pcm_s16le -ar 48000 output.wav
vgmstream-cli is preferred as it has native WEM support.
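One way to drive these tools from Python is to build the argument list first and only then run it, which keeps the command easy to inspect. This is a sketch under the assumption that both tools are looked up on PATH; the function names are hypothetical and the fallback order simply mirrors the preference stated above.

```python
import shutil
import subprocess


def vgmstream_command(wem_path, wav_path):
    """Argument list for the vgmstream-cli invocation shown above."""
    return ["vgmstream-cli", "-o", str(wav_path), str(wem_path)]


def ffmpeg_command(wem_path, wav_path):
    """Argument list for the ffmpeg invocation shown above."""
    return [
        "ffmpeg", "-i", str(wem_path),
        "-acodec", "pcm_s16le", "-ar", "48000", str(wav_path),
    ]


def decode_wem(wem_path, wav_path):
    """Decode with vgmstream-cli if available, otherwise try ffmpeg."""
    for build in (vgmstream_command, ffmpeg_command):
        command = build(wem_path, wav_path)
        if shutil.which(command[0]) is not None:
            subprocess.run(command, check=True)
            return command[0]
    raise FileNotFoundError("neither vgmstream-cli nor ffmpeg is on PATH")
```

Passing a list rather than a shell string avoids quoting problems with paths that contain spaces.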
2. Encoding WAV to WEM

# wwise_wrapper.py:175-248
# Uses Wwise's WwiseConsole.exe
WwiseConsole.exe convert-external-source \
  project.wproj \
  --source-file list.wsources \
  --output output_dir/
Requires Audiokinetic Wwise installation (included with ZZAR setup).

WEM Conversion Settings

ZZAR uses these Wwise conversion settings:
<!-- WAVtoWEM.wproj conversion settings -->
<Conversion name="Vorbis Quality High">
  <Format>Vorbis</Format>
  <Quality>High</Quality>
  <SampleRate>48000</SampleRate>
  <Channels>Stereo</Channels>
  <NormalizeLoudness>true</NormalizeLoudness>
</Conversion>
ZZAR automatically normalizes audio to -9 LUFS before conversion to ensure consistent volume levels across replacements.
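The arithmetic behind a loudness target can be sketched as follows, assuming normalization is applied as a single static gain and that the integrated loudness of the input has already been measured by some other tool. How ZZAR measures loudness is not described here, and `normalization_gain` is a hypothetical name.

```python
TARGET_LUFS = -9.0


def normalization_gain(measured_lufs, target_lufs=TARGET_LUFS):
    """Return (gain in dB, linear amplitude factor) to reach the target."""
    gain_db = target_lufs - measured_lufs
    # A change of 20 dB corresponds to a factor of 10 in amplitude.
    return gain_db, 10 ** (gain_db / 20)


# A clip measured at -14 LUFS needs +5 dB to reach -9 LUFS.
gain_db, factor = normalization_gain(-14.0)
```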

File Naming Conventions

ZZAR uses numeric IDs for file identification:
134133939.wem              ← WEM file (ID as filename)
2882561007_bnk/            ← BNK directory (ID + _bnk suffix)
  ├── 134133939.wem        ← WEM files inside this BNK
  ├── 523189445.wem
  └── 889234567.wem
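A short sketch of how such paths could be mapped back to IDs, assuming forward-slash separators and exactly the naming shown above. The helper `parse_extracted_path` is hypothetical.

```python
from pathlib import PurePosixPath


def parse_extracted_path(path):
    """Return (bnk_id or None, wem_id) for a path following the convention."""
    p = PurePosixPath(path)
    wem_id = int(p.stem)  # "134133939.wem" -> 134133939
    bnk_id = None
    parent = p.parent.name  # empty string for a top-level file
    if parent.endswith("_bnk"):
        bnk_id = int(parent[: -len("_bnk")])
    return bnk_id, wem_id


standalone = parse_extracted_path("134133939.wem")
embedded = parse_extracted_path("2882561007_bnk/523189445.wem")
```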

ID Types

| ID Type | Bits | Range | Usage |
|---|---|---|---|
| File ID | 32 | 0 to 4,294,967,295 | WEM and BNK files in PCK |
| External ID | 64 | 0 to 18,446,744,073,709,551,615 | External streamed files |
| Lang ID | 32 | 0 to 4,294,967,295 | Language identifier |
File IDs are NOT sequential or predictable. They are generated by Wwise based on asset names and settings.

Byte Order and Encoding

All ZZAR-supported formats use:
  • Byte Order: Little-endian (Intel format)
  • Integer Sizes:
    • uint32_t = 4 bytes
    • uint64_t = 8 bytes
  • Text Encoding: UTF-16 LE (PCK language strings), UTF-8 (metadata.json)
# Python struct format strings used in ZZAR
'<I'   # uint32_t little-endian
'<Q'   # uint64_t little-endian
'<4I'  # Four consecutive uint32_t values
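A quick check of what these format strings mean in practice, using only the standard `struct` module. The values packed here are arbitrary.

```python
import struct

# Sizes implied by each format string.
sizes = {fmt: struct.calcsize(fmt) for fmt in ("<I", "<Q", "<4I")}

# Little-endian stores the least significant byte first.
packed = struct.pack("<I", 0x11223344)

# Round trip of four consecutive uint32 values.
values = struct.unpack("<4I", struct.pack("<4I", 1, 2, 3, 4))

# A language name as stored in a PCK: UTF-16 LE with a 2-byte terminator.
encoded_name = "English".encode("utf-16-le") + b"\x00\x00"
```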
