Audio Conversion Pipeline

Pipeline Overview

The ZZAR audio pipeline transforms user audio files into game-compatible replacements in five stages: audio normalization, WEM encoding, PCK injection, PCK packaging, and game deployment.

Stage 1: Audio Normalization

ZZAR normalizes all input audio to ensure consistent volume levels:

# audio_converter.py:152-194
def any_to_wav(input_file, output_file, sample_rate=48000, 
               channels=2, normalize=True):
    cmd = [ffmpeg_path, '-i', str(input_file)]
    
    if normalize:
        # Normalize to -9 LUFS with -1.5 dB true peak
        cmd.extend(['-af', 'loudnorm=I=-9:TP=-1.5:LRA=11'])
    
    cmd.extend([
        '-acodec', 'pcm_s16le',   # 16-bit PCM
        '-ar', str(sample_rate),  # Sample rate (48 kHz by default)
        '-ac', str(channels),     # Channel count (stereo by default)
        '-y',                     # Overwrite output
        str(output_file)
    ])
    
    subprocess.run(cmd, check=True)

Normalization Parameters

LUFS (Loudness Units relative to Full Scale)

Target: -9 LUFS

LUFS is a perceptual loudness measurement standard:

-9 LUFS = Moderately loud, suitable for games
More negative = quieter (e.g., -16 LUFS for streaming, -23 LUFS for EBU R 128 broadcast)
Less negative = louder (e.g., -6 LUFS for very heavily compressed material)

ZZAR uses -9 LUFS to match typical game audio levels.

True Peak (TP)

Target: -1.5 dB TP

True peak measures inter-sample peaks to prevent clipping:

-1.5 dB = Small headroom for codec artifacts
Prevents distortion during lossy compression
Required for Vorbis encoding quality

Loudness Range (LRA)

Target: 11 LU

LRA measures the dynamic range of the audio:

11 LU = Moderate dynamic range
Preserves dynamics while controlling loudness
Prevents over-compression of music
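The three targets above end up in a single FFmpeg `loudnorm` filter string. The following is a minimal sketch of how such a string could be assembled from the parameters. The helper name `build_loudnorm_filter` and its sanity checks are illustrative assumptions and are not taken from `audio_converter.py`.

```python
def build_loudnorm_filter(integrated=-9.0, true_peak=-1.5, lra=11.0):
    """Return an FFmpeg loudnorm filter string for the given targets.

    Hypothetical helper: only basic sign checks are done here; FFmpeg
    itself enforces the exact ranges it accepts for each option.
    """
    if integrated >= 0:
        raise ValueError("integrated loudness target must be negative (LUFS)")
    if true_peak > 0:
        raise ValueError("true peak target must not exceed 0 dBTP")
    if lra <= 0:
        raise ValueError("loudness range must be positive (LU)")
    # ':g' drops a trailing '.0' so -9.0 is rendered as -9
    return f"loudnorm=I={integrated:g}:TP={true_peak:g}:LRA={lra:g}"


if __name__ == "__main__":
    print(build_loudnorm_filter())
```

With the defaults, the result matches the filter string used in `any_to_wav`, so the helper could be passed to `-af` unchanged.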

Supported Input Formats

# audio_converter.py:271-276
audio_extensions = [
    '.mp3',   # MPEG Audio Layer 3
    '.flac',  # Free Lossless Audio Codec
    '.ogg',   # Ogg Vorbis
    '.m4a',   # MPEG-4 Audio
    '.aac',   # Advanced Audio Coding
    '.opus',  # Opus codec
    '.wma'    # Windows Media Audio
]

ZZAR automatically detects input format using FFmpeg’s built-in detection. No manual format specification needed.
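An extension list like this is typically used to decide which files to offer for conversion, while FFmpeg handles the actual format detection. A minimal sketch of such a filter follows. The function names and the case-insensitive matching are assumptions for illustration, not ZZAR's confirmed behavior.

```python
from pathlib import Path

# Same extensions as the list above, stored as a set for fast lookup
AUDIO_EXTENSIONS = {'.mp3', '.flac', '.ogg', '.m4a', '.aac', '.opus', '.wma'}


def is_supported_audio(path):
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(path).suffix.lower() in AUDIO_EXTENSIONS


def filter_supported(paths):
    """Keep only the paths whose extension is supported, preserving order."""
    return [Path(p) for p in paths if is_supported_audio(p)]
```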

Stage 2: WEM Encoding

Conversion to WEM format requires Audiokinetic Wwise:

# audio_converter.py:196-233
def wav_to_wem(wav_file, output_file=None, wwise_dir=None):
    if not wwise_console.is_installed():
        raise RuntimeError(
            "Wwise is not installed. "
            "Please install Wwise from the Settings page."
        )
    
    # Uses WwiseConsole.exe with pre-configured project
    result_wem = wwise.convert_to_wem(wav_file, output_dir)
    return result_wem

Wwise Project Structure

ZZAR uses a minimal Wwise project for conversion:

WAVtoWEM/
├── WAVtoWEM.wproj              # Wwise project file
├── GeneratedSoundBanks/        # Output directory (temporary)
├── Originals/                  # Wwise working directory
├── Conversion Settings/        # Audio codec settings
│   └── Default Conversion Settings.wwu
└── Actor-Mixer Hierarchy/      # Audio object hierarchy

Conversion Process

Create .wsources File

WwiseConsole requires an XML file listing source audio:

# wwise_wrapper.py:145-173
import xml.etree.ElementTree as ET

root = ET.Element("ExternalSourcesList", {
    "SchemaVersion": "1",
    "Root": wav_directory_path
})

for wav_file in wav_files:
    ET.SubElement(root, "Source", {
        "Path": wav_file.name,
        "Conversion": "Vorbis Quality High"
    })

tree = ET.ElementTree(root)
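# xml_declaration=True emits the <?xml ...?> header shown in the example below
tree.write("list.wsources", encoding="utf-8", xml_declaration=True)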

Example .wsources file:

<?xml version="1.0" encoding="utf-8"?>
<ExternalSourcesList SchemaVersion="1" Root="C:\Users\User\audio">
  <Source Path="voice_line_01.wav" Conversion="Vorbis Quality High"/>
  <Source Path="voice_line_02.wav" Conversion="Vorbis Quality High"/>
</ExternalSourcesList>

Execute WwiseConsole

# wwise_wrapper.py:189-213
cmd = [
    "WwiseConsole.exe",
    "convert-external-source",
    "WAVtoWEM.wproj",
    "--source-file", "list.wsources",
    "--output", output_directory
]

subprocess.run(cmd, check=True, capture_output=True)

On Linux and macOS, the same command runs through Wine, with all paths given in Wine's drive-letter format:

# wwise_wrapper.py:199-210
cmd = [
    "wine64",
    "WwiseConsole.exe",
    "convert-external-source",
    "Z:\\path\\to\\WAVtoWEM.wproj",  # Wine path format
    "--source-file", "Z:\\path\\to\\list.wsources",
    "--output", "Z:\\path\\to\\output"
]
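The `Z:\\...` arguments above are host paths rewritten for Wine. A minimal sketch of that rewrite is shown here, assuming Wine's default configuration in which drive `Z:` maps to the filesystem root. The helper name `to_wine_path` is hypothetical and not taken from `wwise_wrapper.py`.

```python
from pathlib import PurePosixPath


def to_wine_path(posix_path, drive="Z:"):
    """Convert an absolute POSIX path to a Wine-style Windows path.

    Assumes the given drive letter is mapped to '/' (Wine's default for Z:).
    """
    path = PurePosixPath(posix_path)
    if not path.is_absolute():
        raise ValueError("expected an absolute POSIX path")
    # parts[0] is the root '/', the remaining parts are joined with backslashes
    return drive + "\\" + "\\".join(path.parts[1:])
```

A relative path is rejected rather than guessed at, because Wine resolves relative paths against its own working directory.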

Retrieve WEM Output

# wwise_wrapper.py:226-244
wem_file = output_dir / wav_file.with_suffix(".wem").name

if not wem_file.exists():
    # Fall back to the project cache, matching on the expected file name
    # so that an unrelated WEM is never copied by mistake
    project_cache = project_path.parent / ".cache"
    for p in project_cache.rglob(wem_file.name):
        shutil.copy(p, wem_file)
        break

Wwise may place output in subdirectories; ZZAR searches for the correct file.
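The search can be sketched as a small standalone function that looks for a WEM with the expected name under one or more directories. The function name `collect_wem` and the `search_roots` parameter are illustrative assumptions. The real implementation in `wwise_wrapper.py` may search different locations.

```python
import shutil
from pathlib import Path


def collect_wem(wav_name, output_dir, search_roots):
    """Return the WEM path for wav_name, copying it into output_dir if needed.

    Returns None when no matching WEM is found anywhere.
    """
    output_dir = Path(output_dir)
    target = output_dir / Path(wav_name).with_suffix(".wem").name
    if target.exists():
        return target
    for root in search_roots:
        # rglob descends into subdirectories; only exact name matches count
        for candidate in sorted(Path(root).rglob(target.name)):
            shutil.copy(candidate, target)
            return target
    return None
```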

Batch Conversion Optimization

For multiple files, ZZAR uses batch processing:

# wwise_wrapper.py:250-323
def batch_convert_to_wem(wav_files, output_dir):
    # Group files by parent directory for efficient processing
    wav_dir = wav_files[0].parent
    
    if all(f.parent == wav_dir for f in wav_files):
        # Single .wsources file with all WAVs
        wsources = create_wsources_file(wav_files, wav_dir, output_dir)
        convert_batch(wsources, output_dir)
    else:
        # Fall back to individual conversion
        for wav in wav_files:
            convert_to_wem(wav, output_dir)

Batch conversion is typically tens of times faster than individual conversions because Wwise starts once for the whole batch instead of once per file; the gain grows with the number of files.
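The function above falls back to one conversion per file as soon as the inputs span more than one directory. One possible refinement, shown here only as a sketch and not as ZZAR's behavior, is to group the files by parent directory so that each directory still gets a single batch. `group_by_parent` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import Path


def group_by_parent(wav_files):
    """Map each parent directory to the list of WAV files it contains."""
    groups = defaultdict(list)
    for wav in wav_files:
        wav = Path(wav)
        groups[wav.parent].append(wav)
    return dict(groups)
```

Each resulting group shares one directory, so it satisfies the single-`Root` requirement of a `.wsources` file.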

Stage 3: PCK Injection

Depending on the target file type, ZZAR uses different injection methods:

3a. Direct Streamed WEM Replacement

For standalone WEM files in Streamed PCKs:

# pck_packer.py:164-203
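def replace_file(self, file_id, replacement_file_path, lang_id=0,
                 target_section='soundbank_files'):
    # found_section is the file table selected by target_section
    # (the lookup is omitted from this excerpt)

    # Open replacement WEM file; the handle stays open until packing
    file_obj = open(replacement_file_path, 'rb')
    file_index = len(self.file_list)
    self.file_list.append(file_obj)
    
    # Get file size by seeking to the end, then rewind
    file_obj.seek(0, 2)
    file_size = file_obj.tell()
    file_obj.seek(0)
    
    # Update file mapping: (index into file_list, size, read offset)
    found_section[lang_id][file_id] = [(file_index, file_size, 0)]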

Example:

packer = PCKPacker('Streamed_Music.pck', 'Streamed_Music_mod.pck')
packer.load_original_pck()
packer.replace_file(
    file_id=86631895,
    replacement_file_path='custom_music.wem',
    lang_id=0  # SFX/music language
)
packer.pack()

3b. BNK-Embedded WEM Replacement

For WEM files inside sound banks:

Extract Original BNK

# pck_packer.py:216-218
original_file.seek(offset)
bnk_bytes = original_file.read(size)
bnk = BNKFile(bnk_bytes=bnk_bytes)

Replace WEMs in BNK

# pck_packer.py:226-237
wem_files = list(Path(bnk_wems_dir).glob('*.wem'))

for wem_file in wem_files:
    wem_id = int(wem_file.stem)  # Get ID from filename
    bnk.replace_wem(wem_id, wem_path=wem_file)
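Because the WEM ID is parsed from the file name, a stray file such as `notes.wem` would make `int(wem_file.stem)` raise `ValueError`. A defensive variant might separate valid and invalid names first. This is a sketch under the assumption that WEM IDs are unsigned 32-bit integers, and `collect_wem_ids` is a hypothetical helper, not part of `pck_packer.py`.

```python
from pathlib import Path


def collect_wem_ids(bnk_wems_dir):
    """Return ({wem_id: path}, [skipped paths]) for the *.wem files in a directory."""
    replacements = {}
    skipped = []
    for wem_file in sorted(Path(bnk_wems_dir).glob("*.wem")):
        stem = wem_file.stem
        # Accept plain ASCII digits only, and only values that fit in 32 bits
        if stem.isascii() and stem.isdigit() and int(stem) <= 0xFFFFFFFF:
            replacements[int(stem)] = wem_file
        else:
            skipped.append(wem_file)
    return replacements, skipped
```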

Recalculate BNK Structure

# bnk_handler.py:219-236
def _correct_offsets(self):
    # Calculate DATA section start position
    self.data['DATA'].start_pos = (
        8 + len(self.data['BKHD']) +  # BKHD chunk
        8 + len(self.data['DIDX']) +  # DIDX chunk
        8                              # DATA header
    )
    
    # Rebuild DATA with alignment
    new_wem_offsets = self.data['DATA'].setdata()
    
    # Update DIDX with new offsets
    self.data['DIDX'].setdata(new_wem_offsets)
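To make the offset recalculation concrete, the sketch below lays out WEM payloads one after another and produces the matching index entries. It assumes 16-byte alignment and a DIDX entry made of three little-endian 32-bit integers (ID, offset relative to the start of the DATA payload, size). Both are assumptions for illustration, and `layout_wems` and `pack_didx` are hypothetical names rather than `bnk_handler.py` functions.

```python
import struct


def layout_wems(wem_sizes, alignment=16):
    """Compute (wem_id, offset, size) entries and the total DATA length.

    wem_sizes is a list of (wem_id, size) pairs in the order they are stored.
    """
    entries = []
    offset = 0
    for wem_id, size in wem_sizes:
        # Round the running offset up to the next multiple of `alignment`
        offset = -(-offset // alignment) * alignment
        entries.append((wem_id, offset, size))
        offset += size
    return entries, offset


def pack_didx(entries):
    """Serialize index entries as consecutive little-endian uint32 triples."""
    return b"".join(struct.pack("<3I", *entry) for entry in entries)
```

For example, WEMs of 10, 20, and 5 bytes land at offsets 0, 16, and 48, so replacing the first WEM with a larger one shifts every later offset, which is why the index has to be rebuilt after any replacement.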

Insert Modified BNK into PCK

# pck_packer.py:243-252
modified_bnk_bytes = bnk.get_bytes()

# Store in memory for packing
temp_bnk = BytesIO(modified_bnk_bytes)
file_index = len(self.file_list)
self.file_list.append(temp_bnk)

# Update PCK file table
self.soundbank_titles[lang_id][bnk_id] = [
    (file_index, len(modified_bnk_bytes), 0)
]

Example:

packer = PCKPacker('SoundBank_SFX_1.pck', 'SoundBank_SFX_1_mod.pck')
packer.load_original_pck()

# Replace multiple WEMs in a BNK
packer.replace_bnk_wems(
    bnk_id=2882561007,
    bnk_wems_dir='./2882561007_bnk/',  # Directory with WEM files
    lang_id=1  # English
)

packer.pack()

Stage 4: PCK Packaging

ZZAR offers two packaging modes:

Patching Mode (Default)

Fast in-place modification:

# pck_packer.py:292-383
def pack_with_patching(self):
    # Copy original PCK
    shutil.copy2(original_pck, output_pck)
    
    # Open in read-write mode
    with open(output_pck, 'r+b') as f:
        for patch in patches:
            # Look up original file offset
            original_offset = index[patch['file_id']]['offset']
            original_size = index[patch['file_id']]['size']
            new_size = patch['new_size']
            
            # Seek and write
            f.seek(original_offset)
            
            if new_size <= original_size:
                f.write(new_data)
                if new_size < original_size:
                    # Pad with zeros
                    f.write(b'\x00' * (original_size - new_size))
            else:
                # Truncate if larger
                f.write(new_data[:original_size])

Patching limitations:

Cannot handle files larger than originals (will truncate)
Wastes space when replacing with smaller files
Fast but not ideal for significant size changes
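The pad-or-truncate rule behind these limitations can be demonstrated on an in-memory buffer. This is a simplified sketch of the behavior described above. `fit_to_slot` and `patch_slot` are hypothetical names and do not appear in `pck_packer.py`.

```python
import io


def fit_to_slot(new_data, slot_size):
    """Return (payload, truncated), where payload is exactly slot_size bytes."""
    if len(new_data) > slot_size:
        # Larger replacement: data beyond the slot is lost
        return new_data[:slot_size], True
    # Smaller or equal replacement: pad the remainder with zeros
    return new_data + b"\x00" * (slot_size - len(new_data)), False


def patch_slot(f, offset, slot_size, new_data):
    """Overwrite one slot in a seekable binary file; return True if data was truncated."""
    payload, truncated = fit_to_slot(new_data, slot_size)
    f.seek(offset)
    f.write(payload)
    return truncated


if __name__ == "__main__":
    pck = io.BytesIO(b"AAAABBBBCCCC")
    print(patch_slot(pck, 4, 4, b"xy"), pck.getvalue())
```

Returning the `truncated` flag lets a caller warn the user or switch to rebuild mode instead of silently shipping cut-off audio.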

Rebuild Mode

Complete reconstruction with optimal size:

# pck_packer.py:385-421
def pack_with_rebuild(self):
    with open(output_pck, 'wb') as f:
        # 1. Write header
        f.write(MAGIC)
        f.write(struct.pack('<6I', header_size, version,
                            sec1_size, sec2_size, sec3_size, sec4_size))
        
        # 2. Write language map
        f.write(build_language_map())
        
        # 3. Build file tables with recalculated offsets
        current_offset = header_size
        
        for section in [soundbank_titles, soundbank_files, stream_files]:
            write_info = build_file_table(section, current_offset)
            current_offset += sum(file_sizes)
        
        # 4. Write all audio data
        for file_index, size, offset in write_info:
            file_obj = self.file_list[file_index]
            file_obj.seek(offset)
            f.write(file_obj.read(size))

Advantages:

Optimal file size (no wasted space)
Supports any file size changes
Required for BNK modifications

Disadvantages:

Slower (full rewrite)
Creates completely new PCK file

Stage 5: Game Deployment

Final step: placing modified PCK in the game’s Persistent directory:

# mod_package_manager.py:506-590
def apply_mods(game_audio_dir, persistent_audio_dir):
    # Determine PCK location
    if (game_audio_dir / pck_name).exists():
        original_pck = game_audio_dir / pck_name
        output_pck = persistent_audio_dir / pck_name
    else:
        # Search subdirectories
        for subdir in game_audio_dir.iterdir():
            candidate = subdir / pck_name
            if candidate.exists():
                original_pck = candidate
                
                # Mirror subdirectory structure
                persistent_subdir = persistent_audio_dir / subdir.name
                persistent_subdir.mkdir(parents=True, exist_ok=True)
                output_pck = persistent_subdir / pck_name
                break
    
    # Process PCK with all mod replacements
    packer = PCKPacker(original_pck, output_pck)
    packer.load_original_pck()
    
    # Apply all enabled mod replacements
    for file_id, file_info in resolved_replacements[pck_name].items():
        if file_info.get('bnk_id'):
            # BNK modification path
            packer.replace_bnk_wems(bnk_id, wem_dir, lang_id)
        else:
            # Direct WEM replacement
            packer.replace_file(file_id, wem_path, lang_id)
    
    # Pack with rebuild mode (for BNK changes)
    packer.pack(use_patching=False)
    
    # Make read-only to prevent accidental modification
    output_pck.chmod(0o444)

Directory Structure

Game Installation/
├── ZenlessZoneZero_Data/
│   └── StreamingAssets/
│       └── Audio/
│           ├── GeneratedSoundBanks/
│           │   ├── SoundBank_SFX_1.pck      [Original]
│           │   └── SoundBank_Voice_1.pck    [Original]
│           └── Streamed/
│               └── Streamed_Music.pck       [Original]
└── Persistent/
    └── GeneratedSoundBanks/
        ├── SoundBank_SFX_1.pck              [Modified - loads instead]
        └── SoundBank_Voice_1.pck            [Modified - loads instead]

The game’s asset loader checks Persistent/ before StreamingAssets/, allowing non-destructive modding.
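The mirroring between the two trees amounts to reusing the PCK's path relative to the game audio directory. A minimal sketch follows. `persistent_path_for` is a hypothetical helper, and `PurePosixPath` is used only so the example behaves the same on every platform (real code would more likely use `Path`).

```python
from pathlib import PurePosixPath


def persistent_path_for(original_pck, game_audio_dir, persistent_audio_dir):
    """Return where a modified copy of original_pck belongs under the Persistent tree."""
    # relative_to raises ValueError if the PCK is not inside game_audio_dir
    relative = PurePosixPath(original_pck).relative_to(game_audio_dir)
    return PurePosixPath(persistent_audio_dir) / relative
```

A PCK directly inside the audio directory maps to the top of the Persistent directory, and one inside a subdirectory keeps that subdirectory name.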

Complete Pipeline Example

End-to-end example of replacing a character voice line:

import shutil
from pathlib import Path

from audio_converter import AudioConverter
from pck_packer import PCKPacker

# Initialize converter
converter = AudioConverter()

# Step 1: Convert user audio to normalized WAV
wav_file = converter.any_to_wav(
    input_file='custom_voice.mp3',
    output_file='custom_voice.wav',
    sample_rate=48000,
    channels=2,
    normalize=True  # -9 LUFS
)

# Step 2: Convert WAV to WEM
wem_file = converter.wav_to_wem(
    wav_file='custom_voice.wav',
    output_file='134133939.wem'
)

# Step 3: Create BNK directory structure
bnk_dir = Path('2882561007_bnk')
bnk_dir.mkdir(exist_ok=True)
shutil.copy('134133939.wem', bnk_dir / '134133939.wem')

# Step 4: Pack into PCK
packer = PCKPacker(
    original_pck_path='SoundBank_Voice_1.pck',
    output_pck_path='Persistent/GeneratedSoundBanks/SoundBank_Voice_1.pck'
)
packer.load_original_pck()
packer.replace_bnk_wems(
    bnk_id=2882561007,
    bnk_wems_dir=str(bnk_dir),
    lang_id=1  # English
)
packer.pack(use_patching=False)
packer.close()

print("Voice line replaced successfully!")

Performance Optimization

Memory Management

ZZAR uses file handles instead of loading entire files:

# pck_packer.py:42, 516-522
self.file_list = []  # Store file handles, not contents

# When writing:
for file_index, size, offset in write_info:
    file_obj = self.file_list[file_index]
    file_obj.seek(offset)
    data = file_obj.read(size)  # Read only what we need
    f.write(data)

Benefits:

Can process multi-GB PCK files
Constant memory usage
Faster for large files

Batch Processing

Batch conversion reduces Wwise startup overhead:

# Single conversion: ~3 seconds per file
for wav in wav_files:
    convert_to_wem(wav)  # 100 files = 300 seconds

# Batch conversion: ~5 seconds for all files
convert_batch(wav_files)  # 100 files = 5 seconds

Speedup: 60x faster for 100 files

Patching vs Rebuild

Operation	Patching	Rebuild
Replace 1 file	0.1s	5s
Replace 10 files	0.5s	5s
Replace 100 files	2s	8s
Handles size changes	❌	✅

Use patching for simple replacements, rebuild for BNK modifications.
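That rule of thumb can be written as a small decision function. The sketch below assumes the caller already knows the original and replacement sizes per file ID. `needs_rebuild` and its parameters are hypothetical and not part of `PCKPacker`.

```python
def needs_rebuild(new_sizes, original_sizes, has_bnk_changes=False):
    """Return True when rebuild mode is the safe choice.

    new_sizes and original_sizes map file IDs to sizes in bytes.
    """
    if has_bnk_changes:
        # BNK modifications change internal offsets, so patching is not suitable
        return True
    # Any replacement larger than its original slot would be truncated by patching
    return any(size > original_sizes[file_id] for file_id, size in new_sizes.items())
```

The result can be passed as `use_patching=not needs_rebuild(...)` when calling `pack`.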

Error Handling

Common pipeline errors and solutions:

WEM Conversion Failed

Error: RuntimeError: WEM file not created

Causes:

Wwise not installed
Invalid WAV format (must be 16-bit PCM)
Wine issues on Linux/macOS

Solution:

# Verify WAV format
subprocess.run(['ffmpeg', '-i', wav_file], capture_output=True)

# Re-convert to ensure correct format
converter.any_to_wav(audio_file, 'temp.wav', normalize=True)
converter.wav_to_wem('temp.wav', 'output.wem')
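The `ffmpeg -i` call only prints the stream details for a person to read. For an automated check, Python's standard `wave` module can confirm the properties that `any_to_wav` is meant to produce. This is a sketch, and `check_wav` is a hypothetical helper that is not part of ZZAR.

```python
import wave


def check_wav(path, sample_rate=48000, channels=2, sample_width=2):
    """Return a list of problems; an empty list means the WAV matches the targets.

    sample_width is in bytes, so 2 means 16-bit PCM.
    """
    try:
        with wave.open(str(path), "rb") as wav:
            width = wav.getsampwidth()
            rate = wav.getframerate()
            count = wav.getnchannels()
    except (wave.Error, EOFError) as exc:
        # wave only reads PCM WAV files; anything else is reported here
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    if width != sample_width:
        problems.append(f"sample width is {width * 8}-bit, expected {sample_width * 8}-bit")
    if rate != sample_rate:
        problems.append(f"sample rate is {rate} Hz, expected {sample_rate} Hz")
    if count != channels:
        problems.append(f"channel count is {count}, expected {channels}")
    return problems
```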

File Size Mismatch

Warning: ID 134133939 is larger than original

Cause: Replacement audio is longer or higher quality than the original

Solution:

# Option 1: Use rebuild mode
packer.pack(use_patching=False)

# Option 2: Trim the source audio before conversion.
# This is a shell command, not Python; it trims to 5.2 seconds:
#   ffmpeg -i input.mp3 -t 5.2 -c:a copy output.mp3

BNK WEM Not Found

Error: KeyError: WEM ID 134133939 not found in BNK

Cause: Wrong WEM ID or wrong BNK file

Solution:

# Extract the BNK from the PCK first
from pck_extractor import PCKExtractor
extractor = PCKExtractor('SoundBank_SFX_1.pck')
extractor.extract_all('./extracted/', extract_bnk=True)

# Then list all WEMs in the extracted BNK
from bnk_handler import BNKFile
bnk = BNKFile('original.bnk')
print("Available WEM IDs:", bnk.list_wems())
