Overview
The DEFLATE/INFLATE module provides the high-level API for GZIP file compression and decompression. It combines LZ77 compression with Huffman coding according to RFC 1951 (DEFLATE) and RFC 1952 (GZIP file format).
Constants
Program Modes
Block Types
Compression Levels
Compression Strategies
GZIP Header Flags
Fixed Huffman Code Definitions
Data Structures
gz_header_t
Represents a GZIP member header and trailer information.
Compression method: 0 (no compression) or 8 (DEFLATE)
Bit flags indicating presence of optional fields (F_TEXT, F_HCRC, F_EXTRA, F_NAME, F_COMMENT)
Modification time as Unix timestamp (seconds since epoch), little-endian
Extra flags for compression level (2 = maximum compression, 4 = fastest)
Operating system identifier where compression took place
Length of extra field in bytes (only valid if F_EXTRA flag is set)
Pointer to extra field data, or NULL if not present. Must be freed by the caller
Pointer to zero-terminated original filename, or NULL if not present. Must be freed by the caller
Pointer to zero-terminated comment string, or NULL if not present. Must be freed by the caller
CRC16 of the header (only valid if F_HCRC flag is set)
CRC32 of the uncompressed data, little-endian (from trailer)
Size of uncompressed data modulo 2^32, little-endian (from trailer)
block_header_t
Represents a compressed block header (conceptual structure, not an actual C type).
Functions
deflate
Compresses data using the DEFLATE algorithm and wraps it in GZIP format.
Pointer to the filename string (used for metadata in GZIP header). Can be NULL
Pointer to the input data buffer to compress
Length of the input data in bytes
Output parameter that receives the length of the compressed data in bytes
Returns a pointer to a dynamically allocated buffer containing the GZIP-compressed data, or NULL on error
Behavior
- Creates GZIP member header with metadata
- Computes CRC32 of input data
- Splits input into blocks of at most 65,535 bytes
- For each block:
  - Applies LZ77 compression to create tokens
  - Applies Huffman encoding (dynamic or fixed)
  - Sets BFINAL flag on last block
- Appends GZIP trailer (CRC32 and uncompressed size)
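The CRC32 and trailer steps above can be sketched as follows. This is a minimal illustration, not the module's code: `crc32_bytes`, `put_le32`, and `write_trailer` are hypothetical names, and the bitwise CRC loop is chosen for brevity rather than speed (a table-driven version is the usual optimization).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320, the variant
 * GZIP uses. Hypothetical helper name. */
static uint32_t crc32_bytes(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)((v >> 8) & 0xFFu);
    out[2] = (uint8_t)((v >> 16) & 0xFFu);
    out[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Write the 8-byte trailer: CRC32 of the uncompressed data, then the
 * uncompressed length modulo 2^32, both little-endian. */
static void write_trailer(uint8_t out[8], const uint8_t *data, size_t len)
{
    assert(out != NULL);
    put_le32(out, crc32_bytes(data, len));
    put_le32(out + 4, (uint32_t)(len & 0xFFFFFFFFu));
}
```

Calling `write_trailer(buf, input, input_len)` after the last compressed block has been emitted fills the final 8 bytes of the member.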
Block Size Constraints
- Compressed blocks must hold at most 65,535 bytes when decompressed
- This ensures compatibility and consistent testing
- However, decompression must handle blocks of any size
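One way to picture the 65,535-byte limit is as a function that maps a block index to a slice of the input. The sketch below is an assumption about how such splitting could look: `MAX_BLOCK_BYTES`, `block_span_t`, and `block_span` are hypothetical names, and treating an empty input as one empty final block is a choice made here so that the stream always contains a BFINAL block.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCK_BYTES 65535u  /* hypothetical name for the per-block cap */

typedef struct {
    size_t offset;  /* start of this block within the input */
    size_t length;  /* number of input bytes covered by this block */
    int    bfinal;  /* 1 for the last block, 0 otherwise */
} block_span_t;

/* Describe block number `index` of an input of `total` bytes.
 * Returns 1 if the block exists, 0 if `index` is past the last block. */
static int block_span(size_t total, size_t index, block_span_t *out)
{
    assert(out != NULL);

    /* Number of blocks, rounding up; an empty input still gets one block. */
    size_t count = total / MAX_BLOCK_BYTES + (total % MAX_BLOCK_BYTES != 0);
    if (count == 0) {
        count = 1;
    }
    if (index >= count) {
        return 0;
    }

    out->offset = index * MAX_BLOCK_BYTES;
    size_t remaining = total - out->offset;
    out->length = remaining < MAX_BLOCK_BYTES ? remaining : MAX_BLOCK_BYTES;
    out->bfinal = (index == count - 1);
    return 1;
}
```

A compressor loop would call `block_span` with increasing `index` until it returns 0, compressing each slice and setting BFINAL from the `bfinal` field.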
Error Conditions
- Returns NULL if bytes is NULL
- Returns NULL if out_len is NULL
- Returns NULL if memory allocation fails
- Returns NULL if compression fails
Memory Management
The returned buffer is dynamically allocated using malloc(). The caller must free it using free().
Example
inflate
Decompresses GZIP-compressed data using the INFLATE algorithm.
Pointer to the compressed data buffer (including GZIP header and trailer)
Length of the compressed data in bytes
Returns a pointer to a dynamically allocated buffer containing the decompressed data, or NULL on error. The size of the decompressed data is stored in the GZIP trailer’s full_size field
Behavior
- Validates GZIP header magic bytes (0x1F 0x8B)
- Parses GZIP header metadata (optional fields)
- Processes compressed blocks sequentially:
  - Reads 3-bit block header (BFINAL, BTYPE)
  - Decodes based on block type (00, 01, or 10)
  - Continues until BFINAL block is processed
- Validates CRC32 of decompressed data
- Validates uncompressed size matches trailer
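The magic-byte check and the 3-bit block header read can be sketched with a small least-significant-bit-first reader. The names `bit_reader_t`, `read_bits`, `read_block_header`, and `has_gzip_magic` are hypothetical and only illustrate the steps; the module's internal reader may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *data;
    size_t   len;
    size_t   byte_pos;
    unsigned bit_pos;   /* 0..7: next bit to read within data[byte_pos] */
} bit_reader_t;

/* Returns 1 if the buffer starts with the GZIP magic bytes 0x1F 0x8B. */
static int has_gzip_magic(const uint8_t *buf, size_t len)
{
    return buf != NULL && len >= 2 && buf[0] == 0x1F && buf[1] == 0x8B;
}

/* Read `count` bits (at most 16), least-significant bit first.
 * Returns the value, or -1 if the input runs out. */
static int32_t read_bits(bit_reader_t *br, unsigned count)
{
    assert(br != NULL && count <= 16);
    uint32_t value = 0;
    for (unsigned i = 0; i < count; i++) {
        if (br->byte_pos >= br->len) {
            return -1;
        }
        uint32_t bit = ((uint32_t)br->data[br->byte_pos] >> br->bit_pos) & 1u;
        value |= bit << i;
        if (++br->bit_pos == 8) {
            br->bit_pos = 0;
            br->byte_pos++;
        }
    }
    return (int32_t)value;
}

/* Read BFINAL (1 bit) and BTYPE (2 bits).
 * Returns 0 on success, -1 on truncated input or reserved BTYPE 11. */
static int read_block_header(bit_reader_t *br, int *bfinal, int *btype)
{
    int32_t f = read_bits(br, 1);
    int32_t t = read_bits(br, 2);
    if (f < 0 || t < 0 || t == 3) {
        return -1;
    }
    *bfinal = (int)f;
    *btype = (int)t;
    return 0;
}
```

For instance, the two bytes 0x03 0x00 form an empty fixed-Huffman final block: the first three bits decode to BFINAL = 1 and BTYPE = 01.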
Block Processing
No Compression (BTYPE = 00)
- Skip to byte boundary
- Read 16-bit length (LEN)
- Read 16-bit one’s complement (NLEN)
- Verify LEN == ~NLEN
- Copy LEN bytes directly
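A stored block can be handled roughly as below, assuming the reader has already skipped to the byte boundary so that `src` points at the LEN field. `copy_stored_block` is a hypothetical name. One C detail worth noting: `~nlen` is promoted to `int`, so the complement has to be cast back to 16 bits before comparing with LEN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Parse a stored block whose LEN field starts at byte-aligned `src`.
 * On success, copies LEN bytes into `dst` and returns LEN.
 * Returns -1 if the input is truncated, LEN and NLEN disagree, or
 * `dst` is too small. */
static long copy_stored_block(const uint8_t *src, size_t src_len,
                              uint8_t *dst, size_t dst_cap)
{
    assert(src != NULL && dst != NULL);
    if (src_len < 4) {
        return -1;
    }

    /* LEN and NLEN are 16-bit little-endian values. */
    uint16_t len  = (uint16_t)(src[0] | (src[1] << 8));
    uint16_t nlen = (uint16_t)(src[2] | (src[3] << 8));

    /* NLEN must be the one's complement of LEN. */
    if (len != (uint16_t)~nlen) {
        return -1;
    }
    if ((size_t)len > src_len - 4 || (size_t)len > dst_cap) {
        return -1;
    }

    memcpy(dst, src + 4, len);
    return (long)len;
}
```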
Fixed Huffman (BTYPE = 01)
- Use predefined code table (RFC 1951 Section 3.2.6)
- Decode symbols until end-of-block (256)
- Perform LZ77 decompression
Dynamic Huffman (BTYPE = 10)
- Read tree metadata (HLIT, HDIST, HCLEN)
- Decode code length tree
- Decode literal/length and distance trees
- Decode symbols until end-of-block (256)
- Perform LZ77 decompression
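Once the code lengths for a tree have been decoded, the codes follow from the canonical assignment in RFC 1951 Section 3.2.2. The sketch below implements those three steps; `assign_canonical_codes` and `MAX_CODE_BITS` are hypothetical names, and a real decoder would typically build a lookup table from the result rather than keep the raw codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CODE_BITS 15  /* longest code length DEFLATE allows */

/* Assign canonical Huffman codes from code lengths.
 * codes[i] is meaningful only where lengths[i] != 0. */
static void assign_canonical_codes(const uint8_t *lengths, size_t n,
                                   uint16_t *codes)
{
    unsigned bl_count[MAX_CODE_BITS + 1] = {0};
    unsigned next_code[MAX_CODE_BITS + 1] = {0};

    /* Step 1: count how many codes exist for each length. */
    for (size_t i = 0; i < n; i++) {
        assert(lengths[i] <= MAX_CODE_BITS);
        bl_count[lengths[i]]++;
    }
    bl_count[0] = 0;  /* length 0 means "symbol unused" */

    /* Step 2: compute the smallest code value for each length. */
    unsigned code = 0;
    for (int bits = 1; bits <= MAX_CODE_BITS; bits++) {
        code = (code + bl_count[bits - 1]) << 1;
        next_code[bits] = code;
    }

    /* Step 3: hand out consecutive codes per length, in symbol order. */
    for (size_t i = 0; i < n; i++) {
        if (lengths[i] != 0) {
            codes[i] = (uint16_t)next_code[lengths[i]]++;
        } else {
            codes[i] = 0;
        }
    }
}
```

With the RFC's own example lengths (3, 3, 3, 3, 3, 2, 4, 4) for symbols A through H, this produces the codes 010, 011, 100, 101, 110, 00, 1110, 1111.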
Error Conditions
- Returns NULL if bytes is NULL
- Returns NULL if magic bytes are invalid
- Returns NULL if BTYPE = 11 (reserved)
- Returns NULL if CRC32 validation fails
- Returns NULL if size validation fails
- Returns NULL if Huffman decoding fails
- Returns NULL if LZ77 decompression encounters invalid distance codes (30, 31)
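The LZ77 failure cases can be illustrated with a back-reference copy that validates its inputs before writing. This is a sketch under assumptions: `lz77_copy` and `distance_symbol_is_valid` are hypothetical names, and rejecting a distance that reaches before the start of the output is a check added here rather than one the list above mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance symbols 30 and 31 are reserved and must not appear. */
static int distance_symbol_is_valid(unsigned symbol)
{
    return symbol < 30;
}

/* Append `length` bytes copied from `distance` bytes back in `out`.
 * Returns the new output length, or (size_t)-1 if the reference points
 * before the start of the output or the copy would overflow `out`. */
static size_t lz77_copy(uint8_t *out, size_t out_len, size_t out_cap,
                        size_t distance, size_t length)
{
    assert(out != NULL && out_len <= out_cap);
    if (distance == 0 || distance > out_len) {
        return (size_t)-1;
    }
    if (length > out_cap - out_len) {
        return (size_t)-1;
    }

    /* Copy byte by byte: when length > distance the source and
     * destination overlap, which repeats the most recent bytes. */
    for (size_t i = 0; i < length; i++) {
        out[out_len + i] = out[out_len - distance + i];
    }
    return out_len + length;
}
```

Starting from the two bytes "ab", a copy with distance 2 and length 5 extends the output to "abababa". A `memcpy` would not be safe here because of the overlap.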
Memory Management
The returned buffer is dynamically allocated using malloc(). The caller must free it using free().
Example
parse_member
Parses a GZIP member header from a file stream.
Pointer to an open file with the file position at the start of a GZIP member
Pointer to a gz_header_t structure to receive the parsed header data
Returns 0 on success, -1 on error
Behavior
- Validates magic bytes (0x1F 0x8B)
- Reads compression method (CM)
- Reads flags byte (FLG)
- Reads modification time (MTIME, 4 bytes, little-endian)
- Reads extra flags (XFL)
- Reads OS identifier
- If F_EXTRA flag set:
  - Reads extra field length (2 bytes, little-endian)
  - Allocates and reads extra field data
- If F_NAME flag set:
  - Reads zero-terminated filename string
  - Allocates memory for filename
- If F_COMMENT flag set:
  - Reads zero-terminated comment string
  - Allocates memory for comment
- If F_HCRC flag set:
  - Reads 2-byte header CRC
- Seeks to trailer (last 8 bytes)
- Reads CRC32 (4 bytes, little-endian)
- Reads uncompressed size (4 bytes, little-endian)
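The fixed 10-byte part of the header can be decoded as sketched below. To stay self-contained the example reads from a memory buffer rather than a file stream, and `fixed_header_t` with its field names and `parse_fixed_header` are hypothetical; they are not the fields of `gz_header_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the fixed 10-byte part of a GZIP header. */
typedef struct {
    uint8_t  cm;     /* compression method */
    uint8_t  flg;    /* flags byte */
    uint32_t mtime;  /* modification time, decoded from little-endian */
    uint8_t  xfl;    /* extra flags */
    uint8_t  os;     /* operating system identifier */
} fixed_header_t;

/* Decode ID1, ID2, CM, FLG, MTIME, XFL, and OS from `buf`.
 * Returns 0 on success, -1 on a short buffer or wrong magic bytes. */
static int parse_fixed_header(const uint8_t *buf, size_t len,
                              fixed_header_t *out)
{
    assert(out != NULL);
    if (buf == NULL || len < 10) {
        return -1;
    }
    if (buf[0] != 0x1F || buf[1] != 0x8B) {
        return -1;
    }

    out->cm  = buf[2];
    out->flg = buf[3];
    /* Assemble MTIME byte by byte so the result does not depend on
     * the host's endianness. */
    out->mtime = (uint32_t)buf[4]
               | ((uint32_t)buf[5] << 8)
               | ((uint32_t)buf[6] << 16)
               | ((uint32_t)buf[7] << 24);
    out->xfl = buf[8];
    out->os  = buf[9];
    return 0;
}
```

The optional fields that follow are variable-length, so they have to be walked in the order listed above, guided by the bits in `flg`.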
Error Conditions
- Returns -1 if file is NULL
- Returns -1 if header is NULL
- Returns -1 if magic bytes are incorrect
- Returns -1 if file read fails
- Returns -1 if memory allocation fails
Memory Management
The function allocates memory for extra, name, and comment fields if present. The caller is responsible for freeing these using free() when done.
Example
skip_gz_header_to_compressed_data
Skips the GZIP header and positions the file pointer at the compressed data.
Pointer to an open file positioned at the start of a GZIP member
Pointer to a gz_header_t structure to receive header metadata
Returns 0 on success, -1 on error
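To make the skipping logic concrete, the sketch below computes the offset of the compressed data within an in-memory member instead of moving a file pointer. It is an illustration only: `compressed_data_offset`, `skip_zstring`, and the `FLAG_*` macros are hypothetical names, and the flag bit values are taken from RFC 1952 because the module's own flag constants are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FLG bit values from RFC 1952 (hypothetical macro names). */
#define FLAG_HCRC    0x02u
#define FLAG_EXTRA   0x04u
#define FLAG_NAME    0x08u
#define FLAG_COMMENT 0x10u

/* Skip a zero-terminated string starting at `pos`. Returns the position
 * just past the terminator, or 0 if no terminator is found. */
static size_t skip_zstring(const uint8_t *buf, size_t len, size_t pos)
{
    while (pos < len) {
        if (buf[pos++] == 0) {
            return pos;
        }
    }
    return 0;
}

/* Return the offset of the first compressed byte, or -1 on error. */
static long compressed_data_offset(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 10 || buf[0] != 0x1F || buf[1] != 0x8B) {
        return -1;
    }
    uint8_t flg = buf[3];
    size_t pos = 10;  /* fixed fields: ID1 ID2 CM FLG MTIME XFL OS */

    if (flg & FLAG_EXTRA) {
        if (len - pos < 2) return -1;
        size_t xlen = (size_t)buf[pos] | ((size_t)buf[pos + 1] << 8);
        pos += 2;
        if (len - pos < xlen) return -1;
        pos += xlen;
    }
    if (flg & FLAG_NAME) {
        pos = skip_zstring(buf, len, pos);
        if (pos == 0) return -1;
    }
    if (flg & FLAG_COMMENT) {
        pos = skip_zstring(buf, len, pos);
        if (pos == 0) return -1;
    }
    if (flg & FLAG_HCRC) {
        if (len - pos < 2) return -1;
        pos += 2;
    }
    return (long)pos;
}
```

A file-based version would perform the same walk with `fread` and `fseek`, leaving the stream positioned at the returned offset.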
GZIP File Format
Member Structure
A GZIP file consists of one or more “members” concatenated together. Each member consists of a header, the DEFLATE-compressed blocks, and a trailer.
Header Layout
- ID1, ID2: Magic bytes (0x1F, 0x8B)
- CM: Compression method (8 = DEFLATE)
- FLG: Flags byte
- MTIME: Modification time (4 bytes, little-endian)
- XFL: Extra flags
- OS: Operating system
- XLEN + Extra: If F_EXTRA set
- Filename: If F_NAME set (zero-terminated)
- Comment: If F_COMMENT set (zero-terminated)
- CRC16: If F_HCRC set (2 bytes)
Trailer Layout
- CRC32: CRC-32 of uncompressed data (4 bytes, little-endian)
- ISIZE: Uncompressed size modulo 2^32 (4 bytes, little-endian)
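Reading the trailer back is the mirror image of writing it. The sketch below assumes a buffer that holds exactly one member, so the trailer is the last 8 bytes; `get_le32` and `read_trailer` are hypothetical names, and the 18-byte minimum is simply the 10-byte fixed header plus the 8-byte trailer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit little-endian value. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read CRC32 and ISIZE from the last 8 bytes of a single-member buffer.
 * Returns 0 on success, -1 if the buffer is too short to hold a header
 * and a trailer. */
static int read_trailer(const uint8_t *member, size_t len,
                        uint32_t *crc32, uint32_t *isize)
{
    assert(crc32 != NULL && isize != NULL);
    if (member == NULL || len < 18) {
        return -1;
    }
    const uint8_t *trailer = member + len - 8;
    *crc32 = get_le32(trailer);
    *isize = get_le32(trailer + 4);
    return 0;
}
```

Because ISIZE is stored modulo 2^32, it matches the true length only for inputs smaller than 4 GiB; a decompressor should compare it against its own byte count reduced modulo 2^32.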
Magic Number
Bit Packing
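From RFC 1951: data elements are packed into bytes starting with the least-significant bit of the byte. Data elements other than Huffman codes are packed starting with the least-significant bit of the data element, while Huffman codes are packed starting with the most-significant bit of the code.
See Also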
- LZ77 Compression API - Low-level LZ77 compression
- Huffman Coding API - Huffman encoding/decoding
- CRC API - CRC computation for GZIP