Introduction
In this assignment, you will debug and optimize a program that works with gzip-compressed files. This builds on the previous PNG homework, since PNG image data is compressed with DEFLATE, the same algorithm gzip uses.
What is GZIP?
GZIP is a file format used for lossless data compression. The compression is achieved through two main algorithms:
- LZ77 - Replaces repeated sequences with references to an earlier occurrence
- Huffman Coding - Encodes frequently occurring symbols with fewer bits
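To make the LZ77 idea concrete, here is a minimal sketch of how a decoder might expand one back-reference. The name lz77_copy_match and its signature are hypothetical and are not part of the base code; the sketch assumes the whole output sits in one in-memory buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: append `length` bytes to `out`, copied from
 * `distance` bytes behind the current write position `*pos`.
 * Returns 0 on success, -1 if the reference is invalid or `out` is full. */
int lz77_copy_match(unsigned char *out, size_t cap, size_t *pos,
                    size_t distance, size_t length)
{
    assert(out != NULL && pos != NULL);
    if (*pos > cap)
        return -1;                  /* write position is out of range */
    if (distance == 0 || distance > *pos)
        return -1;                  /* reference points before the start */
    if (length > cap - *pos)
        return -1;                  /* not enough room in the output */

    /* Copy byte by byte: source and destination may overlap when
     * length > distance, which is how a short pattern gets repeated. */
    for (size_t i = 0; i < length; i++) {
        out[*pos] = out[*pos - distance];
        (*pos)++;
    }
    return 0;
}
```

With "abc" already in the buffer, a reference with distance 3 and length 6 yields "abcabcabc". The copy deliberately avoids memcpy because the two regions can overlap.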
Learning Objectives
This homework is designed to familiarize you with:
- Debugging and profiling C programs
- Algorithms and data structures in C
- Buffered I/O operations
- Memory allocation and management
- Bit-level data manipulation
- Understanding compression algorithms
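As a taste of the bit-level work involved, the following sketch reads bit fields from a byte buffer starting at the least significant bit of each byte, which is the order RFC 1951 uses for non-Huffman fields. The struct bit_reader type and the read_bits function are hypothetical names chosen for illustration, and the sketch assumes the input is already in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader over an in-memory buffer. */
struct bit_reader {
    const uint8_t *data;  /* input bytes */
    size_t len;           /* number of input bytes */
    size_t bit_pos;       /* index of the next unread bit */
};

/* Read `count` bits (0..32), least significant bit first.
 * Returns 0 on success, -1 if fewer than `count` bits remain. */
int read_bits(struct bit_reader *br, unsigned count, uint32_t *value)
{
    assert(br != NULL && value != NULL && count <= 32);
    if (br->bit_pos > br->len * 8 || count > br->len * 8 - br->bit_pos)
        return -1;

    uint32_t v = 0;
    for (unsigned i = 0; i < count; i++) {
        size_t byte = br->bit_pos >> 3;                  /* which byte */
        unsigned bit = (br->data[byte] >> (br->bit_pos & 7)) & 1u;
        v |= (uint32_t)bit << i;                         /* first bit read is bit 0 */
        br->bit_pos++;
    }
    *value = v;
    return 0;
}
```

For the byte 0x8D, reading 1 bit, then 2 bits, then 5 bits gives 1, 2, and 17. A failed read leaves bit_pos unchanged, so the caller can report the error without losing its place.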
Specifications
The official specifications for the GZIP format are available at:
- RFC 1951 - DEFLATE block format and algorithms
- RFC 1952 - GZIP file format
- zlib.net/feldspar.html - Simplified algorithm descriptions
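As an illustration of what RFC 1952 specifies, the sketch below checks the fixed 10-byte header that starts every gzip member: two magic bytes (0x1f, 0x8b), the compression method (8 for DEFLATE), a flags byte whose top three bits are reserved, and a little-endian modification time. The function names looks_like_gzip and gzip_mtime are hypothetical; the base code may structure this differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GZIP_ID1 0x1f              /* first magic byte */
#define GZIP_ID2 0x8b              /* second magic byte */
#define GZIP_CM_DEFLATE 8          /* compression method: DEFLATE */
#define GZIP_FIXED_HEADER_LEN 10   /* ID1 ID2 CM FLG MTIME(4) XFL OS */

/* Hypothetical helper: returns 1 if `buf` begins with a plausible
 * gzip member header, 0 otherwise. */
int looks_like_gzip(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < GZIP_FIXED_HEADER_LEN)
        return 0;
    if (buf[0] != GZIP_ID1 || buf[1] != GZIP_ID2)
        return 0;
    if (buf[2] != GZIP_CM_DEFLATE)
        return 0;
    if (buf[3] & 0xe0)             /* FLG bits 5-7 are reserved, must be 0 */
        return 0;
    return 1;
}

/* Hypothetical helper: decode the 4-byte little-endian MTIME field.
 * The caller must have validated the header length first. */
uint32_t gzip_mtime(const uint8_t *buf)
{
    assert(buf != NULL);
    return (uint32_t)buf[4]
         | (uint32_t)buf[5] << 8
         | (uint32_t)buf[6] << 16
         | (uint32_t)buf[7] << 24;
}
```

Assembling MTIME from individual bytes, instead of casting the buffer to a uint32_t pointer, keeps the code independent of host byte order and alignment.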
Assignment Structure
The assignment is divided into several parts:
Part 1: Argument Validation
Implement command-line argument processing for compress, decompress, and info operations
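One possible shape for mode selection is sketched below. The flag spellings "-c", "-d", and "-i" are placeholders invented for this example, as are the names enum mode, parse_mode, and select_mode; use the flags and rules given in the argument-processing section instead. The sketch assumes exactly one mode flag must be present.

```c
#include <assert.h>
#include <string.h>

enum mode { MODE_INVALID, MODE_COMPRESS, MODE_DECOMPRESS, MODE_INFO };

/* Map one argument to a mode. The flag spellings are placeholders. */
enum mode parse_mode(const char *arg)
{
    assert(arg != NULL);
    if (strcmp(arg, "-c") == 0) return MODE_COMPRESS;
    if (strcmp(arg, "-d") == 0) return MODE_DECOMPRESS;
    if (strcmp(arg, "-i") == 0) return MODE_INFO;
    return MODE_INVALID;
}

/* Scan argv for a mode flag. Returns MODE_INVALID when no mode flag
 * or more than one mode flag is given. */
enum mode select_mode(int argc, char *const argv[])
{
    assert(argv != NULL);
    enum mode chosen = MODE_INVALID;
    for (int i = 1; i < argc; i++) {
        enum mode m = parse_mode(argv[i]);
        if (m == MODE_INVALID)
            continue;                 /* not a mode flag, skip it */
        if (chosen != MODE_INVALID)
            return MODE_INVALID;      /* two mode flags conflict */
        chosen = m;
    }
    return chosen;
}
```

Returning an enum keeps validation separate from the work each mode performs, which also makes the function easy to unit test.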
Part 2: LZ77 Compression
Implement the Lempel-Ziv compression algorithm to find and encode repeated sequences
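The core of the compression side is finding, for the current position, the longest earlier sequence that matches. Below is a brute-force sketch of that search. It is not the required implementation and is far slower than the hash-based searches used in practice. The names struct lz_match and find_longest_match are hypothetical; the limits (distance up to 32768, length 3 to 258) are those DEFLATE can encode per RFC 1951.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW_SIZE 32768   /* maximum back-reference distance in DEFLATE */
#define MIN_MATCH 3         /* shortest match DEFLATE can encode */
#define MAX_MATCH 258       /* longest match DEFLATE can encode */

struct lz_match {
    size_t distance;        /* how far back the match starts */
    size_t length;          /* 0 means "no match, emit a literal" */
};

/* Brute-force search for the longest match for data[pos..] that
 * starts earlier within the window. */
struct lz_match find_longest_match(const unsigned char *data,
                                   size_t len, size_t pos)
{
    assert(data != NULL && pos <= len);
    struct lz_match best = {0, 0};
    size_t start = pos > WINDOW_SIZE ? pos - WINDOW_SIZE : 0;
    size_t limit = len - pos < MAX_MATCH ? len - pos : MAX_MATCH;

    for (size_t cand = start; cand < pos; cand++) {
        size_t n = 0;
        /* The match may run past pos (overlap), which is allowed. */
        while (n < limit && data[cand + n] == data[pos + n])
            n++;
        /* >= keeps the nearest candidate among equally long matches. */
        if (n >= MIN_MATCH && n >= best.length) {
            best.distance = pos - cand;
            best.length = n;
        }
    }
    return best;
}
```

For the input "abcabcabcx", the search at position 3 returns distance 3 and length 6. Preferring the nearest match on ties is a design choice in this sketch, made because smaller distances generally cost fewer bits to encode.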
Code Organization
The base code is organized as follows:
The main.c file MUST ONLY contain #includes, local #defines, and the main function. All other functions must be in separate .c files. This restriction is required for Criterion testing.
Allowed Libraries
You may use:
- glibc (GNU standard C library)
- Dynamic memory allocation (malloc, calloc, free)
- Standard I/O functions (fread, fwrite, etc.)
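As an example of buffered I/O with these functions, the sketch below copies one stream to another in fixed-size chunks. The function name copy_stream and the 4096-byte chunk size are arbitrary choices for illustration, not requirements of the assignment.

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK_SIZE 4096   /* arbitrary buffer size chosen for this sketch */

/* Copy everything from `in` to `out` in CHUNK_SIZE pieces.
 * Stores the byte count in `*copied`. Returns 0 on success, -1 on error. */
int copy_stream(FILE *in, FILE *out, size_t *copied)
{
    assert(in != NULL && out != NULL && copied != NULL);
    unsigned char buf[CHUNK_SIZE];
    size_t n;

    *copied = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;                /* short write */
        *copied += n;
    }
    /* fread returns 0 at end of file and on error; ferror tells them apart. */
    return ferror(in) ? -1 : 0;
}
```

Passing an element size of 1 makes the return value of fread a byte count, so a final partial chunk is handled the same way as a full one.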
Output Requirements
- For normal operation, follow the exact format specified for each mode
- Use the debug macro for debugging output (only shown when compiled with make debug)
- In production mode, no extraneous output is allowed
- Successful execution: exit with EXIT_SUCCESS (0)
- Error conditions: print a one-line error to stderr and exit with EXIT_FAILURE (1)
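The exit-status rules can be sketched as follows. The helper names fail and run, and the "error: " message prefix, are assumptions made for this example; the required message format may differ. Because the logic lives outside main, main itself can simply return the result of run.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print a one-line error to stderr and hand back
 * the failure status so the caller can return it. */
int fail(const char *msg)
{
    assert(msg != NULL);
    fprintf(stderr, "error: %s\n", msg);   /* prefix is an assumption */
    return EXIT_FAILURE;
}

/* Hypothetical driver kept outside main.c; main would return run(argc, argv). */
int run(int argc, char **argv)
{
    (void)argv;                            /* real code would parse argv here */
    if (argc < 2)
        return fail("missing arguments");
    return EXIT_SUCCESS;
}
```

Returning the status up to main, instead of calling exit deep inside helper functions, gives callers a chance to free memory and close files first.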
Building the Project
Use the provided Makefile:
Testing
The tests in the template repository are for basic functionality checks only. They are NOT the same tests used for grading. You should write your own comprehensive tests to ensure your code handles all required functionality.
Getting Help
Reference materials:
Next Steps
Continue to the following sections to learn about:
- Assignment requirements and deliverables
- Grading criteria
- Command-line argument processing
- LZ77 algorithm implementation
- Huffman coding implementation
- GZIP file format and DEFLATE/INFLATE algorithms