Skip to main content
The Bronze layer is the first stage of the data warehouse pipeline, responsible for ingesting raw data from source systems without any transformations. This layer serves as a historical archive of source system data.

Purpose

The Bronze layer provides:

Raw Data Storage

Preserves original data format and structure from source systems

Historical Archive

Maintains a complete history of data as received from sources

Audit Trail

Enables data lineage tracking and compliance requirements

Reprocessing

Allows downstream transformations to be rebuilt from source

Data Sources

The Bronze layer ingests data from two primary source systems:
Customer Relationship Management data including:
  • Customer information (cust_info.csv)
  • Product catalog (prd_info.csv)
  • Sales transactions (sales_details.csv)

Bronze Tables

The Bronze schema contains six tables that mirror the source system structures:

CRM Tables

Customer information from the CRM system.
CREATE TABLE bronze.crm_cust_info (
    cst_id INT,
    cst_key VARCHAR(50),
    cst_firstname VARCHAR(100),
    cst_lastname VARCHAR(100),
    cst_marital_status VARCHAR(100),
    cst_gndr VARCHAR(10),
    cst_create_date DATE
);
Key Fields:
  • cst_id: Customer identifier
  • cst_key: Customer business key
  • cst_gndr: Gender (raw codes: M/F)
  • cst_marital_status: Marital status (raw codes: S/M)
Product catalog from the CRM system.
CREATE TABLE bronze.crm_prd_info (
    prd_id INT,
    prd_key VARCHAR(50),
    prd_name VARCHAR(100),
    prd_cost DECIMAL(10, 2),
    prd_line CHAR(10),
    prd_start_dt DATE,
    prd_end_dt DATE
);
Key Fields:
  • prd_id: Product identifier
  • prd_key: Product business key (contains category prefix)
  • prd_line: Product line code (M/R/S/T)
  • prd_start_dt/prd_end_dt: Product validity period
Sales transaction details from the CRM system.
CREATE TABLE bronze.crm_sales_details (
    sls_ord_num VARCHAR(50),
    sls_prd_key VARCHAR(50),
    sls_cust_id INT,
    sls_ord_dt INT,
    sls_ship_dt INT,
    sls_due_dt INT,
    sls_sales INT,
    sls_quantity INT,
    sls_price DECIMAL(10, 2)
);
Date fields are stored as integers in YYYYMMDD format and require transformation in the Silver layer.

ERP Tables

Customer demographic data from the ERP system.
CREATE TABLE bronze.erp_cust_az12 (
    cid VARCHAR(50),
    bdate DATE,
    gen VARCHAR(10)
);
Customer location data from the ERP system.
CREATE TABLE bronze.erp_loc_a101 (
    cid VARCHAR(50),
    cntry VARCHAR(100)
);
Product category and maintenance information.
CREATE TABLE bronze.erp_px_cat_g1v2 (
    id VARCHAR(50),
    cat VARCHAR(50),
    subcat VARCHAR(50),
    manteinance VARCHAR(50)
);

Loading Process

Data is loaded into Bronze tables using PostgreSQL’s COPY command for efficient bulk loading:
1

Truncate Target Table

Each table is truncated before loading to ensure a clean load:
TRUNCATE TABLE bronze.crm_cust_info;
2

Execute COPY Command

Data is loaded directly from CSV files using the COPY command:
COPY bronze.crm_cust_info
  FROM '/datasets/source_crm/cust_info.csv'
  WITH (FORMAT csv, HEADER true);
3

Validate Load

The procedure tracks success/failure and execution time for each table load.

Loading Procedure

The bronze.load_bronze() procedure orchestrates the entire Bronze layer data load:
CALL bronze.load_bronze();
The procedure loads all six tables sequentially, with error handling for each table. If any load fails, the procedure raises an exception with detailed error information.

File Locations

Source files are organized by system:
/datasets/source_crm/
├── cust_info.csv
├── prd_info.csv
└── sales_details.csv

Next Steps

Silver Layer Transformations

Learn how Bronze data is cleansed and standardized in the Silver layer

ETL Procedures

Detailed documentation on ETL orchestration

Gold Layer

Business-ready analytics models

Build docs developers (and LLMs) love