Skip to main content
The Silver layer applies data quality rules, standardization, and cleansing transformations to raw Bronze data. This layer ensures data consistency and prepares data for business consumption.

Purpose

The Silver layer provides:

Data Cleansing

Removes duplicates, handles nulls, and corrects data quality issues

Standardization

Converts codes to readable values and standardizes formats

Validation

Applies business rules and data validation logic

Enrichment

Derives new fields and calculates computed values

Transformation Types

Deduplication

Removes duplicate records using window functions to keep only the most recent version:
SELECT *, 
  ROW_NUMBER() OVER (
    PARTITION BY cst_id 
    ORDER BY cst_create_date DESC
  ) AS flag_last
FROM bronze.crm_cust_info
WHERE flag_last = 1;
This pattern ensures only the latest customer record is retained based on the creation date.

Code Standardization

Converts abbreviated codes into human-readable values:
CASE upper(trim(cst_gndr)) 
  WHEN 'M' THEN 'Male' 
  WHEN 'F' THEN 'Female' 
  ELSE 'n/a'
END as cst_gndr

Date Formatting

Converts integer date values (YYYYMMDD format) to proper DATE types:
CASE
  WHEN sls_ord_dt IS NULL
    OR sls_ord_dt = 0
    OR length(sls_ord_dt::text) != 8
  THEN NULL
  ELSE to_date(TRIM(sls_ord_dt::text), 'YYYYMMDD')
END AS sls_ord_dt
Invalid dates (null, zero, or incorrect length) are converted to NULL to prevent data quality issues.

Data Validation and Correction

Sales Amount Validation

Ensures sales amounts are consistent with quantity and price:
CASE 
  WHEN sls_sales IS NULL 
    OR sls_sales <= 0 
    OR sls_sales != sls_quantity * abs(sls_price) 
  THEN sls_quantity * abs(sls_price)
  ELSE sls_sales
END as sls_sales

Price Correction

Handles negative or null prices:
CASE 
  WHEN sls_price IS NULL 
    OR sls_price < 0 
    OR sls_price = 0 
  THEN sls_sales / abs(sls_quantity)
  WHEN sls_price < 0 
  THEN abs(sls_price)
  ELSE sls_price
END as sls_price

Data Enrichment

Category Extraction

Extracts category ID from product key:
SELECT 
  prd_id,
  replace(substring(trim(prd_key), 1, 5), '-','_') as cat_id,
  substring(prd_key, 7, length(prd_key)) as prd_key,
  ...
FROM bronze.crm_prd_info

Product End Date Calculation

Uses LEAD window function to calculate product end dates:
lead(prd_start_dt) OVER (
  PARTITION BY prd_key 
  ORDER BY prd_start_dt
) - 1 as prd_end_dt

Customer ID Normalization

Removes prefixes and standardizes customer IDs:
CASE 
  WHEN cid LIKE 'NAS%' THEN substring(cid, 4, length(cid))
  ELSE cid
END as cid

Transformation Examples

Customer Information Transformation

SELECT 
  cst_id,
  cst_key,
  cst_firstname,
  cst_gndr,        -- 'M' or 'F'
  cst_marital_status  -- 'S' or 'M'
FROM bronze.crm_cust_info;

Sales Details Transformation

-- Dates stored as integers: 20240315
sls_ord_dt INT,
sls_ship_dt INT,
sls_sales INT,
sls_price DECIMAL(10, 2)

Silver Tables

Each Silver table includes a dwh_create_date timestamp for audit tracking:
Cleansed customer information with standardized gender and marital status codes.Added Transformations:
  • Deduplication by customer ID
  • Gender code expansion (M → Male, F → Female)
  • Marital status code expansion (S → Single, M → Married)
  • Trimmed whitespace from string fields
Product catalog with extracted category ID and calculated end dates.Added Transformations:
  • Category ID extracted from product key
  • Product line code expansion
  • Cost NULL handling (defaults to 0)
  • Product end date calculation using LEAD window function
Sales transactions with validated amounts and converted dates.Added Transformations:
  • Integer dates converted to DATE type
  • Sales amount validation against quantity × price
  • Price validation and correction
  • Invalid date handling
ERP customer demographics with normalized IDs.Added Transformations:
  • ‘NAS’ prefix removal from customer IDs
  • Future birth date validation
  • Gender code standardization
Location data with standardized country names.Added Transformations:
  • Hyphen removal from customer IDs
  • Country code standardization
Product category information (minimal transformation).

Loading Process

The silver.load_silver() procedure executes transformations:
1

Truncate Target Table

Each Silver table is truncated before loading
2

Execute Transformation

INSERT INTO statement with transformation logic applied
3

Log Progress

Execution time and status logged for each table
4

Handle Errors

Any failures trigger exception with detailed error message
CALL silver.load_silver();
The procedure processes all six tables sequentially with comprehensive error handling and performance logging.

Data Quality Rules

No Duplicates

Only the latest version of each record is retained

Valid Dates

All dates are validated and invalid values converted to NULL

Consistent Codes

All code values expanded to readable descriptions

Accurate Calculations

Sales amounts validated against quantity and price

Next Steps

Gold Layer Analytics

Learn how Silver data is modeled into star schema for analytics

Bronze Layer

Raw data ingestion layer

ETL Procedures

Detailed ETL orchestration documentation

Build docs developers (and LLMs) love