Skip to main content
The Gold layer represents the final stage of the data warehouse pipeline, providing business-ready analytics models organized in a star schema. This layer is optimized for reporting, dashboards, and data analysis.

Purpose

The Gold layer delivers:

Star Schema

Dimensional model optimized for analytics queries

Business Context

Data enriched with business meaning and relationships

Query Performance

Denormalized structures for fast query response

Self-Service Analytics

Easy-to-understand models for business users

Star Schema Architecture

The Gold layer implements a classic star schema with one fact table and two dimension tables:
The star schema design allows for simple, efficient joins and intuitive query patterns for business analysts.

Dimension Tables

dim_customer

Provides a 360-degree view of customers by combining CRM and ERP data sources.
CREATE VIEW gold.dim_customer AS 
SELECT 
    ROW_NUMBER() OVER (ORDER BY cst_id) as customer_key,
    ci.cst_id as customer_id,
    ci.cst_key as customer_number,
    ci.cst_firstname as first_name,
    ci.cst_lastname as last_name,
    ci.cst_marital_status as marital_status,
    CASE 
        WHEN ci.cst_gndr != 'n/a' THEN ci.cst_gndr 
        ELSE coalesce(ca.gen, 'n/a') 
    END as gender,
    ca.bdate as birth_date,
    ci.cst_create_date as create_date
FROM silver.crm_cust_info ci
LEFT JOIN silver.erp_cust_az12 ca
    ON ci.cst_key = ca.cid
LEFT JOIN silver.erp_loc_a101 la
    ON ci.cst_key = la.cid;
Key Fields:
  • customer_key: Surrogate key for dimension
  • customer_id: Natural key from source system
  • customer_number: Business key for customer lookup
  • first_name, last_name: Customer name attributes
  • marital_status: Single, Married, or n/a
  • gender: Male, Female, or n/a (with ERP fallback)
  • birth_date: Date of birth from ERP system
  • create_date: Customer creation date in CRM
This dimension integrates data from three Silver tables:
  • silver.crm_cust_info: Primary customer attributes
  • silver.erp_cust_az12: Birth date and gender (fallback)
  • silver.erp_loc_a101: Location information
Gender is sourced from CRM, with ERP data used as fallback when CRM has ‘n/a’.
Uses ROW_NUMBER() to generate a sequential surrogate key:
ROW_NUMBER() OVER (ORDER BY cst_id) as customer_key
This provides a stable, integer key for fact table joins.

dim_product

Provides comprehensive product information with category hierarchy.
CREATE VIEW gold.dim_product AS
SELECT
    ROW_NUMBER() OVER (
        ORDER BY pn.prd_start_dt, pn.prd_key
    ) as product_key,
    pn.prd_id as product_id,
    pn.prd_key as product_number,
    pn.prd_name as product_name,
    pn.cat_id as category_id,
    pc.cat as category,
    pc.subcat as subcategory,
    pc.manteinance,
    pn.prd_cost as product_cost,
    pn.prd_line as product_line,
    pn.prd_start_dt as product_start_date
FROM silver.crm_prd_info pn
LEFT JOIN silver.erp_px_cat_g1v2 pc
    ON pn.cat_id = pc.id
WHERE pn.prd_end_dt IS NULL;
Key Fields:
  • product_key: Surrogate key for dimension
  • product_id: Natural key from source system
  • product_number: Business key for product lookup
  • product_name: Product name/description
  • category_id: Category identifier
  • category: Product category (e.g., Bikes, Accessories)
  • subcategory: Product subcategory for detailed grouping
  • manteinance: Maintenance classification
  • product_cost: Standard cost of product
  • product_line: Mountain, Road, Touring, Other sales
  • product_start_date: Product effective date
Product categorization is enriched from ERP data:
category_id → category → subcategory
This enables drill-down analysis from category to subcategory to individual products.
The dimension only includes currently active products:
WHERE pn.prd_end_dt IS NULL
Historical products with end dates are excluded from this view. Consider creating a separate view for historical product analysis.
Uses ROW_NUMBER() ordered by start date and product key:
ROW_NUMBER() OVER (
    ORDER BY pn.prd_start_dt, pn.prd_key
) as product_key

Fact Table

fact_sales

Contains sales transactions with foreign keys to dimension tables.
CREATE VIEW gold.fact_sales AS
SELECT
    si.sls_ord_num as order_number,
    pr.product_key,
    cu.customer_key,
    si.sls_ord_dt as order_date,
    si.sls_ship_dt as shipphing_date,
    si.sls_sales as sales_amount,
    si.sls_quantity as quantity,
    si.sls_price as price
FROM silver.crm_sales_details si
LEFT JOIN gold.dim_product pr
    ON si.sls_prd_key = pr.product_number
LEFT JOIN gold.dim_customer cu
    ON si.sls_cust_id = cu.customer_id;
Key Fields:
  • order_number: Sales order identifier (degenerate dimension)
  • product_key: Foreign key to dim_product
  • customer_key: Foreign key to dim_customer
  • order_date: Date order was placed
  • shipphing_date: Date order was shipped
Measures:
  • sales_amount: Total sales value
  • quantity: Number of units sold
  • price: Unit price
The grain of the fact table is:
One row per order line item - Each row represents a single product on a sales order.
This allows analysis at the order line level with the ability to aggregate to order or customer level.
The fact table provides these measure types:
These measures can be summed across all dimensions:
  • sales_amount: Total revenue
  • quantity: Total units sold
Fact table uses LEFT JOINs to dimensions:
LEFT JOIN gold.dim_product pr
    ON si.sls_prd_key = pr.product_number
LEFT JOIN gold.dim_customer cu
    ON si.sls_cust_id = cu.customer_id
LEFT JOINs preserve all sales transactions even if dimension lookups fail. Monitor for NULL dimension keys which indicate orphaned transactions.

Analytics Use Cases

SELECT 
    p.product_line,
    p.category,
    SUM(f.sales_amount) as total_sales,
    SUM(f.quantity) as total_units,
    COUNT(DISTINCT f.order_number) as order_count
FROM gold.fact_sales f
INNER JOIN gold.dim_product p 
    ON f.product_key = p.product_key
GROUP BY p.product_line, p.category
ORDER BY total_sales DESC;

Data Refresh

Gold layer views are automatically refreshed when queried since they’re based on Silver tables:
1

Load Bronze Layer

CALL bronze.load_bronze();
2

Load Silver Layer

CALL silver.load_silver();
3

Query Gold Views

Views automatically reflect latest Silver data:
SELECT * FROM gold.fact_sales;
Gold views are not materialized, so they always show current data from Silver tables. For better performance on large datasets, consider materializing these views.

Best Practices

Always Join on Keys

Use surrogate keys (customer_key, product_key) for joins, not natural keys

Filter Early

Apply date filters before joining to reduce dataset size

Index Dimensions

Consider indexing frequently filtered dimension attributes

Monitor Orphans

Check for NULL dimension keys in fact table

Next Steps

ETL Procedures

Learn how to orchestrate the complete pipeline

Silver Layer

Understand data transformations feeding Gold layer

Build docs developers (and LLMs) love