Database Architecture
The PruebaETL system uses a two-tier architecture:- Raw Tables: Store original data exactly as received from CSV files
- Normalized Tables: Store cleaned, validated, and relationally structured data
Raw Tables
Raw tables preserve original data without transformation. All columns use flexible data types (NVARCHAR) to accommodate any input format.clientes_raw
Stores original customer data before normalization.| Column | Type | Description |
|---|---|---|
cliente_id | INT | Original customer identifier |
nombre | NVARCHAR(255) | Customer name as provided |
ciudad | NVARCHAR(100) | City name (not normalized) |
segmento | NVARCHAR(50) | Business segment (not normalized) |
fecha_registro | NVARCHAR(50) | Registration date as string |
fecha_carga | DATETIME | Timestamp of data load |
- No PRIMARY KEY (data may contain duplicates or errors)
- No FOREIGN KEY constraints
fecha_cargaautomatically set to current timestamp- All business data stored as strings for maximum flexibility
ventas_raw
Stores original sales transactions before normalization.| Column | Type | Description |
|---|---|---|
venta_id | INT | Original sale identifier |
cliente_id | INT | Customer reference |
fecha | NVARCHAR(50) | Sale date as string |
total | NVARCHAR(50) | Sale amount as string (may include formatting) |
moneda | NVARCHAR(10) | Currency code |
canal | NVARCHAR(50) | Sales channel name |
fecha_carga | DATETIME | Timestamp of data load |
- No constraints or validations
- Numeric values stored as strings
- Dates stored as strings in original format
Normalized Tables
Normalized tables implement proper data types, constraints, and relationships following database normalization principles.Dimension Tables
ciudades
Stores unique cities referenced by customers.PRIMARY KEYonciudad_idIDENTITY(1,1)for auto-incrementing IDsUNIQUEconstraint onnombre(city names must be unique)NOT NULLonnombre
segmentos
Stores customer business segments.PRIMARY KEYonsegmento_idIDENTITY(1,1)for auto-incrementing IDsUNIQUEconstraint onnombreNOT NULLonnombre
canales
Stores sales channel definitions.PRIMARY KEYoncanal_idIDENTITY(1,1)for auto-incrementing IDsUNIQUEconstraint onnombreNOT NULLonnombre
monedas
Stores currency codes.PRIMARY KEYonmoneda_idIDENTITY(1,1)for auto-incrementing IDsUNIQUEconstraint oncodigoNOT NULLoncodigo
Fact Tables
clientes
Normalized customer table with foreign key relationships.| Column | Type | Constraints | Description |
|---|---|---|---|
cliente_id | INT | PRIMARY KEY | Unique customer identifier |
nombre | NVARCHAR(255) | NOT NULL | Customer name |
ciudad_id | INT | NOT NULL, FK | Reference to ciudades |
segmento_id | INT | NOT NULL, FK | Reference to segmentos |
fecha_registro | DATE | - | Registration date (proper date type) |
fecha_creacion | DATETIME | DEFAULT | Record creation timestamp |
fecha_actualizacion | DATETIME | DEFAULT | Record update timestamp |
- Uses business key as PRIMARY KEY (not auto-generated)
- Enforces referential integrity via FOREIGN KEYs
- Proper DATE type for
fecha_registro - Audit fields for tracking record lifecycle
ventas
Normalized sales transactions table.| Column | Type | Constraints | Description |
|---|---|---|---|
venta_id | INT | PRIMARY KEY | Unique sale identifier |
cliente_id | INT | NOT NULL, FK | Reference to clientes |
fecha | DATE | - | Sale date (proper date type) |
total | DECIMAL(18,2) | NOT NULL | Sale amount with 2 decimal places |
moneda_id | INT | NOT NULL, FK | Reference to monedas |
canal_id | INT | NOT NULL, FK | Reference to canales |
fecha_creacion | DATETIME | DEFAULT | Record creation timestamp |
- Multiple FOREIGN KEY relationships
- Proper DECIMAL type for monetary values
- DATE type for temporal queries
- Referential integrity enforced on all relationships
Indexes
Performance-optimized indexes on foreign keys and commonly queried columns.| Index | Type | Purpose |
|---|---|---|
IX_clientes_ciudad_id | Non-clustered | JOIN optimization with ciudades |
IX_clientes_segmento_id | Non-clustered | JOIN optimization with segmentos |
IX_ventas_cliente_id | Non-clustered | JOIN optimization with clientes |
IX_ventas_fecha | Non-clustered | Date range queries and aggregations |
IX_ventas_moneda_id | Non-clustered | JOIN optimization with monedas |
IX_ventas_canal_id | Non-clustered | JOIN optimization with canales |
- Faster JOIN operations on foreign keys
- Optimized date-based queries on sales data
- Improved aggregation performance
- Reduced query execution time for reports
IDENTITY Columns
Dimension tables use SQL Server’s IDENTITY property for auto-incrementing primary keys:- Seed: 1 (starting value)
- Increment: 1 (step size)
- Automatically generates unique sequential IDs
- Cannot insert explicit values without
SET IDENTITY_INSERT ON - Maintains sequence even after deletions
Entity Relationship Diagram
- One ciudad to many clientes
- One segmento to many clientes
- One cliente to many ventas
- One moneda to many ventas
- One canal to many ventas
Table Creation Order
Tables must be created in dependency order to satisfy foreign key constraints:-
Dimension Tables (no dependencies)
- ciudades
- segmentos
- canales
- monedas
-
Fact Tables (with dependencies)
- clientes (depends on ciudades, segmentos)
- ventas (depends on clientes, monedas, canales)
- Indexes (after all tables exist)
Next Steps
Schema Generation
Learn how DDL scripts are generated
SQL Server Deployment
Deploy the database to SQL Server