π Core Layer
The Core Layer is the main integration and transformation layer β where data from multiple sources is deduplicated, enriched, and prepared for reporting and analytics.

Why?β
- This is where the main business logic, mappings, and data transformations happen β errors here impact the entire warehouse and all downstream reporting.
- Ensures that only high-quality, well-integrated, and business-ready data moves on to semantic or reporting layers.
Where?β
- The Core Layer is positioned after Staging and before Semantic/Reporting layers.
Methodsβ
1. Data Completenessβ
Purpose: Ensures all expected and required records are present after integration, mapping, and transformation.
Why it matters: Missing records can signal data loss during ETL, incorrect joins, or filter logic errors. Missing data here means incomplete analytics and potentially wrong business decisions.
Example:
SELECT *
FROM core_orders
WHERE important_flag IS NULL;
2. Data Uniquenessβ
Purpose: Confirms that fields meant to be unique identifiers (e.g., customer_id, transaction_number) are truly unique after transformation and integration.
Why it matters: Duplicates can distort KPIs, cause double-counting, or break referential integrity, often appearing after merging multiple sources.
Example:
SELECT customer_id, COUNT(*)
FROM core_customers
GROUP BY customer_id
HAVING COUNT(*) > 1;
3. Referential Integrityβ
Purpose: Ensures all foreign keys reference existing primary keys.
Why it matters: Broken referential integrity causes failed joins, orphaned records, and incorrect analytics.
Example:
SELECT order_id
FROM core_orders
WHERE customer_id NOT IN (
SELECT customer_id FROM core_customers
);
4. Record Countβ
Purpose: Compares the number of records between layers (e.g., Staging β Core).
Why it matters: Large differences may signal join/filter issues or ETL bugs. Counts should remain consistent unless business rules explain changes.
Example:
SELECT COUNT(*) FROM staging_orders;
SELECT COUNT(*) FROM core_orders;
5. Attribute Mappingβ
Purpose: Ensures attributes (e.g., statuses, categories) are mapped according to business rules and reference tables.
Why it matters: Incorrect mappings cause misclassification, broken business logic, and reporting errors.
Example:
SELECT *
FROM core_orders
WHERE order_status NOT IN (
SELECT status FROM ref_statuses
);
6. Delta Processingβ
Purpose: Validates that only new or changed data is processed in incremental ETL loads.
Why it matters: Incorrect delta logic leads to missing updates or duplicate records.
Example:
SELECT *
FROM core_delta_load
WHERE load_timestamp > (
SELECT MAX(load_timestamp) FROM previous_loads
);
7. History Handlingβ
Purpose: Checks that historical changes (e.g., Slowly Changing Dimensions) are tracked correctly with valid date ranges.
Why it matters: Poor history handling causes inaccurate historical analytics and audit issues.
Example:
SELECT *
FROM core_customer_history
WHERE valid_to < valid_from;
8. History Start and End Datesβ
Purpose: Ensures all versioned records have valid and non-null date ranges.
Why it matters: Incorrect date ranges prevent accurate reconstruction of historical states.
Example:
SELECT *
FROM core_customer_history
WHERE valid_from IS NULL OR valid_to IS NULL;
9. Full Dump Processingβ
Purpose: Validates that full load jobs correctly reload all data into the Core Layer.
Why it matters: Faulty full loads can miss records or introduce duplicates, making Core inconsistent with upstream layers.
Example:
SELECT COUNT(*) FROM full_load_table;