Skip to main content

πŸ“‹ Test Approaches Catalog

This page lists all available X-AutoMate Test Approaches β€” the standardized checks you can apply across every layer of your data warehouse. Each approach targets a specific DQ Dimension and can be configured for your tables, columns, and business rules.

Note: Many approaches are generic and reusable across layers β€” from Staging, through Core, to Semantic and Reporting. As shown in the Data Model & Architecture page, data flows through multiple layers with transformations at each step. Approaches such as value_unique, value_missing_null, value_empty, reconciliation_count, refferential_integrity, and attribute_mapping can be applied at any layer to enforce quality consistently throughout the pipeline. Layer-specific pages (Core Layer, Semantic Layer, etc.) show how these generic approaches are applied in each context.


🎯 Accuracy​

aggregation_count​

Verifies aggregated row counts (by filter/grouping) match the expected result or reference source.

Example:

SELECT region, COUNT(*) AS cnt
FROM core_sales
GROUP BY region
HAVING cnt <> (SELECT expected_count FROM ref_counts WHERE ref_counts.region = core_sales.region);

aggregation_custom​

Validates user-defined aggregations (e.g., sum/avg/min/max) return expected values for specified slices.

Example:

SELECT department, SUM(salary) AS total
FROM core_employees
GROUP BY department
HAVING SUM(salary) <> (SELECT expected_total FROM ref_budgets WHERE ref_budgets.department = core_employees.department);

reconciliation_count​

Compares record counts across systems/stages (optionally filtered/grouped) to detect losses or duplications.

Example:

SELECT 'staging' AS layer, COUNT(*) FROM staging_orders
UNION ALL
SELECT 'core', COUNT(*) FROM core_orders;

value_maximum​

Ensures values do not exceed a defined maximum threshold.

Example:

SELECT *
FROM orders
WHERE discount_pct > 100;

value_minimum​

Ensures values are not below a defined minimum threshold.

Example:

SELECT *
FROM products
WHERE price < 0;

value_out_of_range​

Validates numeric/date values fall within an expected range for the domain or reference expectation.

Example:

SELECT *
FROM employees
WHERE age NOT BETWEEN 18 AND 120;

πŸ“¦ Completeness​

value_empty​

Checks required fields are not empty/blank and meet the minimum expected population.

Example:

SELECT *
FROM customer
WHERE TRIM(first_name) = '' OR first_name IS NULL;

value_empty_custom​

Checks emptiness using a custom definition (e.g., '', whitespace, 'N/A', sentinel values).

Example:

SELECT *
FROM customer
WHERE email IN ('', 'N/A', 'UNKNOWN') OR email IS NULL;

value_missing_custom​

Detects missing required records/values using custom business keys or expected entity sets.

Example:

SELECT expected.code
FROM (VALUES ('PRIVATE'), ('CORPORATE')) AS expected(code)
LEFT JOIN ref_customer_type rct ON expected.code = rct.code
WHERE rct.code IS NULL;

value_missing_null​

Detects NULLs where values are required (or NULL rate exceeds the acceptable threshold).

Example:

SELECT COUNT(*)
FROM orders
WHERE customer_id IS NULL;

πŸ”„ Consistency​

attribute_mapping​

Confirms source-to-target attribute mappings (name/type/derivation) are correct and values align across systems.

Example:

SELECT c.id, c.status, s.status
FROM core_orders c
JOIN semantic_orders s ON c.id = s.id
WHERE c.status <> s.status;

processing_delta​

Checks incremental (delta/CDC) loads apply the correct inserts/updates/deletes without missing or double-processing records.

Example:

SELECT *
FROM core_delta_load
WHERE load_timestamp > (
SELECT MAX(load_timestamp) FROM previous_loads
);

processing_full_dump​

Validates full-refresh loads are complete and consistent with the source snapshot for the same run.

Example:

SELECT COUNT(*) FROM staging_full_load;
SELECT COUNT(*) FROM core_full_load;

processing_history​

Validates historical processing/backfills preserve correct versions and time ranges (no gaps, overlaps, or broken lineage).

Example:

SELECT *
FROM core_customer_history
WHERE valid_to < valid_from;

πŸ”— Integrity​

reconcilation_e2e​

Reconciles key totals and/or record-level results end-to-end across the pipeline (source β†’ transforms β†’ target).

Example:

SELECT COUNT(*)
FROM (
SELECT md5(concat_ws('|', *)) AS hash
FROM core_data
EXCEPT
SELECT md5(concat_ws('|', *)) AS hash
FROM semantic_data
) AS diff;

refferential_integrity​

Checks that reference/foreign-key values resolve to existing parent entities (no orphaned relationships).

Example:

SELECT order_id
FROM core_orders
WHERE customer_id NOT IN (
SELECT customer_id FROM core_customers
);

⏱ Timeliness​

processing_fresh​

Ensures datasets are produced or updated within the expected SLA window (freshness vs schedule).

Example:

SELECT table_name, last_updated
FROM metadata_freshness
WHERE last_updated < CURRENT_DATE - INTERVAL '1 day';

πŸ”‘ Uniqueness​

value_unique​

Ensures specified key(s) uniquely identify records (no duplicate keys).

Example:

SELECT customer_id, COUNT(*)
FROM core_customers
GROUP BY customer_id
HAVING COUNT(*) > 1;

πŸ“ Validity​

custom_query​

Runs a custom query whose output must satisfy an assertion (e.g., zero violations, threshold met, expected set).

Example:

SELECT *
FROM orders
WHERE total_amount <> (line_amount + tax_amount);

structure_discovery_custom​

Discovers and validates expected schema characteristics (columns, types, constraints) using custom rules.

Example:

SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'core_orders'
AND column_name NOT IN ('id', 'order_date', 'customer_id', 'amount');

structure_discovery_datetime​

Discovers and validates datetime columns follow expected formats and constraints.

Example:

SELECT *
FROM orders
WHERE order_date IS NULL
OR order_date > CURRENT_DATE;

structure_discovery_email​

Validates email fields follow the expected structure and can be parsed consistently.

Example:

SELECT *
FROM users
WHERE email NOT REGEXP '^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}$';

structure_discovery_phone​

Validates phone fields follow the expected structure (e.g., country code, length, allowed characters).

Example:

SELECT *
FROM contacts
WHERE phone NOT REGEXP '^\+?[0-9]{7,15}$';

value_custom​

Applies custom business rules (including cross-field checks/lookups) to flag invalid values.

Example:

SELECT *
FROM orders
WHERE ship_date < order_date;

value_illegal​

Flags values that violate allowed sets/patterns (e.g., invalid enums, forbidden codes, disallowed characters).

Example:

SELECT *
FROM orders
WHERE status NOT IN ('NEW', 'PROCESSING', 'SHIPPED', 'CANCELLED');

πŸ“Œ Quick Reference​

Test ApproachDescriptionDQ Dimension
aggregation_countVerifies aggregated row counts match expected result or reference sourceAccuracy
aggregation_customValidates user-defined aggregations return expected values for specified slicesAccuracy
attribute_mappingConfirms source-to-target attribute mappings are correct across systemsConsistency
custom_queryRuns a custom query whose output must satisfy an assertionValidity
processing_deltaChecks incremental loads apply correct inserts/updates/deletesConsistency
processing_freshEnsures datasets are produced within the expected SLA windowTimeliness
processing_full_dumpValidates full-refresh loads are complete and consistent with the sourceConsistency
processing_historyValidates historical processing preserves correct versions and time rangesConsistency
reconcilation_e2eReconciles key totals end-to-end across the pipelineIntegrity
reconciliation_countCompares record counts across systems/stagesAccuracy
refferential_integrityChecks foreign-key values resolve to existing parent entitiesIntegrity
structure_discovery_customValidates expected schema characteristics using custom rulesValidity
structure_discovery_datetimeValidates datetime columns follow expected formatsValidity
structure_discovery_emailValidates email fields follow expected structureValidity
structure_discovery_phoneValidates phone fields follow expected structureValidity
value_customApplies custom business rules to flag invalid valuesValidity
value_emptyChecks required fields are not empty/blankCompleteness
value_empty_customChecks emptiness using a custom definitionCompleteness
value_illegalFlags values that violate allowed sets/patternsValidity
value_maximumEnsures values do not exceed a defined maximumAccuracy
value_minimumEnsures values are not below a defined minimumAccuracy
value_missing_customDetects missing required records using custom business keysCompleteness
value_missing_nullDetects NULLs where values are requiredCompleteness
value_out_of_rangeValidates values fall within an expected rangeAccuracy
value_uniqueEnsures specified keys uniquely identify recordsUniqueness