Skip to main content

🧼 Data Quality Checks

Data Quality (DQ) Checks are used to measure the quality of your datasets.
Unlike test cases that return a strict pass/fail, DQ checks tell you how much of the data fails a defined rule.

🧠 Think of a DQ check like a thermometer β€” it doesn’t tell you if you're sick, but it measures your temperature.


πŸ› οΈ Thresholds​

When creating a DQ check, you define thresholds that represent acceptable data quality levels:

  1. ⚠️ Warning Threshold – Quality is acceptable, but potentially risky
  2. ❌ Critical Threshold – Quality has dropped significantly and action is needed

πŸ” Types of Data Quality Checks​

X-AutoMate offers two types of DQ checks:

  1. Simple Checks – Created using a wizard interface for non-technical users
  2. Custom Checks – Free-form queries for advanced scenarios

πŸ§™ Simple Data Quality Checks​

Simple checks help users generate SQL behind the scenes via an easy-to-use UI.
They cover various domains, as shown below:

CategoryNameDescription
UnknownNull CountFinds the number of records that are null (have no value).
NumericMathCompares values using operators (e.g., <, >, =, etc.).
NumericAllowed ValuesChecks if values match a predefined valid list.
NumericValue RangeVerifies that values are within a specified range.
NumericZeroes CountCounts how many values are equal to zero.
NumericUniquenessDetects duplicate values in a column.
DateTimeAllowed ValuesValidates against a predefined set of timestamps.
DateTimeValue RangeChecks whether timestamps fall within a range.
DateTimeNull CountCounts missing timestamps.
DateTimeUniquenessDetects duplicate timestamps.
CategoricalAllowed ValuesChecks string values against an allowed list.
CategoricalUniquenessDetects duplicate strings.
CategoricalMissing ValuesFinds records where string values are empty.
CategoricalLength RangeChecks whether string lengths fall within a range.
CategoricalLIKE / NOT LIKEMatches strings using wildcard patterns.
CategoricalRegular ExpressionValidates strings against regex patterns.

πŸ§ͺ Creating a Simple DQ Check​

Navigate to /data_quality/new to begin.

1️⃣ Select the Table​

Choose the object you want to validate.

Select Object


2️⃣ Add Optional Filters​

Define filters to narrow the data (e.g., only check ProductType = 'Toy').

Filtering


3️⃣ Choose Column & Rule​

Pick the column to validate and specify the condition.

βœ… Make sure your query retrieves valid data (not the bad ones).

For example: To detect customers under 18 years, the correct data is age >= 18.

Column and Rule


4️⃣ Define Thresholds​

Set your warning and critical thresholds.

πŸ’¬ Example:

  • Good Data: β‰₯ 90%
  • Warning: 80%–90%
  • Critical: < 80%

Threshold Settings


5️⃣ Add Metadata​

You can:

  • πŸ”— Link to a requirement
  • πŸ“ Add a DQ description (recommended for clarity)

6️⃣ Submit and Review​

Once saved, you are redirected to /results to review the filtered DQ check.

Results

The check is also listed under /test/case.


πŸ” Operations & Reusability​

  • βœ… DQ checks behave like ETL test cases (see etl.md)
  • πŸ”„ You can convert a simple DQ check to a custom one by clicking β€œConvert to Custom Check”

✨ Custom Data Quality Checks​

A Custom DQ Check lets you define free-form logic using three SQL queries:

  1. πŸ“Š Aggregation Query – Total count of valid data
  2. 🚨 Bad Records Count – Number of failing records
  3. πŸ“‹ Bad Records Query – List of failing records for inspection

Custom DQ

You also provide:

  • 🏷️ A name
  • πŸ“ A description
  • 🎯 Thresholds
  • πŸ”— Requirements and tags (optional)

Once saved, you’re redirected to /results.


🧩 Additional Options​

While editing, you can:

  • πŸ” Reassign DQ checks to new test suites
  • βž• Assign test approaches
  • πŸ”„ Link to more connections or metadata