๐ Data Contracts
What is a Data Contract (DC)?โ
A data contract defines the agreement between a data producer and consumers. More details about how to write a Data Contract (DC) you can find here open-data-contract-standard.
In X-Automate from the Data Contract specification we can extract and execute DQ/DB tests. Each Data Contract will be imported in the app as a Test Suite with test type DC. Each quality specification will be a test case of that test suite.
How to import and sync DCs?โ
To add Data Contracts in the application you will need to connect a git repository and click Import Data Contracts from repository all files with extension *contract.yaml from repository which respects the OCDS schema will be imported.

After import, you will see the imported files in Data Quality > Data Contracts page. On DC expand there will be available 3 tabs: Test Cases (the Quality Checks extracted), Data Contract (the yaml file view), Execution History. You can execute the DC and get a robot report. Data Contracts can be updated only from git repository, in the app they can be only executed. There is a configurable cron job which syncs DC from git repo or you can go to Git Connection page and run Import Data Contracts from repository to sync updates.

Here is a trimmed down example of a Data Contract (DC). Checkout the comments on the fields. Field quality is specific to X-Automate the rest of the fields can be taken from open-data-contract-standard.
# yaml-language-server: $schema=https://xampublicdocs.blob.core.windows.net/data-contract-schema/data_contract_schema.json
version: 1.0.0
kind: DataContract
apiVersion: v3.1.0
id: dc-suite-with-dq-checks
name: dc-suite-with-dq-checks # This needs to be unique is equivalent to Test Suite name
status: active # DC is active or inactive
servers:
- id: DB-1
host: xam-mysql.mysql.database.azure.com # This must correspond to the host name specified on Database Connection page
schema: CORE # This schema will be used for Quality Tests
type: MYSQL
description: Internal DB For XAM
database: core
port: 3306
server: xam-mysql.mysql.database.azure.com
schema:
- quality: # This is a quality check specification, you can add as many as needed
- name: random-custom-checks-22 # This will be the Test Case Name which needs to be unique
metric: xam_custom # This is a xam_custom Quality Check equivalent to DQ Custom
implementation:
threshold: percentage # Threshold can be percentage or amount
threshold_critical: 98.5 # Value critical for threshold
threshold_warning: 99.7 # Value warning for threshold
count_query: select count(*) from core.employee; # 1. All Records Query
bad_count_query: select count(*) from core.employee; # 2. Bad Records Query
verification_query: select count(*) from core.employee; # 3. Validate Query
- physicalType: table
physicalName: employee # This is the Table Name mandatory for quality of type xam_simple
properties:
- physicalType: int # This is the Column Data Type mandatory for quality of type xam_simple
physicalName: Employee_ID # This is the Column Name mandatory for quality of type xam_simple
required: true # This will create a dq test named: "dq-required-{schema}-{table}-{column}" with MetricName.TEXT_MISSING_VALUES or MetricName.NUMBER_NULL_COUNT based on column data type and both thresholds at 100%
unique: true # This will create a dq test named: "dq-unique-{schema}-{table}-{column}" with MetricName.UNIQUENESS and both thresholds at 100%
quality:
- name: employee-id-is-unique-22 # Qulity name which will be Test Case name (must be unique)
description: Has some random filters # Test Case Description
metric: xam_simple # For metric xam_simple we have a different implementation schema
implementation:
threshold: percentage # Can be percentage or amount same as with DQ checks
threshold_critical: 98.5
threshold_warning: 99.7
metric: uniqueness # Must be choosen carefully based on Column Data Type (some metrics can be applyied only for numeric column data types for example and some can be applied to all column data types)
filters: # Here we can specify filters that will be applied before the metric is executed
- column_name: Country # Column Name
column_type: varchar # Column Data Type
filter_name: not_equals # Filter name - similar to metric must be choosed carefully based on column_type
values:
- romania
- column_name: Employee_ID
column_type: int
filter_name: greater_than
values:
- 0
Metrics for xam_simpleโ
Here are the metric names available for xam_simple. They are equivalent to Quality Simple Checks in the UI, only prefixed to make it easier to identify proper metric based on Column Data Type.
Metrics available for all Column Data Types:
contains(Allowed Values (in))not_contains(Allowed Values (not in))uniqueness(Uniqueness)
Metrics available for datetime Column Data Types:
datetime_value_range(Value Range (between))
Metrics available for numeric Column Data Types:
number_value_range(Value Range (between))number_null_count(Null Count)number_zeroes_count(Zeroes Count)number_equal(Math all below)number_not_equalnumber_greater_thannumber_greater_than_or_equal_tonumber_less_than_or_equal_tonumber_less_than
Metrics available for text Column Data Types:
text_missing_values(Missing Values (= ''))text_length_range(Length Range (between_str))text_like(LIKE)text_not_like(NOT LIKE)text_match_regex(Regular Expression)text_not_match_regex(Regular Expression)
Here are the available filters you can use:
Filters for xam_simpleโ
Same as with metrics, filters must be choosed based on column data type.
Tier 1: Core Logic & Comparison (Daily Use)
equalsnot_equalsandornotis_nullis_not_nullinnot_ingreater_thanless_thangreater_than_or_equal_toless_than_or_equal_tois_trueis_false
Tier 2: Text Search & Common Ranges
ilikelikebetweennot_betweencontainsnot_ilikenot_likeagelength
Tier 3: Math & JSON/Key Operations
additionsubtractionmultiplicationdivisionkey_existsany_key_existsall_keys_existcontained_byis_distinct_from
Tier 4: Full-Text Search & Logic Extensions
to_tsvectorto_tsquerypath_querysimilar_tological_andlogical_orlogical_notxor
Tier 5: Specialized / Geospatial / Bitwise
distanceintersectsabsolute_valuemodulussquare_rootbitwise_andbitwise_orbitwise_notbitwise_xorshift_leftshift_rightabovebelowis_above_or_sameis_below_or_sameis_left_or_sameis_right_or_sameis_same_horizontalis_same_verticalis_parallelis_perpendiculardistance_between_centerscontained_by_or_equalcontains_or_equalsame_asis_distinctis_not_distinct_fromis_not_trueis_not_falseis_unknownis_not_unknownpath_querytextdelete_pathsetweightstripnot_greater_thannot_less_thannot_similar_to