Validation
Validation allows to check correctness of data by specifying and evaluating various types of constraints.
To implement Validation you first create a validation specification. Standardization specifications are located in theStandardization subsection of the Specifications section in the navigation tree. After creating a Standardization specification you can:
- Execute it and see the result by right clicking on the specification name;
- Or create the Validate operation in a scenario that executes the specification.
A validation specifications contains Source, Space, and Table fields that define the table for which you want to specify constraints. You can specify the following types of constraints:
- Unique constraint or primary key constraint.
- Check constraints.
- Aggregate constraints.
Unique Constraint
Unique constraint consists of a list of columns whose values must be unique among all records in the table. If you check the primary key checkbox, you add the additional requirement that the value of the listed columns cannot be null.
Check Constraints
You can specify any number of check constraints. A check constraint is an expression in the data source supported language (e.g. SQL) that must evaluate to true for all records of the table. Check constraints is a general mechanism that allows you to express many types of data validation constraints. For example, a NOT NULL constraint can be expressed as: is not null (in SQL).
Aggregate Constraints
You can specify any number of aggregate constraints. An aggregate constraint consists of the following components (all the components are specified in a language supported by the data source):
- Filter - a predicate expression that allows computation of the aggregate only on a subset of the table records.
- Aggregate - an aggregate expression computed over the table records.
- Condition - a predicate expression on the result of the aggregate expression that must evaluate to true. Use $a to refer to the result of the aggregate expression in the condition.
For example, the following constrains specifiers (in SQL) that the number of active customers must be in a range of 1500 to 2000.
Filter |
Aggregate |
Condition |
status = ‘active’ |
count(*) |
$a between 1500 and 2000 |