Data Quality
Profiling
Data profiling allows you to examine data and collect statistics or informative summaries about it. The results of data profiling can help you:
- Understand the data, which is the first critical step of any data engineering project.
- Find data quality rules and requirements that will support a more thorough data quality assessment in a later step.
InfoLink supports the following data profiling operations:
- Column Frequency Analysis
- Columns Profile
- Join Analysis
- Reference Discovery
There is a corresponding subsection in the Specifications section of the navigation tree for each type of data profiling. You first have to create a data profile specification of the corresponding type. Then you can:
- Execute it and view the result by right-clicking the specification name, or
- Create an operation in a scenario that executes the specification.
Column Frequency Analysis
Column frequency analysis returns the frequency distribution of the values in a column. The result is a table with two columns: the first contains each unique value of the input column, and the second contains the corresponding count, that is, how many times that value appears in the input table.
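The shape of this result can be illustrated with a minimal Python sketch. It is not InfoLink's implementation: the function name column_frequency, the sample customers table, and the ordering by descending count are all assumptions made for illustration. In particular, whether InfoLink counts empty (null) values as a separate value is not stated here; the sketch counts them.

```python
from collections import Counter


def column_frequency(rows, column):
    """Return (value, count) pairs for one column, most frequent first."""
    counts = Counter(row[column] for row in rows)
    # Sort by descending count, then by the value's text form so ties
    # come out in a stable, readable order.
    return sorted(counts.items(), key=lambda item: (-item[1], str(item[0])))


# Hypothetical input table, represented as a list of dictionaries.
customers = [
    {"id": 1, "country": "DE"},
    {"id": 2, "country": "FR"},
    {"id": 3, "country": "DE"},
    {"id": 4, "country": None},
    {"id": 5, "country": "DE"},
]

# Two-column result: unique value, number of occurrences.
frequency_table = column_frequency(customers, "country")
for value, count in frequency_table:
    print(value, count)
```

The counts in the second column always sum to the number of rows in the input table, which is a quick sanity check on any frequency result.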
Create a Column Frequency Profile Specification
To create a column frequency profile specification:
- Right-click Specifications -> Profiling -> Column Frequency in the navigation tree and select Create profile.
- Enter the profile name and click the Create button. The current window will show the column frequency specification parameters.
- Enter the parameters: type or select the source, space, table, and column for which you want to compute the frequency distribution, then enter the name of the target table where the result will be stored.
Execute a Column Frequency Profile Specification
To execute a column frequency profile specification:
- Right-click the column frequency profile specification, select Execute, and wait for completion.
- Right-click the column frequency profile specification and select View result.
Alternatively, you can create a ColumnFrequencyProfile operation in a scenario, providing the name of the specification as a parameter, and execute it.
Delete a Column Frequency Profile Specification
To delete a column frequency profile specification:
- Right-click the specification and select Delete.