Data Drift Statistics
Unique value count for all data types
| Name | Age |
|---|---|
| Mark | 35 |
| John | 29 |
| Ashley | 39 |
| Jonas | 33 |
| Mark | 35 |
| John | 25 |
| Mark | 20 |
| James | 33 |
| Ashley | 25 |
| Emma | 20 |
The unique values for "Name" column are: Mark, John, Ashley, Jonas, James, Emma
The cardinality of "Name" is 6
The unique values for "Age" column are: 35, 29, 39, 33, 25, 20
The cardinality of "Age" is 6
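As a minimal sketch of how the unique value count could be computed for the table above, the Python snippet below collects the distinct values of each column and counts them. The names `rows` and `unique_values` are illustrative assumptions, not part of any product API.

```python
# The sample table from above, as (name, age) pairs.
rows = [
    ("Mark", 35), ("John", 29), ("Ashley", 39), ("Jonas", 33), ("Mark", 35),
    ("John", 25), ("Mark", 20), ("James", 33), ("Ashley", 25), ("Emma", 20),
]


def unique_values(values):
    """Return the distinct values in first-seen order (hypothetical helper)."""
    return list(dict.fromkeys(values))


names = unique_values(name for name, _ in rows)
ages = unique_values(age for _, age in rows)

# Cardinality is simply the number of distinct values in a column.
name_cardinality = len(names)
age_cardinality = len(ages)

print(names, name_cardinality)
print(ages, age_cardinality)
```

Repeated values such as the second "Mark" or the second 35 are counted once, so both columns have a cardinality of 6.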
Numeric Data
Minimum
For example, in the set SALARY={1215, 2000, 5263, 1126, 3687}, the lowest value is 1126. The minimum value of the set SALARY is therefore 1126.
Maximum
For example, in the set SALARY={1215, 2000, 5263, 1126, 3687}, the highest value is 5263. The maximum value of the set SALARY is therefore 5263.
Mean
For example, the set SALARY={1215, 2000, 5263, 1126, 3687} contains 5 elements. The mean is the sum of the elements divided by their count: (1215+2000+5263+1126+3687)/5 = 13291/5 = 2658.2
Therefore, the mean of the set SALARY is 2658.2
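The three numeric statistics above can be reproduced with the Python standard library. This is only a sketch of the arithmetic on the SALARY example; the variable names are assumptions chosen for illustration.

```python
from statistics import mean

# The SALARY set from the examples above.
salary = [1215, 2000, 5263, 1126, 3687]

summary = {
    "minimum": min(salary),  # smallest element
    "maximum": max(salary),  # largest element
    "mean": mean(salary),    # sum of the elements divided by their count
}

print(summary)
```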
Standard deviation
A higher standard deviation (SD) means the data points are spread far from the mean, whereas a lower SD means the data points are close to the mean.
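To make the effect of spread concrete, the sketch below computes the SD of the SALARY set used in the examples above and compares it with a second set that has the same mean but tightly clustered values. The text does not say whether the population or the sample formula is used, so both are shown; the `tight` set is hypothetical data invented for the comparison.

```python
from statistics import pstdev, stdev

salary = [1215, 2000, 5263, 1126, 3687]

# Hypothetical set with the same mean (2658.2) but values close to the mean.
tight = [2600, 2650, 2658, 2683, 2700]

# Population SD divides the squared deviations by n,
# sample SD divides by n - 1 (assumption: either could apply here).
population_sd = pstdev(salary)
sample_sd = stdev(salary)
tight_sd = pstdev(tight)

print(f"SALARY population SD: {population_sd:.2f}")
print(f"SALARY sample SD:     {sample_sd:.2f}")
print(f"Tight set SD:         {tight_sd:.2f}")
```

The widely spread SALARY values give a much larger SD than the tightly clustered set, even though both sets share the same mean.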
Textual Data
Minimum length
For example, in the set CITY={Sao Paulo, Mexico, Tokyo, Shanghai, Cairo, Mumbai}, the shortest values are "Tokyo" and "Cairo", each 5 characters long. The minimum length of the set CITY is 5.
Maximum length
For example, in the set CITY={Sao Paulo, Mexico, Tokyo, Shanghai, Cairo, Mumbai}, the longest value is "Sao Paulo", which is 9 characters long including the space. The maximum length of the set CITY is 9.
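A short sketch of the length statistics for the CITY example follows. It assumes that length means the number of characters including spaces, which is consistent with "Sao Paulo" counting as 9; the variable names are illustrative.

```python
# The CITY set from the examples above.
city = ["Sao Paulo", "Mexico", "Tokyo", "Shanghai", "Cairo", "Mumbai"]

# Character count per value; spaces are counted (assumption).
lengths = {value: len(value) for value in city}

min_length = min(lengths.values())
max_length = max(lengths.values())

# More than one value can share the minimum or maximum length.
shortest = [value for value, n in lengths.items() if n == min_length]
longest = [value for value, n in lengths.items() if n == max_length]

print(min_length, shortest)
print(max_length, longest)
```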
Detail Drift
Distribution of value count
For example, consider the following data.
| Name | Age |
|---|---|
| Mark | 35 |
| John | 29 |
| Ashley | 39 |
| Jonas | 33 |
| Mark | 35 |
| John | 25 |
| Mark | 20 |
| James | 33 |
| Ashley | 25 |
| Emma | 20 |
"Mark" appears in 3 of the 10 rows, so the cardinality detail of "Mark" = (3/10)*100, which equals 30 percent
"25" appears in 2 of the 10 rows, so the cardinality detail of "25" = (2/10)*100, which equals 20 percent
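The percentages above can be sketched in Python by counting how often each value occurs in a column and dividing by the number of rows. The function name `value_distribution` is a hypothetical helper introduced for this illustration.

```python
from collections import Counter

names = ["Mark", "John", "Ashley", "Jonas", "Mark",
         "John", "Mark", "James", "Ashley", "Emma"]
ages = [35, 29, 39, 33, 35, 25, 20, 33, 25, 20]


def value_distribution(values):
    """Map each distinct value to its share of all rows, in percent."""
    counts = Counter(values)
    total = len(values)
    # Multiply before dividing to avoid floating-point noise such as 30.000000000000004.
    return {value: count * 100 / total for value, count in counts.items()}


name_distribution = value_distribution(names)
age_distribution = value_distribution(ages)

print(name_distribution["Mark"])
print(age_distribution[25])
```

The shares of all distinct values in a column add up to 100 percent, which gives a quick sanity check on the result.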