Data Drift Statistics
Unique value count for all data types
Name | Age |
---|---|
Mark | 35 |
John | 29 |
Ashley | 39 |
Jonas | 33 |
Mark | 35 |
John | 25 |
Mark | 20 |
James | 33 |
Ashley | 25 |
Emma | 20 |
The unique values for "Name" column are: Mark, John, Ashley, Jonas, James, Emma
The cardinality of "Name" is 6
The unique values for "Age" column are: 35, 29, 39, 33, 25, 20
The cardinality of "Age" is 6
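The unique value count above can be reproduced with a short sketch. This is an illustration only, not the product's implementation; the helper name `unique_values` and the list variables are assumptions introduced for this example.

```python
# Columns copied from the table above.
names = ["Mark", "John", "Ashley", "Jonas", "Mark",
         "John", "Mark", "James", "Ashley", "Emma"]
ages = [35, 29, 39, 33, 35, 25, 20, 33, 25, 20]


def unique_values(column):
    """Return the distinct values of a column in first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(column))


name_unique = unique_values(names)
age_unique = unique_values(ages)

# The cardinality is the number of distinct values.
name_cardinality = len(name_unique)
age_cardinality = len(age_unique)

print(name_unique, name_cardinality)
print(age_unique, age_cardinality)
```

Both columns have a cardinality of 6, because repeated values such as "Mark" or 35 are counted once.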
Numeric Data
Minimum
For example, the set SALARY={1215, 2000, 5263, 1126, 3687} contains the lowest value "1126". The minimum value of the set SALARY is 1126.
Maximum
For example, the set SALARY={1215, 2000, 5263, 1126, 3687} contains the highest value "5263". The maximum value of the set SALARY is 5263.
Mean
For example, the set SALARY={1215, 2000, 5263, 1126, 3687} contains 5 elements. The mean can be calculated as [1215+2000+5263+1126+3687]/5 which equals 2658.2
Therefore, the mean of the set SALARY is 2658.2
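As a minimal sketch, the minimum, maximum, and mean of the SALARY set can be checked with Python's built-ins and the standard `statistics` module. The variable names are assumptions made for this example.

```python
from statistics import mean

# The SALARY set from the examples above.
salary = [1215, 2000, 5263, 1126, 3687]

salary_min = min(salary)    # lowest value in the set
salary_max = max(salary)    # highest value in the set
salary_mean = mean(salary)  # sum of the values divided by their count

print(salary_min, salary_max, salary_mean)
```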
Standard deviation
A higher standard deviation (SD) means the data points are spread far from the mean, whereas a lower SD means the data points lie close to the mean.
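The following sketch illustrates the effect of spread on the standard deviation. The text does not say whether the population or the sample formula is used, so both are shown; that choice, and the `clustered` data set, are assumptions made purely for illustration.

```python
from statistics import pstdev, stdev

# The SALARY set from the earlier examples (widely spread values).
salary = [1215, 2000, 5263, 1126, 3687]

# Hypothetical set whose values sit close to their mean.
clustered = [2600, 2650, 2700, 2640, 2701]

# Population SD divides the squared deviations by n.
salary_pop_sd = pstdev(salary)
clustered_pop_sd = pstdev(clustered)

# Sample SD divides by n - 1, so it is slightly larger.
salary_sample_sd = stdev(salary)

print(round(salary_pop_sd, 2), round(salary_sample_sd, 2))
print(round(clustered_pop_sd, 2))
```

The SALARY set yields a much larger SD than the clustered set, which matches the interpretation given above.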
Textual Data
Minimum length
For example, in the set CITY={Sao Paulo, Mexico, Tokyo, Shanghai, Cairo, Mumbai}, the shortest values are "Tokyo" and "Cairo", each 5 characters long. The minimum length of the set CITY is 5.
Maximum length
For example, in the set CITY={Sao Paulo, Mexico, Tokyo, Shanghai, Cairo, Mumbai}, the longest value is "Sao Paulo", which is 9 characters long including the space. The maximum length of the set CITY is 9.
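The minimum and maximum lengths for the CITY set can be sketched as follows. This assumes length is measured in characters with spaces counted, which is what the "Sao Paulo" example implies; the variable names are illustrative.

```python
# The CITY set from the examples above.
city = ["Sao Paulo", "Mexico", "Tokyo", "Shanghai", "Cairo", "Mumbai"]

# Character length of each value; spaces are counted.
lengths = [len(value) for value in city]

min_length = min(lengths)
max_length = max(lengths)

# Values that attain the minimum and maximum lengths.
shortest = [value for value in city if len(value) == min_length]
longest = [value for value in city if len(value) == max_length]

print(min_length, shortest)
print(max_length, longest)
```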
Detail Drift
Distribution of value count
For example, consider the following data set.
Name | Age |
---|---|
Mark | 35 |
John | 29 |
Ashley | 39 |
Jonas | 33 |
Mark | 35 |
John | 25 |
Mark | 20 |
James | 33 |
Ashley | 25 |
Emma | 20 |
"Mark" appears in 3 of the 10 rows, so the cardinality detail of "Mark" = (3/10)*100, which equals 30 percent.
"25" appears in 2 of the 10 rows, so the cardinality detail of "25" = (2/10)*100, which equals 20 percent.
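The percentages above can be computed for every value in a column with a short sketch like the one below. The function name `value_distribution` is an assumption for this example and is not taken from the product.

```python
from collections import Counter

# Columns copied from the table above.
names = ["Mark", "John", "Ashley", "Jonas", "Mark",
         "John", "Mark", "James", "Ashley", "Emma"]
ages = [35, 29, 39, 33, 35, 25, 20, 33, 25, 20]


def value_distribution(column):
    """Map each distinct value to the percentage of rows that contain it."""
    counts = Counter(column)
    total = len(column)
    # Multiply before dividing to avoid floating-point noise such as 30.000000000000004.
    return {value: count * 100 / total for value, count in counts.items()}


name_distribution = value_distribution(names)
age_distribution = value_distribution(ages)

print(name_distribution["Mark"])
print(age_distribution[25])
```

The percentages for a column always sum to 100, so comparing two such distributions taken at different times shows which values have gained or lost share.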