Semantic Analysis

Phone Number Analysis

Select this rule to detect and validate phone numbers and to classify them as fixed-line numbers, mobile numbers, or another type of number. The rule also shows the distribution of the phone numbers by country and region. To configure the rule, click the Configure icon and set the default country, which is used when a phone number does not include a country code. For a list of available default countries, refer to List of Default Countries for Phone Number Analysis.
Note: The United States is selected as the default country.

If you select this rule, the Results page displays an additional Semantic tab showing these details:

Table 1. Information for string field for phone number analysis rule
Label | Description
Validity | The valid and invalid phone numbers.
Phone Number Types | The types of phone numbers, such as mobile, fixed-line (landline), VoIP, pager, voicemail, or toll-free.
Phone Numbers by Country | The distribution of the detected phone numbers by country.
Phone Numbers by Region | The distribution of the detected phone numbers by region.
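
The role of the default country can be illustrated with a minimal Python sketch. It is not the product's implementation: the function names and the abridged calling-code table are hypothetical, and classifying a number as mobile or fixed-line would need full numbering-plan metadata that is not shown here.

```python
import re
from collections import Counter

# Hypothetical, heavily abridged calling-code table (assumption for illustration).
CALLING_CODES = {"1": "US", "44": "GB", "49": "DE", "91": "IN"}


def resolve_country(raw, default_country="US"):
    """Return (country, remaining_digits), or None if the value is unusable."""
    text = raw.strip()
    digits = re.sub(r"\D", "", text)
    if not digits:
        return None
    if text.startswith("+"):
        # An explicit country code is present: try the longest prefix first.
        for length in (3, 2, 1):
            code = digits[:length]
            if code in CALLING_CODES:
                return CALLING_CODES[code], digits[length:]
        return None
    # No country code: fall back to the configured default country.
    return default_country, digits


def country_distribution(values, default_country="US"):
    """Count values per resolved country; unusable values count as INVALID."""
    counts = Counter()
    for value in values:
        resolved = resolve_country(value, default_country)
        counts[resolved[0] if resolved else "INVALID"] += 1
    return counts
```

In this sketch, a value such as "(212) 555-0100" is counted under the default country because it carries no country code, while "+44 20 7946 0958" is counted under GB.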

Email Analysis

Select this rule to detect and validate email addresses and to determine the distribution of email domains in the selected data column. If you select this rule, the Results page displays an additional Semantic tab showing these details:
Table 2. Information for string field for email analysis rule
Label | Description
Validity | The valid and invalid email addresses.
Domain Distribution | The top ten email domains in the selected data column.
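
As a rough illustration of the two outputs in the table, the following Python sketch counts valid and invalid values and ranks the domains. The regular expression is a deliberately simple assumption, not the validation logic of the product, and `email_summary` is a hypothetical name.

```python
import re
from collections import Counter

# Simplified pattern (assumption): local part, "@", and a domain with at least one dot.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def email_summary(values, top_n=10):
    """Return validity counts and the most frequent domains."""
    valid = invalid = 0
    domains = Counter()
    for value in values:
        match = EMAIL_RE.fullmatch(value.strip())
        if match:
            valid += 1
            domains[match.group(1).lower()] += 1  # domains are case-insensitive
        else:
            invalid += 1
    return {"valid": valid, "invalid": invalid, "top_domains": domains.most_common(top_n)}
```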

Social Security Number (SSN) Analysis

Select this rule to detect and validate social security numbers. If you select this rule, the Results page displays an additional Semantic tab showing the valid and invalid social security numbers.
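
This page does not state which checks the rule applies. As a sketch only, the Python function below assumes the commonly published structural rules for US social security numbers: nine digits, an area number that is not 000, 666, or in the 900 to 999 range, a group number that is not 00, and a serial number that is not 0000. The name `is_plausible_ssn` is hypothetical.

```python
import re

# Accepts AAA-GG-SSSS with or without hyphens (assumed input formats).
SSN_RE = re.compile(r"([0-9]{3})-?([0-9]{2})-?([0-9]{4})")


def is_plausible_ssn(value):
    """Structural check only; it cannot confirm that a number was ever issued."""
    match = SSN_RE.fullmatch(value.strip())
    if not match:
        return False
    area, group, serial = match.groups()
    if area in ("000", "666") or area.startswith("9"):
        return False
    return group != "00" and serial != "0000"
```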

Credit Card Analysis

Select this rule to detect and validate credit card numbers and identify credit card numbers as JCB, VISA, Diners Club (DINERS), MasterCard, Discover, or American Express (AMEX). If you select this rule, the Results page displays an additional Semantic tab showing these details:
Table 3. Information for string field for credit card analysis rule
Label | Description
Validity | The valid and invalid credit card numbers.
Credit Card Distribution | Category-wise distribution of the detected credit cards.
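
Credit card validation is commonly done with the Luhn checksum, and the card type is commonly inferred from the leading digits. The Python sketch below assumes both techniques. The prefix ranges are an abridged approximation of publicly documented ranges, not the product's rule set, and the function names are hypothetical.

```python
import re


def luhn_valid(digits):
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def card_brand(digits):
    """Guess the card type from leading digits (abridged, assumed ranges)."""
    p2, p3, p4 = int(digits[:2]), int(digits[:3]), int(digits[:4])
    if digits.startswith("4"):
        return "VISA"
    if p2 in (34, 37):
        return "AMEX"
    if 51 <= p2 <= 55 or 2221 <= p4 <= 2720:
        return "MasterCard"
    if p4 == 6011 or p2 == 65 or 644 <= p3 <= 649:
        return "Discover"
    if 3528 <= p4 <= 3589:
        return "JCB"
    if 300 <= p3 <= 305 or p2 in (36, 38):
        return "DINERS"
    return "OTHER"


def analyze_card(value):
    """Return ("valid", brand) or ("invalid", None) for one column value."""
    digits = value.replace(" ", "").replace("-", "")
    if not re.fullmatch(r"[0-9]{12,19}", digits) or not luhn_valid(digits):
        return ("invalid", None)
    return ("valid", card_brand(digits))
```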

International Bank Account Number (IBAN) Analysis

Select this rule to detect and validate international bank account numbers. The rule also shows the distribution of the international bank account numbers by country. If you select this rule, the Results page displays an additional Semantic tab showing these details:
Table 4. Information for string field for IBAN analysis rule
Label | Description
Validity | The valid and invalid international bank account numbers.
IBAN Country Distribution | Country-wise distribution of the detected international bank account numbers.
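
An IBAN carries its own check digits, which are verified with the standard mod-97 procedure: move the first four characters to the end, replace letters with numbers (A=10 through Z=35), and check that the result leaves a remainder of 1 when divided by 97. The Python sketch below assumes that procedure and takes the country from the first two letters. It omits per-country length checks, and the function names are hypothetical.

```python
import re
from collections import Counter

# Country code, two check digits, then 11 to 30 alphanumeric characters.
IBAN_RE = re.compile(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}")


def normalize(value):
    """Remove whitespace and uppercase the value."""
    return re.sub(r"\s+", "", value).upper()


def iban_valid(value):
    """Apply the mod-97 check to one candidate IBAN."""
    iban = normalize(value)
    if not IBAN_RE.fullmatch(iban):
        return False
    rearranged = iban[4:] + iban[:4]
    # int(ch, 36) maps 0-9 to 0-9 and A-Z to 10-35.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def iban_country_distribution(values):
    """Count valid IBANs by their two-letter country prefix."""
    return Counter(normalize(v)[:2] for v in values if iban_valid(v))
```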

Demographic Analysis

Select this rule to detect semantic types, such as title, first name, last name, organization, gender, city, state, country, two-letter and three-letter ISO country codes, and nationality. This rule can help you find values that are in the wrong column, such as city names in a Country column. If you select this rule, the Results page displays an additional Semantic tab showing the detected semantic types and their frequency.
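
One simple way to picture this kind of detection is a dictionary lookup per value. The Python sketch below assumes that approach. The tiny reference dictionaries and the name `semantic_types` are hypothetical, and the product's actual detection method is not described on this page.

```python
from collections import Counter

# Hypothetical, tiny reference dictionaries (assumption for illustration).
REFERENCE = {
    "city": {"paris", "berlin", "tokyo"},
    "country": {"france", "germany", "japan"},
    "iso_country_code_2": {"fr", "de", "jp"},
}


def semantic_types(values):
    """Count how many values match each semantic type; others count as unknown."""
    counts = Counter()
    for value in values:
        key = value.strip().lower()
        matched = [name for name, vocabulary in REFERENCE.items() if key in vocabulary]
        counts[matched[0] if matched else "unknown"] += 1
    return counts
```

Run against a Country column, a non-zero "city" count in the result would point to city names stored in the wrong column.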

Vehicle Identification Number (VIN) Analysis

Select this rule to detect and validate vehicle identification numbers. The rule also shows the distribution of the vehicle identification numbers by country. If you select this rule, the Results page displays an additional Semantic tab showing these details:
Table 5. Information for string field for VIN analysis rule
Label | Description
Validity | The valid and invalid vehicle identification numbers.
VIN Country Distribution | Country-wise distribution of the detected vehicle identification numbers.
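
A 17-character VIN can be checked with the check digit in position 9, which is mandatory for vehicles sold in North America but not everywhere. The Python sketch below assumes that check and a simplified lookup of the country from the first character. The lookup table is abridged, the names are hypothetical, and the product may apply different rules.

```python
import re

# Letter values used by the check-digit calculation (I, O, and Q are not allowed).
TRANSLIT = dict(zip("ABCDEFGH", range(1, 9)))
TRANSLIT.update(zip("JKLMN", range(1, 6)))
TRANSLIT.update({"P": 7, "R": 9})
TRANSLIT.update(zip("STUVWXYZ", range(2, 10)))

WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]
VIN_RE = re.compile(r"[A-HJ-NPR-Z0-9]{17}")

# Abridged first-character lookup (assumption for illustration).
COUNTRY_BY_FIRST_CHAR = {
    "1": "United States", "4": "United States", "5": "United States",
    "2": "Canada", "J": "Japan", "W": "Germany",
}


def vin_valid(value):
    """Check length, alphabet, and the position-9 check digit."""
    vin = value.strip().upper()
    if not VIN_RE.fullmatch(vin):
        return False
    total = sum(
        (int(ch) if ch.isdigit() else TRANSLIT[ch]) * weight
        for ch, weight in zip(vin, WEIGHTS)
    )
    remainder = total % 11
    return vin[8] == ("X" if remainder == 10 else str(remainder))


def vin_country(value):
    """Look up the country from the first character; unknown prefixes map to Other."""
    return COUNTRY_BY_FIRST_CHAR.get(value.strip().upper()[:1], "Other")
```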

Date Analysis

Select this rule to detect and validate dates in string columns. The rule also identifies the date patterns in the columns and their distribution. This analysis can help you find date entries in the wrong columns, for example, in a column of email data. If you select this rule, the Data Profiling Results page displays an additional Date Summary tab for the string columns that have dates. This tab shows these details:
Table 6. Information for string field for date analysis rule
Label | Description
Validity | The valid and invalid values.
Date Patterns | The date patterns detected in the selected columns, their total count, and percentage of that pattern in the data set.
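
The pattern summary can be approximated by trying a list of candidate formats against each string value. The Python sketch below assumes that approach. The candidate list and the function names are hypothetical, and the patterns that the product recognizes are not listed on this page.

```python
from collections import Counter
from datetime import datetime

# Hypothetical candidate patterns (assumption), tried in order.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"]


def detect_format(value):
    """Return the first candidate format that parses the value, else None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value.strip(), fmt)
            return fmt
        except ValueError:
            continue
    return None


def date_summary(values):
    """Return validity counts plus (count, percentage) for each detected pattern."""
    patterns = Counter()
    invalid = 0
    for value in values:
        fmt = detect_format(value)
        if fmt is None:
            invalid += 1
        else:
            patterns[fmt] += 1
    total = len(values)
    return {
        "valid": total - invalid,
        "invalid": invalid,
        "patterns": {fmt: (n, 100.0 * n / total) for fmt, n in patterns.items()},
    }
```

Because `strptime` also rejects impossible dates, a value such as "02/30/2024" is counted as invalid even though it matches the shape of a pattern.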