Who should use the SDK?
The Big Data Quality SDK is intended for:
- Customers who want to perform data quality operations on data residing in Hadoop.
- Hadoop developers familiar with MapReduce or Spark programming who want to build data quality solutions for specific use cases.
- Hadoop developers who want to perform data cleansing, data enrichment, data deduplication, and data consolidation operations on existing data.
- Hive users who are not familiar with the complexities of MapReduce or Spark but are comfortable with Hive Query Language (HQL), which is syntactically similar to SQL.
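For example, a Hive user could express a simple quality check, such as finding duplicate records, in familiar SQL-like syntax without writing any MapReduce or Spark code. The table and column names below are hypothetical, for illustration only:

```sql
-- Hypothetical example: find email addresses that appear more than once
-- in a customer table, a common first step in deduplication.
SELECT email, COUNT(*) AS occurrences
FROM customers
GROUP BY email
HAVING COUNT(*) > 1;
```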