Reference Data Overview

The Pitney Bowes Reference Data defines a set of permissible values that other data fields in your system use to ensure data quality. It improves data validity, accuracy, and consistency, and it helps you extract more value from your data and obtain trusted data from your Big Data system.

For example, if you use the Reference Data with the Data Normalization Module, you can establish a single customer identity across the enterprise. Well-defined customer information is the first step toward improving operational efficiency.

Important: For the Validate Address and Validate Address Global jobs, the Reference Data must be placed on all the data nodes of the Hadoop cluster. For the Validate Address Loqate job, it must be placed on one node, and that location must then be mounted on all the other data nodes.

Installation Directory Structure

In the SDK installation directory, the Utilities/dbloader directory contains the following child folders:

dataquality

Contains the JAR and scripts used to install the Reference Data for:

Data Normalization Module
Universal Name Module
Note: For more information, see Using Reference Data: Data Normalization Module and Universal Name Module.

aq

Contains:

The scripts/server/installdb_unc.sh script, which installs the Reference Data. You must run this script to install or extract the data.
The runtime folder, which contains Acushare service setup information for the Universal Addressing Module's Validate Address job.

Note: For more information, see Using Reference Data: Universal Addressing Module.
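
Before running any of the installation scripts, it can help to confirm that the folders described above are present. The following is a minimal sketch in Python, not part of the SDK. The function name missing_dbloader_paths and the sdk_home parameter are hypothetical, and the sketch assumes that scripts/server/installdb_unc.sh and runtime sit directly under the aq folder, as the layout above suggests.

```python
from pathlib import Path

# Paths relative to the SDK installation directory, taken from the layout
# described in this section. Assumption: scripts/server/installdb_unc.sh and
# runtime are located directly under Utilities/dbloader/aq.
EXPECTED_PATHS = [
    "Utilities/dbloader/dataquality",
    "Utilities/dbloader/aq/scripts/server/installdb_unc.sh",
    "Utilities/dbloader/aq/runtime",
]


def missing_dbloader_paths(sdk_home):
    """Return the expected dbloader paths that do not exist under sdk_home.

    sdk_home is the SDK installation directory (a hypothetical parameter;
    supply the location used on your system).
    """
    root = Path(sdk_home)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

Calling missing_dbloader_paths with your SDK installation directory returns an empty list when all three locations exist. Otherwise it returns the relative paths that were not found, in the order listed above.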