Data & Address Quality for Big Data SDK Guide

Content

  • Welcome
  • Getting Started
    • Introduction
    • Who should use the SDK?
    • Workflow
    • Modules and Jobs
    • Reports
  • Installing the SDK
    • System Requirements
    • Operating System Updates
    • Installer Inclusions
    • Installing SDK on Windows
    • Installing SDK on Linux
      • Running Acushare Service
  • Using Reference Data
    • Reference Data Overview
      • Scripts to install reference data
      • Placement and Usage of Reference Data
    • Extracting Reference Data for DNM, UNM Jobs
      • User Defined Reference Data
    • Extracting Reference Data for GAM Jobs
    • Extracting Reference Data for UAM Jobs
      • Extraction through interactive utility
      • Extraction using silent script
  • The Java API
    • Components of the SDK Java API
    • Using the Software Development Kit
      • Creating a Java Application
    • Common API Entities
      • ConjoinedRule
      • ConsolidationCondition
        • ConsolidationRule
        • ConsolidationAction
      • FilePath
      • JobConfig<T extends ProcessType>
        • MRJobConfig
        • SparkJobConfig
      • JobDetail<T extends ProcessType>
      • JobFactory
      • JobPath
      • OrcFilePath
      • ParquetFilePath
      • ProcessType
        • MRProcessType
        • SparkProcessType
      • ReferenceDataPath
      • ReportManager
      • SimpleRule
      • Exceptions
        • JobException
    • Advanced Matching Module Jobs
      • Common Module API
        • AdvanceMatchDetail<T extends ProcessType>
        • AdvanceMatchFactory
        • GroupbyOption<T extends ProcessType>
          • GroupbyMROption
          • GroupbySparkOption
        • MatchKeySettings
        • MatchRule
          • ChildMatchRule
          • ParentMatchRule
      • Special Scenarios
      • Best of Breed Job
        • API Entities
          • BestOfBreedConfiguration
          • BestofBreedDetail
        • Input Parameters
        • Output Columns
        • Using a Best of Breed MapReduce Job
        • Using a Best of Breed Spark Job
      • Candidate Finder Job
        • API Entities
          • CandidateFinderDetail
        • Input Parameters
        • Output Columns
        • Using a Candidate Finder MapReduce Job
        • Using a Candidate Finder Spark Job
      • Duplicate Synchronization Job
        • API Entities
          • DuplicateSynchronizationConfiguration
          • DuplicateSyncDetail
        • Input Parameters
        • Output Columns
        • Using a Duplicate Synchronization MapReduce Job
        • Using a Duplicate Synchronization Spark Job
      • Filter Job
        • API Entities
          • FilterConfiguration
          • FilterDetail
        • Input Parameters
        • Output Columns
        • Using a Filter MapReduce Job
        • Using a Filter Spark Job
      • Interflow Job
        • API Entities
          • InterMatchDetail
          • InterMatchComparisonOption
        • Input Parameters
        • Output Columns
        • Using an Interflow Match MapReduce Job
        • Using an Interflow Match Spark Job
      • Intraflow Job
        • API Entities
          • IntraMatchDetail
        • Input Parameters
        • Output Columns
        • Using an Intraflow Match MapReduce Job
        • Using an Intraflow Match Spark Job
      • Match Key Generator Job
        • API Entities
          • MatchKeyGeneratorDetail
        • Input Parameters
        • Output Columns
        • Using a Match Key Generator MapReduce Job
        • Using a Match Key Generator Spark Job
      • Transactional Match Job
        • API Entities
          • TransactionalMatchDetail
        • Input Parameters
        • Output Columns
        • Using a Transactional Match MapReduce Job
        • Using a Transactional Match Spark Job
    • Data Integration Module Jobs
      • Common Module API
        • DataIntegrationFactory
      • Custom Groovy Script
        • Custom Groovy Script Job
        • API Entities
          • CustomGroovyScriptConfiguration
          • CustomGroovyScriptDetail
        • Input Parameters
        • Output Columns
        • Using a Groovy Script MapReduce Job
        • Using a Groovy Script Spark Job
      • Joiner
        • Joiner Job
        • API Entities
          • JoinDetail
        • Input Parameters
        • Output Columns
        • Using a Joiner MapReduce Job
        • Using a Joiner Spark Job
    • Data Normalization Module Jobs
      • Common Module API
        • DataNormalizationDetail<T extends ProcessType>
        • DataNormalizationFactory
      • Advanced Transformer
        • Advanced Transformer Job
        • API Entities
          • AbstractAdvancedTransformerRules
          • AdvancedTransformerDetail
          • AdvancedTransformerConfiguration
          • RegularExpressionExtraction
          • RegularExpressionGroupItem
          • TableDataExtraction
        • Input Parameters
        • Output Columns
        • Using an Advanced Transformer MapReduce Job
        • Using an Advanced Transformer Spark Job
      • Open Parser
        • Open Parser Job
        • API Entities
          • OpenParserDetail
          • OpenParserConfiguration
        • Input Parameters
        • Output Columns
        • Using an Open Parser MapReduce Job
        • Using an Open Parser Spark Job
      • Table Lookup
        • Table Lookup Job
        • API Entities
          • AbstractTableLookupRule
          • Categorize
          • Identify
          • Standardize
          • TableLookupDetail
          • TableLookupConfiguration
        • Input Parameters
        • Output Columns
        • Using a Table Lookup MapReduce Job
        • Using a Table Lookup Spark Job
    • Global Addressing Module Jobs
      • Global Address Validation
      • API Entities
        • AddressValidationDetail<T extends ProcessType>
        • AddressValidationEngineConfiguration
        • AddressValidationFactory
        • ProductDatabaseInfo
        • AddressValidationInputOption
      • Input Parameters
      • Output Columns
      • Using a Global Address Validation MapReduce Job
      • Using a Global Address Validation Spark Job
    • Universal Addressing Module Jobs
      • Common Module APIs
        • UniversalAddressingDetail<T extends ProcessType>
        • UniversalAddressingFactory
      • Validate Address
        • Validate Address
        • API Entities
          • UAMAddressingDetail<T extends ProcessType>
          • UniversalAddressEngineConfiguration
          • UAMAddressingFactory
          • UniversalAddressGeneralConfiguration
          • UniversalAddressValidateInputConfiguration
        • Input Parameters
        • Output Columns
        • Using a Validate Address MapReduce Job
        • Using a Validate Address Spark Job
      • Validate Address Global
        • Validate Address Global
        • API Entities
          • GlobalAddressingDetail<T extends ProcessType>
          • GlobalAddressingEngineConfiguration
          • GlobalAddressingFactory
          • GlobalAddressingGeneralConfiguration
          • GlobalAddressingEngineConfiguration
        • Input Parameters
        • Output Columns
        • Using a Validate Address Global MapReduce Job
        • Using a Validate Address Global Spark Job
      • Validate Address Loqate
        • Validate Address Loqate
        • API Entities
          • LoqateAddressingDetail<T extends ProcessType>
          • LoqateAddressingEngineConfiguration
          • LoqateAddressingFactory
          • LoqateAddressingGeneralConfiguration
          • LoqateAddressingValidateConfiguration
        • Input Parameters
        • Output Columns
        • Using a Validate Address Loqate MapReduce Job
        • Using a Validate Address Loqate Spark Job
    • Universal Name Module Jobs
      • Common Module API
        • UniversalNameDetail<T extends ProcessType>
        • UniversalNameFactory
      • Open Name Parser
        • API Entities
          • OpenNameParserDetail
          • OpenNameParserConfiguration
        • Input Parameters
        • Output Columns
        • Using an Open Name Parser MapReduce Job
        • Using an Open Name Parser Spark Job
  • XML Configuration Files
    • Sample Configuration Files
      • Using Configuration Property Files
    • Advanced Matching Module
      • Best of Breed
      • Candidate Finder
      • Duplicate Synchronization
        • Configuration Files
      • Filter
        • Configuration Files
      • Interflow Match
        • Configuration Files
      • Intraflow Match
        • Configuration Files
      • Match Key Generator
        • Configuration Files
      • Transactional Match
        • Configuration Files
    • Data Integration Module
      • Custom Groovy Script
        • Configuration Files
      • Joiner
        • Configuration Files
    • Data Normalization Module
      • Advanced Transformer
      • Table Lookup
      • Open Parser
    • Global Addressing Module
      • Global Address Validation
        • Supported Countries
        • Configuration Files - Address Validation
    • Universal Addressing Module
      • Validate Address
        • Configuration Files
      • Validate Address Global
        • Configuration Files
      • Validate Address Loqate
        • Configuration Files
    • Universal Name Module
      • Open Name Parser
        • Configuration Files
  • Hive User-Defined Functions
    • Introduction
      • Components of a Hive Function
      • Using a Hive UDF
    • Advanced Matching Module Functions
      • Using a Hive UDF of Advanced Matching Module
      • Best of Breed
        • Sample Hive Script
      • Candidate Finder
        • Sample Hive Script
      • Duplicate Synchronization
        • Sample Hive Script
      • Filter
        • Sample Hive Script
      • Interflow Match
        • Sample Hive Script
      • Intraflow Match
        • Sample Hive Script
      • Match Key Generator
        • Sample Hive Script
      • Transactional Match
        • Sample Hive Script
    • Data Integration Module Functions
      • Using a Custom Groovy Script Hive UDF
      • Sample Hive Script
    • Data Normalization Module Functions
      • Using a Hive UDF of Data Normalization Module
      • Advanced Transformer
        • Sample Hive Script
      • Open Parser
        • Sample Hive Script
      • Table Lookup
        • Sample Hive Script
    • Global Addressing Module Functions
      • Using a Hive UDF of Global Addressing Module
      • Global Address Validation
        • Sample Hive Scripts - Addressing Validation
        • Sample Hive Scripts - USA Addressing Validation
    • Universal Addressing Module Functions
      • Using a Hive UDF of Universal Addressing Module
      • Validate Address
        • Sample Hive Script
      • Validate Address Global
        • Sample Hive Scripts
      • Validate Address Loqate
        • Sample Hive Script
    • Universal Name Module Functions
      • Using a Hive UDF of Universal Name Module
      • Open Name Parser
        • Sample Hive Script
  • Reporting Counters
    • Reporting Counters
    • Advanced Matching Module
      • Interflow Match Report
      • Intraflow Match Report
      • Transactional Match Report
    • Global Addressing Module
      • Global Address Validation Report
    • Universal Addressing Module
      • Validate Address Reports
        • Address Validation Summary Report
        • CASS 3553 Summary Report
        • CASS Detail Report
      • Validate Address Global Report
      • Validate Address Loqate Report
    • Universal Name Module
      • Open Name Parser Report
  • Appendix
    • Exceptions
      • Exception Messages
    • Enums
      • Common Enumerations
      • Universal Addressing Enumerations
    • ISO Country Codes and Module Support
      • ISO Country Codes and Coder Support