Using a Validate Address Spark Job

Attention: Before creating and running the first Validate Address job, ensure the Acushare service is running. For steps, see Running Acushare Service.
  1. Create an instance of UAMAddressingFactory, using its static method getInstance().
  2. Provide the input and output details for the Validate Address job by creating an instance of UAMAddressingDetail that specifies the ProcessType. The instance must use the type SparkProcessType. To do this:
    1. To configure the input settings for the job, create an instance of UniversalAddressValidateInputConfiguration.
      Set the values of the required fields of this instance. Where applicable, use the following enums: Enum PreferredCity, Enum CasingType, Enum CityNameFormat, Enum OutputCountryFormat, Enum StandardAddressFormat, Enum StandardAddressPMBLine, Enum StreetMatchingStrictness, Enum FirmMatchingStrictness, Enum DirectionalMatchingStrictness, Enum DualAddressLogic, and Enum DPVSuccessStatusCondition.
      Important: To run Validate Address in the CASS Certified™ mode, set the fields outputReport3553, outputCASSDetail, and outputReportSummary of this instance to true. The CASS reports contain valid content only when the job is run in the CASS Certified™ mode. Otherwise, blank report PDFs are generated.
    2. Set the details of the Reference Data path by creating an instance of ReferenceDataPath. See Enum ReferenceDataPathLocation.
    3. To configure the job run settings, create an instance of UAMUSAddressingEngineConfiguration. Pass the following as arguments to its constructor: the ReferenceDataPath instance created above, the COBOL Runtime path, and the modules directory path. The two paths are passed as String values.
      Once the UAMUSAddressingEngineConfiguration instance is created, set the values for its various required fields.
    4. To configure JVM settings, create an instance of UniversalAddressGeneralConfiguration.
    5. Create an instance of UAMAddressingDetail, by passing an instance of type JobConfig, and the instances of UAMUSAddressingEngineConfiguration, UniversalAddressGeneralConfiguration, and UniversalAddressValidateInputConfiguration created above as the arguments to its constructor.
      The JobConfig parameter must be an instance of type SparkJobConfig.
      1. Set the details of the input file using the inputPath field of the UAMAddressingDetail instance.
        Note:
        • For a text input file, create an instance of FilePath with the relevant details of the input file by invoking the appropriate constructor.
        • For an ORC input file, create an instance of OrcFilePath with the path of the ORC input file as the argument.
        • For a parquet input file, create an instance of ParquetFilePath with the path of the parquet input file as the argument.
      2. Set the details of the output file using the outputPath field of the UAMAddressingDetail instance.
        Note:
        • For a text output file, create an instance of FilePath with the relevant details of the output file by invoking the appropriate constructor.
        • For an ORC output file, create an instance of OrcFilePath with the path of the ORC output file as the argument.
        • For a parquet output file, create an instance of ParquetFilePath with the path of the parquet output file as the argument.

      3. Set the name of the job using the jobName field of the UAMAddressingDetail instance.
      4. Set the compressOutput flag of the UAMAddressingDetail instance to true to compress the output of the job.
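The notes on inputPath and outputPath in step 2 pair each file format with a path class. A minimal sketch of that pairing follows. PathClassChooser is a hypothetical helper, not part of the SDK, and it assumes that ORC and parquet files can be recognized by their file extension. It returns only the name of the class to instantiate, so it runs without the SDK on the classpath.

```java
import java.util.Locale;

// Hypothetical helper, not part of the SDK. It only names the path class
// that the notes in step 2 associate with each file format.
final class PathClassChooser {

    // Returns the simple name of the SDK path class to instantiate.
    // Assumption: ORC and parquet files are identified by their extension.
    static String pathClassFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".orc")) {
            return "OrcFilePath";
        }
        if (lower.endsWith(".parquet")) {
            return "ParquetFilePath";
        }
        // Anything else is treated as a text file.
        return "FilePath";
    }

    public static void main(String[] args) {
        // Illustrative file names only.
        System.out.println(pathClassFor("/data/in/addresses.orc"));
        System.out.println(pathClassFor("/data/in/addresses.parquet"));
        System.out.println(pathClassFor("/data/in/addresses.csv"));
    }
}
```

The same choice applies to both inputPath and outputPath, so the input and output formats do not have to match.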
  3. To create and run the Spark job, invoke the runSparkJob() method of the previously created UAMAddressingFactory instance, passing the UAMAddressingDetail instance created above as an argument.
    The runSparkJob() method runs the job and returns a Map of the reporting counters of the job.
  4. To display the reporting counters after a successful job run, invoke the getCounters() method of the previously created UAMAddressingFactory instance, passing the created job as an argument.
    The method returns a Map of counters.
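Once the Map of counters is available, it can be printed with plain Java. The sketch below assumes the map has the shape Map<String, Long>, which this procedure does not state, and the counter names in main() are placeholders rather than real SDK counter names. CounterPrinter is a hypothetical helper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the SDK.
// Assumption: counters arrive as Map<String, Long>.
final class CounterPrinter {

    // Formats one "name = value" line per counter, sorted by counter name.
    static String format(Map<String, Long> counters) {
        Map<String, Long> sorted = new TreeMap<>(counters);
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> entry : sorted.entrySet()) {
            sb.append(entry.getKey())
              .append(" = ")
              .append(entry.getValue())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder counter names, for illustration only.
        Map<String, Long> counters = new HashMap<>();
        counters.put("exampleCounterB", 7L);
        counters.put("exampleCounterA", 120L);
        System.out.print(format(counters));
    }
}
```

Sorting by name keeps the output stable between runs, which makes it easier to compare the counters of two jobs.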
  5. To generate the CASS reports after a successful job run, use the previously created instance of UAMAddressingFactory to invoke the method generateCASSReport(). You can invoke any of the overloaded versions of the method generateCASSReport().
    Depending on which generateCASSReport() method signature is used, pass as arguments the Map of reporting counters obtained in the previous step, the jobName, the path where the generated CASS report must be stored, and the reportType to generate.
    If the SDK job runs in a cluster environment, the path must be on the cluster. If the job runs on your client machine, the path must be on the client.
    Note: If the path is not specified, the new CASS report is placed in the current working directory.

    The reportType parameter must have values from the Enum UAMCASSReportType. You can specify one or more report types in this parameter.
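The default described in the note above (no path means the current working directory) can be mirrored before calling generateCASSReport(), so the report location is known up front. This is a sketch for the client-machine case only. ReportPathResolver is a hypothetical helper, not part of the SDK, and it treats a null or blank value as "path not specified", which is an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the SDK. It applies to a local
// (client machine) file system only.
final class ReportPathResolver {

    // Returns the directory where the report is expected to be written.
    // Assumption: null or blank input means "path not specified".
    static Path resolve(String reportDir) {
        if (reportDir == null || reportDir.trim().isEmpty()) {
            // An empty relative path resolves to the current working directory.
            return Paths.get("").toAbsolutePath().normalize();
        }
        return Paths.get(reportDir).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));           // current working directory
        System.out.println(resolve("reports/cass")); // absolute form of the given path
    }
}
```

Logging the resolved path alongside the jobName makes it easier to locate the generated PDFs later.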