Using a Match Key Generator Spark Job
-
Create an instance of AdvanceMatchFactory, using its static method getInstance().
-
Provide the input and output details for the Match Key Generator job by creating an instance of MatchKeyGeneratorDetail, specifying the ProcessType. The instance must use the type SparkProcessType.
-
Specify the match key settings to perform the matching by creating and configuring an instance of MatchKeySettings. For more information, see the relevant code sample.
-
Create an instance of MatchKeyGeneratorDetail by passing an instance of type JobConfig and the MatchKeySettings instance created earlier as the arguments to its constructor. The JobConfig parameter must be an instance of type SparkJobConfig.
-
Set the details of the input file using the inputPath field of the MatchKeyGeneratorDetail instance. For a text input file, create an instance of FilePath with the relevant details of the input file by invoking the appropriate constructor. For an ORC input file, create an instance of OrcFilePath with the path of the ORC input file as the argument.
-
Set the details of the output file using the outputPath field of the MatchKeyGeneratorDetail instance. For a text output file, create an instance of FilePath with the relevant details of the output file by invoking the appropriate constructor. For an ORC output file, create an instance of OrcFilePath with the path of the ORC output file as the argument.
-
Set the name of the job using the jobName field of the MatchKeyGeneratorDetail instance.
-
To create and run the Spark job, use the previously created instance of AdvanceMatchFactory to invoke its method runSparkJob(), passing the MatchKeyGeneratorDetail instance configured above as the argument. The runSparkJob() method runs the job and returns a Map of the reporting counters of the job.
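
The steps above can be sketched in Java roughly as follows. This is only an illustration of the call order, not the SDK's actual code: every nested class below is a hypothetical stand-in declared locally so that the sketch compiles without the SDK on the classpath. The type hierarchy (OrcFilePath extending FilePath), the single-argument path constructors, the Map<String, String> return type, the placeholder paths, and the job name are all assumptions. The stand-in runSparkJob() does not launch Spark; it only checks the configuration and returns an empty map. In real code, remove the stand-ins and import the corresponding SDK classes instead.

```java
import java.util.Collections;
import java.util.Map;

public class MatchKeyGeneratorSketch {

    // ---- Hypothetical stand-ins for the SDK types named in the steps. ----
    // Their real signatures may differ; replace them with the SDK imports.
    interface JobConfig {}
    static class SparkJobConfig implements JobConfig {}
    static class MatchKeySettings {}

    static class FilePath {
        final String path;
        FilePath(String path) { this.path = path; }
    }

    // Assumption: the ORC path type can be used wherever a FilePath is expected.
    static class OrcFilePath extends FilePath {
        OrcFilePath(String path) { super(path); }
    }

    static class MatchKeyGeneratorDetail {
        final JobConfig jobConfig;
        final MatchKeySettings matchKeySettings;
        FilePath inputPath;   // details of the input file
        FilePath outputPath;  // details of the output file
        String jobName;       // name of the job

        MatchKeyGeneratorDetail(JobConfig jobConfig, MatchKeySettings matchKeySettings) {
            this.jobConfig = jobConfig;
            this.matchKeySettings = matchKeySettings;
        }
    }

    static class AdvanceMatchFactory {
        private static final AdvanceMatchFactory INSTANCE = new AdvanceMatchFactory();
        private AdvanceMatchFactory() {}
        static AdvanceMatchFactory getInstance() { return INSTANCE; }

        // Stand-in only: validates the detail and returns an empty counter map.
        // The real method submits the Spark job and returns its reporting counters.
        Map<String, String> runSparkJob(MatchKeyGeneratorDetail detail) {
            if (!(detail.jobConfig instanceof SparkJobConfig)) {
                throw new IllegalArgumentException("JobConfig must be a SparkJobConfig");
            }
            if (detail.inputPath == null || detail.outputPath == null) {
                throw new IllegalStateException("inputPath and outputPath must be set");
            }
            return Collections.emptyMap();
        }
    }

    // Configure the job detail: settings, job config, input, output, job name.
    static MatchKeyGeneratorDetail buildDetail(String in, String out, String jobName) {
        MatchKeySettings settings = new MatchKeySettings(); // configure as needed
        MatchKeyGeneratorDetail detail =
                new MatchKeyGeneratorDetail(new SparkJobConfig(), settings);
        detail.inputPath = new OrcFilePath(in);   // ORC input; use FilePath for text
        detail.outputPath = new OrcFilePath(out); // ORC output; use FilePath for text
        detail.jobName = jobName;
        return detail;
    }

    public static void main(String[] args) {
        // Obtain the factory, build the detail, then run the job.
        AdvanceMatchFactory factory = AdvanceMatchFactory.getInstance();
        MatchKeyGeneratorDetail detail = buildDetail(
                "/placeholder/input.orc", "/placeholder/output.orc", "MatchKeyGeneratorSample");
        Map<String, String> counters = factory.runSparkJob(detail);
        System.out.println("Reporting counters: " + counters);
    }
}
```

The sketch assigns inputPath, outputPath, and jobName directly because the steps describe them as fields. If the SDK exposes them through setter methods instead, only those three assignments would change.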