Using a Candidate Finder MapReduce Job
- Create an instance of AdvanceMatchFactory using its static method getInstance().
- Provide the input and output details for the Candidate Finder job by creating an instance of CandidateFinderDetail, specifying the ProcessType. The instance must use the type MRProcessType.
  - Set the values of hbase_zookeeper_quorum and hbase_zookeeper_property_clientPort in the MRJobConfig instance.
  - Generate the query for the job by creating an instance of ComplexSearchQuery. Within this instance:
    - Set properties such as QueryName, IndexFieldName, and IndexFieldType. The search query can be Numeric, Range, Contains All, or Contains None.
    - Set the search query properties and connect them using logical operators such as AND and OR.
    Note: A ComplexSearchQuery can be defined as a single instance, as a hierarchy of child instances, or as nested instances joined using logical operators. See Enum JoinType and Enum Operation.
  - Set the details of the input file using the inputPath field of the CandidateFinderDetail instance.
    - For a text input file, create an instance of FilePath with the relevant details of the input file by invoking the appropriate constructor.
    - For an ORC input file, create an instance of OrcFilePath with the path of the ORC input file as the argument.
    - For a Parquet input file, create an instance of ParquetFilePath with the path of the Parquet input file as the argument.
  - Set the details of the output file using the outputPath field of the CandidateFinderDetail instance.
    - For a text output file, create an instance of FilePath with the relevant details of the output file by invoking the appropriate constructor.
    - For an ORC output file, create an instance of OrcFilePath with the path of the ORC output file as the argument.
    - For a Parquet output file, create an instance of ParquetFilePath with the path of the Parquet output file as the argument.
  - Set the name of the job using the jobName field of the CandidateFinderDetail instance.
  - Set the FetchBatchSize field of the CandidateFinderDetail instance. The default is 10000.
  - Set the MaximumResults field of the CandidateFinderDetail instance. The default is 10.
  - Set the StartingRecord field of the CandidateFinderDetail instance. The default is 1.
- To create a MapReduce job, use the previously created instance of AdvanceMatchFactory to invoke its method createJob(), passing the above instance of CandidateFinderDetail as an argument. The createJob() method creates the job and returns a List of instances of ControlledJob.
- Run the created job using an instance of JobControl.
- To display the reporting counters after a successful MapReduce job run, use the previously created instance of AdvanceMatchFactory to invoke its method getCounters(), passing the created job as an argument.
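
The configuration part of these steps (composing a nested query and filling in the job detail with its defaults) can be illustrated with a small self-contained sketch. The classes below are hypothetical stand-ins, not the SDK classes: only the names ComplexSearchQuery, CandidateFinderDetail, and JoinType and the default values 10000, 10, and 1 come from the steps above. Every constructor, helper method (leaf, group, describe), field type, query name, and path is an assumption made for illustration, and the real SDK signatures may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for the SDK classes named in the steps above.
// Signatures, helpers, and sample values are assumptions, not the real API.
class CandidateFinderSketch {

    // Logical operators used to join child queries (assumed constants).
    enum JoinType { AND, OR }

    // Stand-in for ComplexSearchQuery: either a single condition (leaf)
    // or a group of child queries joined by a logical operator.
    static class ComplexSearchQuery {
        String queryName;
        String indexFieldName;
        String indexFieldType;
        JoinType joinType;
        final List<ComplexSearchQuery> children = new ArrayList<>();

        static ComplexSearchQuery leaf(String name, String field, String type) {
            ComplexSearchQuery q = new ComplexSearchQuery();
            q.queryName = name;
            q.indexFieldName = field;
            q.indexFieldType = type;
            return q;
        }

        static ComplexSearchQuery group(JoinType join, ComplexSearchQuery... parts) {
            ComplexSearchQuery q = new ComplexSearchQuery();
            q.joinType = join;
            Collections.addAll(q.children, parts);
            return q;
        }

        // Renders the query tree so the nesting is visible.
        String describe() {
            if (children.isEmpty()) {
                return queryName;
            }
            List<String> names = new ArrayList<>();
            for (ComplexSearchQuery child : children) {
                names.add(child.describe());
            }
            return "(" + String.join(" " + joinType + " ", names) + ")";
        }
    }

    // Stand-in for CandidateFinderDetail, using the defaults stated above.
    static class CandidateFinderDetail {
        String jobName;
        String inputPath;   // the steps use FilePath-style objects; a String is a simplification
        String outputPath;
        ComplexSearchQuery query;
        int fetchBatchSize = 10000;
        int maximumResults = 10;
        int startingRecord = 1;
    }

    static CandidateFinderDetail buildDetail() {
        // Made-up query names and index fields.
        ComplexSearchQuery name = ComplexSearchQuery.leaf("NameQuery", "Name", "String");
        ComplexSearchQuery city = ComplexSearchQuery.leaf("CityQuery", "City", "String");
        ComplexSearchQuery zip = ComplexSearchQuery.leaf("ZipQuery", "Zip", "String");

        // Nested instances joined using logical operators.
        ComplexSearchQuery query = ComplexSearchQuery.group(
                JoinType.AND, name, ComplexSearchQuery.group(JoinType.OR, city, zip));

        CandidateFinderDetail detail = new CandidateFinderDetail();
        detail.jobName = "CandidateFinderJob";      // placeholder job name
        detail.inputPath = "/placeholder/input";    // placeholder path
        detail.outputPath = "/placeholder/output";  // placeholder path
        detail.query = query;
        return detail;
    }

    public static void main(String[] args) {
        CandidateFinderDetail detail = buildDetail();
        System.out.println(detail.jobName + ": " + detail.query.describe());
        System.out.println("fetchBatchSize=" + detail.fetchBatchSize
                + ", maximumResults=" + detail.maximumResults
                + ", startingRecord=" + detail.startingRecord);
    }
}
```

In the SDK itself, the configured CandidateFinderDetail would then be passed to createJob() on the AdvanceMatchFactory instance, and the returned ControlledJob instances run through JobControl, as described in the steps; that part is omitted here because it depends on the SDK and Hadoop libraries.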