Matching Terminology
- Average Score
- The average match score of all duplicates. The possible values are 0-100, with 0 indicating
a poor match and 100 indicating an exact match.
- Baseline
- The selected match result that will be compared against another match
result.
- Candidate Group
- Suspect and candidate records grouped together by an ID assigned by
Candidate Finder. The suspect (the first record in the group) is a record
read from an input source, while its candidates are usually records found in
a database using a SQL query.
- Candidate Records
- All non-suspect records in a match group or candidate group.
- Drop
- A decrease in duplicates.
- Detail Match Record
- A single record that corresponds to a record processed by a matcher stage.
Each detail match record indicates whether the record was a Suspect, Unique,
or Duplicate, and identifies its Match Group or Candidate Group and output
collection. Candidate records also provide information on why the input
record matched or did not match its suspect.
- Duplicate Collections
- A duplicate collection consists of a Suspect and its Duplicate records
grouped together by a CollectionNumber. Unique records always belong to
CollectionNumber 0.
- Duplicate Records
- The number of records that match another record within a match group.
- Express Matches
- An express match is made when a suspect and candidate have an exact match on
the contents of a designated field, usually an ExpressMatchKey provided by
the Match Key Generator. If an express match is made, no further processing
is done to determine whether the suspect and candidate are duplicates.
- Input Records
- Order of the records in the matching stage before the matching sort is
performed.
- Interflow Match
- A matching stage that locates matches between similar data records across
two input record streams. The first record stream is a source for suspect
records and the second stream is a source for candidate records.
- Intraflow Match
- A matching stage that locates matches between similar data records within a
single input stream.
- Lift
- An increase in duplicates.
- Match Groups
- (Group By) Records grouped together either by a match key or a sliding
window.
- Match Results
- (or Resource Bundle) Logical grouping of files produced by a stage. This
data is saved for each run of a stage and stored on disk. Subsequent runs
do not overwrite or change the results from a previous run. In MAT, the
bundles provide the summary and detail results, as well as settings
information.
- Match Results List
- List of match results of a single type that MAT can analyze in the current
analysis session.
- Match Results Type
- Indicates the contents of the match results. MAT uses the match results type
to determine how to use the data.
- Matcher Stage
- A stage on the canvas that performs matching routines. The matcher stages
are Interflow Match, Intraflow Match, and Transactional Match.
- Missed Match
- A record that was previously a suspect or duplicate but is now unique.
- New Match
- A record that was previously unique but is now a suspect or duplicate.
- Sliding Window
- The sliding window matching method sequentially fills a buffer of
predetermined size, called a window, with the corresponding number of data
rows. As each row is added to the window, it is compared to each row already
contained in the window.
- Suspect Records
- A driver record that is matched against candidates within a match group or a
candidate group.
- Transactional Match
- A matching stage that matches suspect records against candidate records that
are returned from Candidate Finder or by an external application.
- Unique Records
- A suspect or candidate record that does not match any other record in a
match group. A suspect that is the only record in its match group is
automatically unique.
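
The Sliding Window entry above describes a procedure that is easier to follow in code. The following is a minimal sketch in Python, not the product's implementation. The function name sliding_window_pairs, the is_match callback, and the first-letter comparison rule are all hypothetical, chosen only for illustration. The sketch also assumes that when the window is full, the oldest row is dropped to make room for the new one; the definition above does not state what happens at that point.

```python
from collections import deque
from typing import Callable, Iterable, List, Tuple


def sliding_window_pairs(
    rows: Iterable[str],
    window_size: int,
    is_match: Callable[[str, str], bool],
) -> List[Tuple[str, str]]:
    """Compare each incoming row to the rows already held in the window."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")

    window: deque = deque()
    matches: List[Tuple[str, str]] = []

    for row in rows:
        # Assumption: a full window drops its oldest row before a new row enters.
        if len(window) == window_size:
            window.popleft()
        # Compare the new row to every row already in the window.
        for earlier in window:
            if is_match(earlier, row):
                matches.append((earlier, row))
        window.append(row)

    return matches


# Hypothetical match rule: two rows match if they share a first letter.
def same_initial(a: str, b: str) -> bool:
    return a[0] == b[0]


rows = ["smith", "smyth", "jones", "smith", "jonas"]
pairs = sliding_window_pairs(rows, window_size=3, is_match=same_initial)
print(pairs)
```

With a window of three rows, the first "smith" has already left the window by the time the second "smith" arrives, so those two rows are never compared. This shows how the window size bounds which records can be matched to each other.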