Intraflow Match

The Intraflow Match stage locates matches between similar data records within a single input stream. You can create hierarchical rules based on any fields that have been defined or created in other stages of the dataflow.

This stage allows you to:
  • Create a new match rule and publish it to the repository for later reuse, which saves time. For more information, see Creating a match rule.
  • Use a predefined match rule as-is from Template Rules in the Rule Configuration panel. Choosing from this set of sophisticated predefined rules takes less time than defining a rule from scratch.
  • Configure the grouping and sorting options for the match rule in the Settings panel. For more information, see Configuring the grouping and sorting options.

How Intraflow Match functions

Using the Group by setting that you specify, the matcher identifies groups of records that might be duplicates of one another. The matcher then proceeds through each record in the group.

  • If the record matches an existing suspect, the record is considered a duplicate of that suspect, assigned a score, collection number, and match record type (Duplicate), and eliminated from the match.
  • If the record matches no existing suspect within the match group, the record becomes a new suspect. That is, it is added to the current match group so that subsequent records can be matched against it.
When the matcher exhausts all records in the current match group, it eliminates all suspects from the match. Suspects with at least one duplicate retain a match record type of Suspect and are assigned the same collection number as their matched duplicate records. Suspects with no duplicates are labeled with a match record type of Unique and assigned a collection number of 0. Finally, when all records within a match group have been written to the output, the matcher moves on to compare a new match group.
Note: The default matching method compares only records that are within the same match group.
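
The procedure above can be illustrated with a minimal Python sketch. This is not the product's implementation: the similarity function, the threshold of 80, the output field names (MatchRecordType, CollectionNumber, MatchScore), and the way collection numbers are allocated are all assumptions made for illustration. The sketch also labels records after each group has been processed, which simplifies the bookkeeping.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def similarity(a, b):
    """Hypothetical scorer (0-100); stands in for a configured match rule."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def intraflow_match(records, group_key, match_field, threshold=80):
    """Sketch of suspect/duplicate/unique labeling within match groups."""
    # Step 1: build match groups from the Group by value (assumed to be one field).
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    output = []
    next_collection = 1  # assumed numbering scheme; 0 is reserved for uniques
    for group in groups.values():
        suspects = []  # each entry holds a suspect and its duplicates
        for rec in group:
            # Step 2: compare the record against existing suspects only.
            for s in suspects:
                score = similarity(rec[match_field], s["record"][match_field])
                if score >= threshold:
                    s["duplicates"].append((rec, score))
                    break
            else:
                # No suspect matched, so the record becomes a new suspect.
                suspects.append({"record": rec, "duplicates": []})

        # Step 3: the group is exhausted, so label and write out its records.
        for s in suspects:
            if s["duplicates"]:
                number = next_collection
                next_collection += 1
                output.append({**s["record"], "MatchRecordType": "Suspect",
                               "CollectionNumber": number, "MatchScore": None})
                for rec, score in s["duplicates"]:
                    output.append({**rec, "MatchRecordType": "Duplicate",
                                   "CollectionNumber": number, "MatchScore": score})
            else:
                output.append({**s["record"], "MatchRecordType": "Unique",
                               "CollectionNumber": 0, "MatchScore": None})
    return output


records = [
    {"name": "John Smith", "zip": "12345"},
    {"name": "john smith", "zip": "12345"},
    {"name": "Jane Doe", "zip": "12345"},
    {"name": "Ann Lee", "zip": "99999"},
]
result = intraflow_match(records, group_key="zip", match_field="name")
```

In this sketch, records with different zip values are never compared with each other, which mirrors the note that only records in the same match group are compared.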

The type of matching (Intraflow or Interflow) determines how express key match results translate to candidate match scores.

  • In Intraflow matching, the score a candidate gains as a result of an express key match depends on whether the record to which that candidate matched was itself a match of some other suspect. Express key duplicates of a suspect will always have match scores of 100. In contrast, express key duplicates of another candidate (which was a duplicate of a suspect) will inherit the match score (not necessarily 100) of that candidate.
  • In Interflow matching, a successful express key match always confers a 100 match score onto the candidate.
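
The two scoring rules above can be summarized in a short Python sketch. The function name, its parameters, and the string labels are hypothetical and exist only to make the rule concrete. They are not part of the product's API.

```python
def express_key_score(match_type, matched_record_type, matched_record_score=None):
    """Return the score a candidate gains from a successful express key match.

    match_type: "intraflow" or "interflow" (hypothetical labels).
    matched_record_type: type of the record the candidate matched,
        "suspect" or "duplicate" (a candidate already matched to a suspect).
    matched_record_score: match score of that record when it is a duplicate.
    """
    if match_type == "interflow":
        # Interflow: an express key match always confers a score of 100.
        return 100
    if match_type == "intraflow":
        if matched_record_type == "suspect":
            # Express key duplicate of a suspect: always 100.
            return 100
        if matched_record_type == "duplicate":
            # Express key duplicate of another candidate: inherit its score.
            if matched_record_score is None:
                raise ValueError("a duplicate's score is required to inherit it")
            return matched_record_score
    raise ValueError("unsupported match type or record type")


score_vs_suspect = express_key_score("intraflow", "suspect")
score_vs_candidate = express_key_score("intraflow", "duplicate", 87)
score_interflow = express_key_score("interflow", "duplicate", 87)
```

Here a candidate that express-key matches a duplicate scored 87 receives 87 in Intraflow matching, while the same match in Interflow matching receives 100.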