Options

  1. In the Load match rule field, select one of the predefined match rules, which you can use as-is or modify to suit your needs. To create a new match rule without using a predefined match rule as a starting point, click New. You can have only one custom rule in a dataflow.
    Note: The Dataflow Options feature in Enterprise Designer enables the match rule to be exposed for configuration at runtime.
  2. Click Group By to select a field to use for grouping records in the match queue. Intraflow Match only attempts to match records against other records in the same match queue.
  3. Select the Sort box to perform a pre-match sort of your input based on the field selected in the Group By field.
  4. Click Advanced to specify additional sort performance options.
    In memory record limit
    Specifies the maximum number of data rows a sorter will hold in memory before it starts paging to disk. By default, a sort of 10,000 records or fewer is done in memory, and a sort of more than 10,000 records is performed as a disk sort. The maximum limit is 100,000 records. An in-memory sort is typically much faster than a disk sort, so set this value high enough that most sorts run in memory and only large sets are written to disk.
    Note: Be careful in environments where there are jobs running concurrently because increasing the In memory record limit setting increases the likelihood of running out of memory.
    Maximum number of temporary files
    Specifies the maximum number of temporary files that may be used by a sort process. Using a larger number of temporary files can result in better performance. However, the optimal number is highly dependent on the configuration of the server running Spectrum Technology Platform. You should experiment with different settings, observing the effect on performance of using more or fewer temporary files. To calculate the approximate number of temporary files that may be needed, use this equation:
    (NumberOfRecords × 2) ÷ InMemoryRecordLimit = NumberOfTempFiles
    Note: The maximum number of temporary files cannot be more than 1,000.
    Enable compression
    Specifies that temporary files are compressed when they are written to disk.
    Note: The optimal sort performance settings depend on your server's hardware configuration. You can use this equation as a general guideline to produce good sort performance: (InMemoryRecordLimit × MaxNumberOfTempFiles ÷ 2) >= TotalNumberOfRecords
  5. Click Express Match On to perform an initial comparison of express key values to determine whether two records are considered a match.
    You can generate an express key as part of generating a match key through MatchKeyGenerator. See Match Key Generator for more information.
  6. In the Initial Collection Number text box, specify the starting number to assign to the collection number field for duplicate records.

    The collection number identifies each duplicate record in a match queue. Unique records are assigned a collection number of 0. Each duplicate record is assigned a collection number starting with the value specified in the Initial Collection Number text box.

  7. Click Sliding Window to enable this matching method. For more information about Sliding Window, see Sliding Window Matching Method.
  8. Click Generate Data for Analysis to generate match results. For more information, see Analyzing Match Results.
  9. The Assign collection number 0 to unique records option, checked by default, assigns a collection number of 0 to unique records. Uncheck this option to generate collection numbers other than zero for unique records. The unique record collection numbers will be in sequence with any other collection numbers. For example, if your matching dataflow finds five records and the first three records are unique, the collection numbers are assigned as shown in the first table below. If your matching dataflow finds five records and the last three are unique, the collection numbers are assigned as shown in the second table below.

    Collection Number    Record Type
    1                    Unique
    2                    Unique
    3                    Unique
    4                    Duplicate/Suspect
    4                    Duplicate/Suspect

    Collection Number    Record Type
    1                    Duplicate/Suspect
    1                    Duplicate/Suspect
    2                    Unique
    3                    Unique
    4                    Unique

    If you leave this box checked, any unique records found in your dataflow will be assigned a collection number of zero by default.
  10. Select the Return match rule name option to include the selected match rule name in the stage output.
  11. Select Return detailed match information if you want detailed match information to be displayed as an output for your match rule. For more information about the output fields, see Output.
    Note: Enabling this option reduces overall stage performance.
  12. For information about modifying the other options, see Building a Match Rule.
  13. Click Evaluate to evaluate how a suspect record scored against candidate records. For more information, see Interflow Match.
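
The sort sizing equations in step 4 and the collection numbering described in steps 6 and 9 can be sketched in Python as follows. This is only an illustration and not part of the product: the function names are hypothetical, rounding up and capping at 1,000 temporary files are assumptions, and the numbering logic is inferred from the two example tables in step 9 rather than taken from the stage's actual implementation.

```python
import math

MAX_IN_MEMORY_RECORD_LIMIT = 100_000  # documented maximum for "In memory record limit"
MAX_TEMP_FILES = 1_000                # documented maximum for temporary files


def estimate_temp_files(number_of_records, in_memory_record_limit):
    """(NumberOfRecords x 2) / InMemoryRecordLimit = NumberOfTempFiles."""
    if not 0 < in_memory_record_limit <= MAX_IN_MEMORY_RECORD_LIMIT:
        raise ValueError("in_memory_record_limit must be between 1 and 100,000")
    estimate = math.ceil(number_of_records * 2 / in_memory_record_limit)
    # Assumption: round up, then cap at the documented maximum.
    return min(estimate, MAX_TEMP_FILES)


def settings_cover(total_records, in_memory_record_limit, max_temp_files):
    """Guideline: (InMemoryRecordLimit x MaxNumberOfTempFiles / 2) >= TotalNumberOfRecords."""
    return in_memory_record_limit * max_temp_files / 2 >= total_records


def assign_collection_numbers(groups, initial=1, zero_for_unique=True):
    """Assign a collection number to every record.

    groups: list of lists of record IDs in output order; a list holding a
    single ID stands for a unique record (hypothetical input shape).
    """
    numbers = {}
    next_number = initial
    for group in groups:
        if len(group) == 1 and zero_for_unique:
            # Option checked: unique records get collection number 0.
            numbers[group[0]] = 0
            continue
        # Duplicates share one number; with the option unchecked, unique
        # records take the next number in the same sequence.
        for record in group:
            numbers[record] = next_number
        next_number += 1
    return numbers


if __name__ == "__main__":
    print(estimate_temp_files(1_000_000, 10_000))
    print(settings_cover(1_000_000, 10_000, 200))
    print(assign_collection_numbers([["a"], ["b"], ["c"], ["d", "e"]],
                                    zero_for_unique=False))
```

With the option unchecked, the last call reproduces the first example table in step 9: records a, b, and c receive 1, 2, and 3, and the duplicate pair d and e share 4.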