Configuring Data Deduplication

Data deduplication involves scoring a candidate set of records against a master record to identify possible duplicates and then resolving the duplicates into a single record.

  1. From the Siebel window, click Navigate > Site Map.
  2. Click Administration - PBBI Group 1 Data Quality Administration.
  3. Under the PBBI Group 1 Data Quality Administration heading at the top of the page, click Options Manager.
  4. Configure the options. When you are done, click Save Changes. Use Clear Cache to reset the options.
    Table 1. Data Deduplication Options

    Option

    Description

    Account Deduplication

    Specifies whether to identify duplicate account records. If enabled, the Deduplication applet displays when a user attempts to save a record. It shows the potential duplicates and allows the user to merge or delete records.

    Prospect Deduplication

    Specifies whether deduplication is enabled for prospect records. If enabled, the Deduplication applet displays when a user attempts to save a record. It shows the potential duplicates and allows the user to merge or delete records.

    Contact Deduplication

    Specifies whether deduplication is enabled for contact records. If enabled, the Deduplication applet displays when a user attempts to save a record. It shows the potential duplicates and allows the user to merge or delete records.

    Contact Address Option

    Indicates which type of address to use when deduplicating your contact information. You can choose Business Address or Personal Address. A business address is one used for business purposes. It is associated with a contact's account. A personal address is associated with a contact.

    Deduplication Popup Applet

    Indicates whether the Deduplication applet is enabled for interactive deduplication. The Deduplication applet displays the potential duplicates and allows the user to merge or delete records.

    Interactive Resolution

    Allows you to select how you wish to interact with Siebel to resolve duplicates. You can choose:

    Automatic
    When you select this option, Spectrum™ Technology Platform automatically merges the master record with the candidate record that has the highest score (probability) of being a duplicate, without any user interaction.
    Manual
    When you select this option, you see a list of possible duplicate records. You can then choose to merge a duplicate record with the current record or with the other listed duplicates.
    Note: To avoid errors during automatic merging, press <CTRL-S> to save the record before navigating to another record.
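
    The Automatic choice can be pictured as picking the best-scoring candidate. The sketch below is illustrative only: the function name `pick_merge_candidate` and the score dictionary are assumptions, not part of the Siebel or Spectrum™ Technology Platform API.

```python
from typing import Dict, Optional


def pick_merge_candidate(scores: Dict[str, float]) -> Optional[str]:
    """Return the ID of the candidate with the highest match score.

    `scores` maps a hypothetical candidate record ID to its match score.
    Returns None when there are no candidates to merge.
    """
    if not scores:
        return None
    # max() over the keys, ranked by each key's score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(pick_merge_candidate({"A": 72, "B": 95, "C": 88}))
```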

    Interactive Threshold

    Specifies the minimum match score needed to identify a possible duplicate during interactive processing. The higher the value, the closer the match must be. The default is 50.

    If the match score is greater than the value you enter (which must be between 0 and 100), the record is identified as a possible duplicate and a pop-up window lets the user choose what action to take. The lower the threshold, the more match candidates are displayed.
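
    As a rough illustration of the threshold comparison described above, the following sketch keeps only candidates whose score is greater than the threshold. The function name and the input structure are hypothetical; only the 0 to 100 range, the "greater than" comparison, and the default of 50 come from the option description.

```python
from typing import Dict, List


def find_possible_duplicates(scores: Dict[str, float], threshold: float = 50) -> List[str]:
    """Return candidate IDs whose score exceeds the threshold, best match first."""
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    # Strictly greater than: a score equal to the threshold is not flagged
    hits = [cid for cid, score in scores.items() if score > threshold]
    return sorted(hits, key=lambda cid: scores[cid], reverse=True)


if __name__ == "__main__":
    sample = {"A": 92, "B": 50, "C": 61, "D": 30}
    print(find_possible_duplicates(sample))                # default threshold
    print(find_possible_duplicates(sample, threshold=25))  # lower threshold, more candidates
```

    Lowering the threshold in the second call returns more candidates, which matches the behavior the option describes.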

    Batch Import Resolution

    Specifies how you want to interact with Siebel to resolve duplicates.

    Automatic
    When you select this option, Spectrum™ Technology Platform automatically merges the master record with the candidate record that has the highest score (probability) of being a duplicate, without any user interaction.
    Manual
    When you select this option, you see a list of possible duplicate records. You can then choose to merge a duplicate record with the current record or with the other listed duplicates.

    If you are using Batch Import Resolution or Batch Update Resolution, see Running a Batch Job for information.

    Batch Update Resolution

    Specifies how you want to interact with Siebel to resolve duplicates.

    Automatic
    When you select this option, Spectrum™ Technology Platform automatically merges the master record with the candidate record that has the highest score (probability) of being a duplicate, without any user interaction.
    Manual
    When you select this option, you see a list of possible duplicate records. You can then choose to merge a duplicate record with the current record or with the other listed duplicates.

    Batch Import Threshold

    Specifies the minimum match score needed to identify a possible duplicate record during EAI processing.

    If the score produced by the match attempt is greater than the value you specify (must be between 0 and 100), then the record is considered a match candidate. The record is updated with the candidate record that has the greatest score.

    Batch Update Threshold

    Specifies the minimum match score needed to identify a duplicate record during batch processing. The higher the value, the closer the match must be. The default is 50.

    If the score produced by the comparison of the records is greater than the value you entered (must be between 0 and 100), then the records are considered duplicates.

    Account Name Treatment

    Determines how the name parser should treat the account name. One of the following:

    Company
    Assumes that all names are companies.
    Analyze
    Assumes that all names are persons.
    Name Parser
    Analyzes the data to determine if it is the name of a company or a person.

    Contact Name Treatment

    Determines how the name parser should treat the contact name. One of the following:

    Company
    Assumes that all names are companies.
    Analyze
    Assumes that all names are persons.
    Name Parser
    Analyzes the data to determine if it is the name of a company or a person.

    Prospect Name Treatment

    Determines how the name parser should treat the prospect name. One of the following:

    Company
    Assumes that all names are companies.
    Analyze
    Assumes that all names are persons.
    Name Parser
    Analyzes the data to determine if it is the name of a company or a person.

    Intelligent Merge of Duplicates

    Specifies whether empty fields in the surviving record can be filled with non-empty values from the duplicate when two potential duplicate records are merged. If Intelligent Merge is not enabled, you risk losing phone numbers and e-mail information when records are merged.

    For Account Business Component, the following fields are copied to the surviving record:

    • Main Phone Number
    • Main Fax Number
    • Type
    • URL
    • Account Status

    For Contact Business Component, the following fields are copied to the surviving record:

    • Fax Phone #
    • Work Phone #
    • Home Phone #
    • Alternate Phone #
    • Assistant Phone #
    • Cellular Phone #
    • Email Address
    • Comment

    For Prospect, the following fields are copied to the surviving record:

    • Fax Phone #
    • Work Phone #
    • Home Phone #
    • Job Title
    • Email Address
    • Time Zone
    • Comment
    • Preferred Contact Method
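
    The fill-in behavior of Intelligent Merge can be sketched as follows. This is a simplified model under stated assumptions: records are plain dictionaries, a field counts as empty when it is missing, `None`, or an empty string, and the function name `intelligent_merge` is hypothetical. The field names in `ACCOUNT_FIELDS` are taken from the Account list above.

```python
from typing import Dict, Iterable, Optional

# Account fields listed in the option description
ACCOUNT_FIELDS = ("Main Phone Number", "Main Fax Number", "Type", "URL", "Account Status")

Record = Dict[str, Optional[str]]


def intelligent_merge(survivor: Record, duplicate: Record, fields: Iterable[str]) -> Record:
    """Copy a field from the duplicate only when the survivor's field is empty."""
    merged = dict(survivor)  # do not modify the caller's record
    for field in fields:
        if not merged.get(field) and duplicate.get(field):
            merged[field] = duplicate[field]
    return merged


if __name__ == "__main__":
    survivor = {"Name": "Acme", "Main Phone Number": "", "URL": "acme.example"}
    duplicate = {"Name": "ACME Inc", "Main Phone Number": "555-0100",
                 "URL": "other.example", "Type": "Customer"}
    print(intelligent_merge(survivor, duplicate, ACCOUNT_FIELDS))
```

    In this sketch the survivor keeps its own URL because that field was not empty, while the empty phone number and the missing Type are filled from the duplicate.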

    Deduplication Address Option

    Indicates which addresses to use for deduplication. One of the following:

    Primary to Primary
    Compare the records using the primary address of the master and candidate records.
    Active to Primary
    Compare the records using the active address of the master record and the primary address of the candidate records.
    Active to All
    Compare the records using the active address of the master record and all the addresses of the candidate records.
    All to All
    Compare the records using all the addresses of the master record and all the addresses of the candidate records.
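
    To make the four choices concrete, the sketch below lists which address pairs would be compared under each one. The record layout (`primary`, `active`, and `all` keys) and the function name are assumptions made for illustration; the option names are the ones listed above.

```python
from itertools import product
from typing import Dict, List, Tuple

AddressPair = Tuple[str, str]


def address_pairs(option: str, master: Dict, candidate: Dict) -> List[AddressPair]:
    """Return (master address, candidate address) pairs to compare for an option."""
    if option == "Primary to Primary":
        return [(master["primary"], candidate["primary"])]
    if option == "Active to Primary":
        return [(master["active"], candidate["primary"])]
    if option == "Active to All":
        return [(master["active"], addr) for addr in candidate["all"]]
    if option == "All to All":
        # Every master address against every candidate address
        return list(product(master["all"], candidate["all"]))
    raise ValueError(f"unknown option: {option}")


if __name__ == "__main__":
    master = {"primary": "1 Main St", "active": "9 Oak Ave",
              "all": ["1 Main St", "9 Oak Ave"]}
    candidate = {"primary": "5 Elm Rd", "all": ["5 Elm Rd", "7 Pine Ln"]}
    for name in ("Primary to Primary", "Active to Primary", "Active to All", "All to All"):
        print(name, address_pairs(name, master, candidate))
```

    The number of comparisons grows from one pair for the first two options to the full cross product for All to All, so the broader options do more work per candidate.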

    Survivorship Date Criterion

    Indicates the order in which duplicate records are merged into the survivor record.

    Newest
    The newest duplicate record is merged first.
    Oldest
    The oldest duplicate record is merged first.
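
    The merge order can be illustrated with a simple sort. Which date field the product uses is not stated here, so the `created` key below is an assumption, as is the function name `merge_order`.

```python
from datetime import date
from typing import Dict, List


def merge_order(duplicates: List[Dict], criterion: str = "Newest") -> List[Dict]:
    """Return duplicates in the order they would be merged into the survivor."""
    if criterion not in ("Newest", "Oldest"):
        raise ValueError("criterion must be 'Newest' or 'Oldest'")
    # 'created' is a hypothetical date field used only for this sketch
    return sorted(duplicates, key=lambda rec: rec["created"],
                  reverse=(criterion == "Newest"))


if __name__ == "__main__":
    dups = [
        {"id": "D1", "created": date(2019, 3, 1)},
        {"id": "D2", "created": date(2021, 7, 15)},
        {"id": "D3", "created": date(2020, 1, 9)},
    ]
    print([rec["id"] for rec in merge_order(dups, "Newest")])
    print([rec["id"] for rec in merge_order(dups, "Oldest")])
```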

    Survivorship Status Criterion

    If enabled, merging an Active record to an Inactive record is not allowed. Only the following merging scenarios are allowed:

    • Active to Active
    • Inactive to Inactive
    • Inactive to Active
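
    The rule above reduces to a single check, sketched here with a hypothetical function name and plain status strings. The first argument is the record being merged and the second is the record it is merged into.

```python
def merge_allowed(source_status: str, target_status: str, criterion_enabled: bool = True) -> bool:
    """Return True if merging the source record into the target record is permitted."""
    if not criterion_enabled:
        return True  # with the criterion disabled, status does not restrict merging
    # Only Active -> Inactive is blocked
    return not (source_status == "Active" and target_status == "Inactive")


if __name__ == "__main__":
    for src, dst in [("Active", "Active"), ("Inactive", "Inactive"),
                     ("Inactive", "Active"), ("Active", "Inactive")]:
        print(f"{src} to {dst}: {merge_allowed(src, dst)}")
```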

    Generate Match and Search Key Option

    Specifies which characters to use to generate the match key and search key.

    Substring
    Use the first few characters of the record to generate the match key and search key.
    Consonant
    Use just the consonants to generate the match key and search key.
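
    The difference between the two key styles can be sketched as below. The actual key length and normalization rules used by the product are not given here, so the default length of 5, the upper-casing, the removal of non-alphanumeric characters, and the function name `make_key` are all assumptions for illustration.

```python
def make_key(value: str, option: str, length: int = 5) -> str:
    """Build an illustrative match/search key from a field value."""
    # Assumed normalization: upper-case and keep letters and digits only
    chars = [ch for ch in value.upper() if ch.isalnum()]
    if option == "Substring":
        # First few characters of the value
        return "".join(chars[:length])
    if option == "Consonant":
        # Letters that are not vowels; digits are dropped
        return "".join(ch for ch in chars if ch.isalpha() and ch not in "AEIOU")
    raise ValueError(f"unknown option: {option}")


if __name__ == "__main__":
    print(make_key("Pitney Bowes", "Substring"))
    print(make_key("Pitney Bowes", "Consonant"))
```

    A substring key groups records that begin the same way, while a consonant key tolerates vowel differences such as "Smith" and "Smyth" only where the consonants still agree.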