Configuring Data Deduplication
Data deduplication involves scoring a candidate set of records against a master record to identify possible duplicates and then resolving the duplicates into a single record.
- From the Siebel window, click Navigate > Site Map.
- Click Administration - PBBI Group 1 Data Quality Administration.
- Under the PBBI Group 1 Data Quality Administration heading at the top of the page, click Options Manager.
- Configure the options. When you are done, click Save Changes. Use Clear Cache to reset the options.
Table 1. Data Deduplication Options
Option
Description
Account Deduplication
Specifies whether to identify duplicate account records. If enabled, the Deduplication applet displays when a user attempts to save a record. It shows the potential duplicates and allows the user to merge or delete records.
Prospect Deduplication
Specifies whether deduplication is enabled for prospect records. If enabled, the Deduplication applet displays when a user attempts to save a record. It shows the potential duplicates and allows the user to merge or delete records.
Contact Deduplication
Specifies whether deduplication is enabled for contact records. If enabled, the Deduplication applet displays when a user attempts to save a record. It shows the potential duplicates and allows the user to merge or delete records.
Contact Address Option
Indicates which type of address to use when deduplicating your contact information. You can choose Business Address or Personal Address. A business address is one used for business purposes and is associated with a contact's account. A personal address is associated with a contact.
Deduplication Popup Applet
Indicates whether the Deduplication applet is enabled for interactive deduplication. The Deduplication applet displays the potential duplicates and allows the user to merge or delete records.
Interactive Resolution
Allows you to select how you wish to interact with Siebel to resolve duplicates. You can choose:
- Automatic
- When you select this option, Spectrum™ Technology Platform automatically merges the master record with the candidate record that has the highest match score (probability of being a duplicate), without any user interaction.
- Manual
- When you select this option, you will see a list of possible duplicate records. Then you will have the choice to merge the duplicate record with the current record or to merge it with the other listed duplicates.
Note: To avoid errors during automatic merging, press <CTRL-S> to save the record before navigating to another record.
Interactive Threshold
Specifies the minimum match score needed to identify a possible duplicate during interactive processing. The higher the value, the closer the match must be. The default is 50.
If the score produced by the match attempt is greater than this value (which must be between 0 and 100), the record is identified as a duplicate and a pop-up window lets the user choose the action to take. The lower the threshold, the more match candidates are displayed.
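The threshold rule above can be sketched as a simple filter. This is only an illustration: the function name `possible_duplicates` and the `(record_id, score)` tuple layout are hypothetical, not part of Siebel or Spectrum™ Technology Platform.

```python
def possible_duplicates(candidates, threshold=50):
    """Return the candidates whose score is greater than the threshold.

    candidates: list of (record_id, score) tuples, score in 0..100
                (hypothetical layout, for illustration only).
    threshold:  minimum match score; 50 mirrors the documented default.
    """
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    # Keep only scores strictly greater than the threshold.
    hits = [c for c in candidates if c[1] > threshold]
    # Show the closest matches first.
    return sorted(hits, key=lambda c: c[1], reverse=True)


candidates = [("A", 92), ("B", 50), ("C", 61)]
print(possible_duplicates(candidates))                # default threshold
print(possible_duplicates(candidates, threshold=40))  # lower threshold, more candidates
```

Lowering the threshold from 50 to 40 admits record B, which shows how a lower value produces more match candidates.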
Batch Import Resolution
Specifies how you want to interact with Siebel to resolve duplicates.
- Automatic
- When you select this option, Spectrum™ Technology Platform automatically merges the master record with the candidate record that has the highest match score (probability of being a duplicate), without any user interaction.
- Manual
- When you select this option, you will see a list of possible duplicate records. Then you will have the choice to merge the duplicate record with the current record or to merge it with the other listed duplicates.
If you are using Batch Import Resolution or Batch Update Resolution, see Running a Batch Job for information.
Batch Update Resolution
Specifies how you want to interact with Siebel to resolve duplicates.
- Automatic
- When you select this option, Spectrum™ Technology Platform automatically merges the master record with the candidate record that has the highest match score (probability of being a duplicate), without any user interaction.
- Manual
- When you select this option, you will see a list of possible duplicate records. Then you will have the choice to merge the duplicate record with the current record or to merge it with the other listed duplicates.
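The difference between the Automatic and Manual resolution settings can be sketched as follows. The `resolve` function and its return dictionaries are hypothetical and only model the decision described above; they do not reflect how Siebel or Spectrum™ Technology Platform implements it.

```python
def resolve(master_id, candidates, mode):
    """Model the two resolution modes.

    candidates: list of (record_id, score) tuples that already passed
                the threshold (hypothetical layout).
    mode:       "Automatic" or "Manual".
    """
    if not candidates:
        return {"action": "none", "master": master_id}
    if mode == "Automatic":
        # Merge with the single highest-scoring candidate, no user input.
        best_id, _ = max(candidates, key=lambda c: c[1])
        return {"action": "merge", "master": master_id, "duplicate": best_id}
    if mode == "Manual":
        # Hand the full list to the user, best match first.
        ordered = sorted(candidates, key=lambda c: c[1], reverse=True)
        return {"action": "review", "master": master_id, "choices": ordered}
    raise ValueError(f"unknown resolution mode: {mode!r}")


found = [("C-17", 61), ("C-04", 92)]
print(resolve("C-01", found, "Automatic"))
print(resolve("C-01", found, "Manual"))
```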
Batch Import Threshold
Specifies the minimum match score needed to identify a possible duplicate record during EAI processing.
If the score produced by the match attempt is greater than the value you specify (must be between 0 and 100), then the record is considered a match candidate. The record is updated with the candidate record that has the greatest score.
Batch Update Threshold
Specifies the minimum match score needed to identify a duplicate record during batch processing. The higher the value, the closer the match must be. The default is 50.
If the score produced by the comparison of the records is greater than the value you entered (must be between 0 and 100), then the records are considered duplicates.
Account Name Treatment
Determines how the name parser should treat the account name. One of the following:
- Company
- Assumes that all names are companies.
- Analyze
- Assumes that all names are persons.
- Name Parser
- Analyzes the data to determine if it is the name of a company or a person.
Contact Name Treatment
Determines how the name parser should treat the contact name. One of the following:
- Company
- Assumes that all names are companies.
- Analyze
- Assumes that all names are persons.
- Name Parser
- Analyzes the data to determine if it is the name of a company or a person.
Prospect Name Treatment
Determines how the name parser should treat the prospect name. One of the following:
- Company
- Assumes that all names are companies.
- Analyze
- Assumes that all names are persons.
- Name Parser
- Analyzes the data to determine if it is the name of a company or a person.
Intelligent Merge of Duplicates
Specifies whether to allow empty fields to be replaced with non-empty fields when merging two potential duplicate records. Without Intelligent Merge enabled, you risk losing phone numbers and e-mail information when records are merged.
For Account Business Component, the following fields are copied to the surviving record:
- Main Phone Number
- Main Fax Number
- Type
- URL
- Account Status
For Contact Business Component, the following fields are copied to the surviving record:
- Fax Phone #
- Work Phone #
- Home Phone #
- Alternate Phone #
- Assistant Phone #
- Cellular Phone #
- Email Address
- Comment
For Prospect, the following fields are copied to the surviving record:
- Fax Phone #
- Work Phone #
- Home Phone #
- Job Title
- Email Address
- Time Zone
- Comment
- Preferred Contact Method
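The Intelligent Merge behavior described above, where empty fields on the surviving record are filled from the duplicate, can be sketched like this. The `intelligent_merge` function and the dictionary record layout are assumptions for illustration; only the field names come from the lists above.

```python
# Field names taken from the Account Business Component list above.
ACCOUNT_FIELDS = ["Main Phone Number", "Main Fax Number", "Type", "URL", "Account Status"]


def intelligent_merge(survivor, duplicate, fields):
    """Return a copy of survivor with empty fields filled from duplicate.

    A field counts as empty when it is missing, None, or an empty string.
    Non-empty survivor values are never overwritten.
    """
    merged = dict(survivor)
    for name in fields:
        if not merged.get(name) and duplicate.get(name):
            merged[name] = duplicate[name]
    return merged


survivor = {"Main Phone Number": "", "URL": "example.com", "Type": None}
duplicate = {"Main Phone Number": "555-0100", "URL": "other.example", "Type": "Customer"}
print(intelligent_merge(survivor, duplicate, ACCOUNT_FIELDS))
```

In this sketch the survivor keeps its own URL but gains the phone number and type that would otherwise be lost with the duplicate.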
Deduplication Address Option
Indicates which addresses to use for deduplication. One of the following:
- Primary to Primary
- Compare the records using the primary address of the master and candidate records.
- Active to Primary
- Compare the records using the active address of the master record and the primary address of the candidate records.
- Active to All
- Compare the records using the active address of the master record and all the addresses of the candidate records.
- All to All
- Compare the records using all the addresses of the master record and all the addresses of the candidate records.
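The four address options differ only in which addresses from each side are paired for comparison. A minimal sketch, assuming each record is a dictionary with hypothetical `primary`, `active`, and `all` keys:

```python
from itertools import product


def address_pairs(master, candidate, option):
    """List the (master address, candidate address) pairs to compare.

    master / candidate: dicts with "primary" (str), "active" (str) and
    "all" (list of str). This layout is an assumption for illustration.
    """
    sides = {
        "Primary to Primary": ([master["primary"]], [candidate["primary"]]),
        "Active to Primary": ([master["active"]], [candidate["primary"]]),
        "Active to All": ([master["active"]], candidate["all"]),
        "All to All": (master["all"], candidate["all"]),
    }
    if option not in sides:
        raise ValueError(f"unknown address option: {option!r}")
    left, right = sides[option]
    # Every address on the left is compared with every address on the right.
    return list(product(left, right))


master = {"primary": "1 Main St", "active": "9 Oak Ave", "all": ["1 Main St", "9 Oak Ave"]}
candidate = {"primary": "5 Elm Rd", "active": "5 Elm Rd", "all": ["5 Elm Rd", "7 Pine Ln"]}
print(address_pairs(master, candidate, "Active to All"))
```

The number of comparisons grows from one pair (Primary to Primary) to the full cross product (All to All), so the broader options cost more per candidate.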
Survivorship Date Criterion
Indicates the order in which duplicate records are merged into the survivor record.
- Newest
- The newest duplicate record is merged first.
- Oldest
- The oldest duplicate record is merged first.
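The merge order can be pictured as a sort on a record date. Which date field the product actually uses is not stated here, so the date in this sketch is a stand-in, and `merge_order` is a hypothetical helper.

```python
from datetime import date


def merge_order(duplicates, criterion):
    """Order duplicates for merging into the survivor record.

    duplicates: list of (record_id, record_date) tuples; the date field
                is an assumption for illustration.
    criterion:  "Newest" (newest merged first) or "Oldest".
    """
    if criterion not in ("Newest", "Oldest"):
        raise ValueError(f"unknown criterion: {criterion!r}")
    newest_first = criterion == "Newest"
    return sorted(duplicates, key=lambda d: d[1], reverse=newest_first)


dups = [("D1", date(2020, 5, 1)), ("D2", date(2023, 1, 15)), ("D3", date(2018, 9, 30))]
print([rid for rid, _ in merge_order(dups, "Newest")])
print([rid for rid, _ in merge_order(dups, "Oldest")])
```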
Survivorship Status Criterion
If enabled, merging an Active record to an Inactive record is not allowed. Only the following merging scenarios are allowed:
- Active to Active
- Inactive to Inactive
- Inactive to Active
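The allowed scenarios amount to a small lookup of (merged record status, target record status) pairs. The sketch below uses a hypothetical `merge_allowed` helper to express that rule.

```python
# (status of the record being merged, status of the record it merges into)
ALLOWED_MERGES = {
    ("Active", "Active"),
    ("Inactive", "Inactive"),
    ("Inactive", "Active"),
}


def merge_allowed(source_status, target_status, criterion_enabled=True):
    """Return True if the merge is permitted under the status criterion.

    With the criterion disabled, status does not restrict merging.
    """
    if not criterion_enabled:
        return True
    return (source_status, target_status) in ALLOWED_MERGES


print(merge_allowed("Inactive", "Active"))
print(merge_allowed("Active", "Inactive"))
```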
Generate Match and Search Key Option
Specifies which characters to use to generate the match key and search key.
- Substring
- Use the first few characters of the record to generate the match key and search key.
- Consonant
- Use just the consonants to generate the match key and search key.
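To make the two key-generation choices concrete, here is a rough sketch. The exact algorithm, the key length, and the handling of spaces and punctuation are not documented here, so the `match_key` function, its `length` parameter, and the normalization step are all assumptions.

```python
def match_key(value, method, length=5):
    """Build an illustrative match/search key from a field value.

    method: "Substring" keeps the first `length` characters;
            "Consonant" drops the vowels A, E, I, O, U.
    The normalization (uppercase, letters and digits only) and the
    default length of 5 are assumptions, not product behavior.
    """
    text = "".join(ch for ch in value.upper() if ch.isalnum())
    if method == "Substring":
        return text[:length]
    if method == "Consonant":
        return "".join(ch for ch in text if ch not in "AEIOU")
    raise ValueError(f"unknown key method: {method!r}")


print(match_key("Pitney Bowes", "Substring"))
print(match_key("Pitney Bowes", "Consonant"))
```

A consonant-based key tolerates vowel misspellings (for example "Pitney" and "Pitnay" yield the same key), while a substring key tolerates differences that occur late in the value.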