Guidelines to Improve Prediction Accuracy
For the most accurate prediction of address components, make sure your input address strings follow these guidelines.
Guidelines for Australia Addresses
- Avoid non-address components
- The presence of non-address components in the input string might lead to an incorrect prediction. Remove such components before submitting the string for prediction.
- Maintain a sequence in address components
- The address components should be placed in this order: .
- Remove redundant address components
- The input address string should not contain repeated address components, such as two different organization names or the same organization name repeated in one string.
- Do not merge components in address strings
- Merged address components result in incorrect predictions.
- Avoid addressee names in the string
- An addressee name in the string results in incorrect predictions for Australia addresses.
- Do not enclose address components in brackets "()"
- Any address component enclosed in brackets "()" is left unparsed.
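The bracket guideline can be handled with a small preprocessing step before the string is sent for prediction. The following is a minimal sketch, assuming the bracketed text is a genuine address component that you want to keep; the function name unwrap_brackets is hypothetical and is not part of the address parser.

```python
import re


def unwrap_brackets(address: str) -> str:
    """Replace round brackets with spaces and collapse repeated whitespace.

    Assumption: the text inside the brackets is a real address component
    that should stay in the string. If it is a non-address note, remove
    the whole bracketed segment instead.
    """
    without_brackets = re.sub(r"[()]", " ", address)
    return " ".join(without_brackets.split())


print(unwrap_brackets("46 Charlotte St (Brisbane) 4000 AUS"))
```

The same step applies to the other countries listed here that share the bracket guideline.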
Limitations for Australia Addresses
- PO Box addresses are not supported.
- Sentence-specific addresses (for example, addresses containing "close to", "between", or "nearby") are not supported.
Example: Tourquay Road Close To Butcher Shop Hervey Bay QLD 4655 AUS
- Addresses containing roads with "and" or "&" are not supported.
Example: Corner Farrall Road and O'Connor Road Stratton 6056 AUS
- Addresses with a complex street format (for example, extra street information such as a tower, park, or building) are not supported.
Example: Wesfarmers Limited Level 14 Brookfield Place Tower 2 123 St Georges Terrace Perth 6000 AUS
- Unit or street components written in character format (words rather than digits) are not supported.
Example: Ground Floor 46 Charlotte St Brisbane 4000 AUS
- Avoid repeating a word across the organization name, state, and country in an address (for example, Australia or QLD).
Example: DOF Subsea Australia Pty Ltd 5th FL 181 St Georges TCE Perth Western Australia 6000 AUS
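Some of the limitations above can be screened for before an address is submitted. The sketch below is a rough heuristic, not the parser's own logic: the function name find_au_limitations and the regular expressions are assumptions made for illustration, and they cover only the PO Box, sentence-specific, and "and"/"&" cases.

```python
import re

# Hypothetical screening patterns; these are assumptions, not parser rules.
AU_UNSUPPORTED_PATTERNS = {
    "PO Box address": re.compile(r"\bP\.?\s*O\.?\s*Box\b", re.IGNORECASE),
    "sentence-specific wording": re.compile(
        r"\b(close to|between|nearby)\b", re.IGNORECASE
    ),
    "roads joined with 'and' or '&'": re.compile(r"\band\b|&", re.IGNORECASE),
}


def find_au_limitations(address: str) -> list[str]:
    """Return a label for every unsupported pattern found in the string."""
    return [
        label
        for label, pattern in AU_UNSUPPORTED_PATTERNS.items()
        if pattern.search(address)
    ]


print(find_au_limitations(
    "Tourquay Road Close To Butcher Shop Hervey Bay QLD 4655 AUS"
))
print(find_au_limitations(
    "Corner Farrall Road and O'Connor Road Stratton 6056 AUS"
))
```

Because the check is word-based, it also flags an organization name that legitimately contains "and", so treat a match as a prompt for manual review rather than a rejection.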
Guidelines for Canada Addresses
- Avoid non-address components
- The presence of non-address components in the input string might lead to an incorrect prediction. Remove such components before submitting the string for prediction.
- Maintain a sequence in address components
- The address components should be placed in this order: .
- Remove redundant address components
- The input address string should not contain repeated address components, such as two different organization names or the same organization name repeated in one string.
- Do not merge components in address strings
- Merged address components result in incorrect predictions.
- Avoid addressee names in the string
- An addressee name in the string results in incorrect predictions for Canada addresses.
- Do not enclose address components in brackets "()"
- Any address component enclosed in brackets "()" is left unparsed.
Limitations for Canada Addresses
- Unit or apartment information is not supported.
- French characters present in the address are not displayed correctly.
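If plain ASCII output is acceptable for your use case, one possible workaround for the French-character limitation is to fold accented letters to their base letters before prediction. This is an assumption about what helps, not documented parser behavior; fold_accents is a hypothetical helper and the sample address is made up.

```python
import unicodedata


def fold_accents(address: str) -> str:
    """Strip combining accents so that, for example, 'é' becomes 'e'."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", address)
    # Drop the combining marks and keep everything else unchanged.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("1234 Rue Sainte-Hélène Montréal QC"))
```

Folding changes the spelling of the returned components, so keep the original string if the accented form is needed downstream.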
Guidelines for France Addresses
- Avoid non-address components
- The presence of non-address components in the input string might lead to an incorrect prediction. Remove such components before submitting the string for prediction.
- Maintain a sequence in address components
- The address components should be placed in this order: .
- Remove redundant address components
- The input address string should not contain repeated address components, such as two different organization names or the same organization name repeated in one string.
- Do not merge components in address strings
- Merged address components result in incorrect predictions.
- Avoid addressee names in the string
- An addressee name in the string results in incorrect predictions for France addresses.
- Do not enclose address components in brackets "()"
- Any address component enclosed in brackets "()" is left unparsed.
Limitations for France Addresses
- Street names that include a city name are not supported.
Example: 14 Rue de Maule 78870 Bailly France
- Addresses in the overseas regions of France (for example, Martinique, Réunion, and Guadeloupe) are parsed incorrectly.
Guidelines for German Addresses
- Avoid non-address components
- The presence of non-address components in the input string might lead to an incorrect prediction. Remove such components before submitting the string for prediction.
- Maintain a sequence in address components
- The address components should be placed in this order: .
- Remove redundant address components
- The input address string should not contain repeated address components, such as two different organization names or the same organization name repeated in one string.
- Ensure address number and street name are included
- Your address string must include an address number and a street name. Omitting these essential address components reduces the accuracy of the result.
- Do not merge components in address strings
- Merged address components result in incorrect predictions.
- Avoid addressee names in the string
- An addressee name in the string results in incorrect predictions for German addresses.
- Do not enclose address components in brackets "()"
- Any address component enclosed in brackets "()" is left unparsed.
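The requirement for an address number can be checked roughly before submission. The sketch below assumes that an address number is a token of one to four digits, optionally followed by a single letter, so that a five-digit postal code is not mistaken for it. The helper has_address_number and the sample addresses are hypothetical, and the check does not verify that a street name is present.

```python
import re

# Assumption: 1-4 digits plus an optional letter (for example "12" or "12a").
HOUSE_NUMBER = re.compile(r"^\d{1,4}[a-zA-Z]?$")


def has_address_number(address: str) -> bool:
    """Return True if any token looks like an address number."""
    tokens = address.replace(",", " ").split()
    return any(HOUSE_NUMBER.match(token) for token in tokens)


print(has_address_number("Hauptstraße 12 10115 Berlin"))
print(has_address_number("Hauptstraße 10115 Berlin"))
```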
Guidelines for Spain Addresses
- Avoid non-address components
- The presence of non-address components in the input string might lead to an incorrect prediction. Remove such components before submitting the string for prediction.
- Maintain a sequence in address components
- The address components should be placed in this order: .
- Remove redundant address components
- The input address string should not contain repeated address components, such as two different organization names or the same organization name repeated in one string.
- Do not merge components in address strings
- Merged address components result in incorrect predictions.
- Avoid addressee names in the string
- An addressee name in the string results in incorrect predictions for Spain addresses.
- Do not enclose address components in brackets "()"
- Any address component enclosed in brackets "()" is left unparsed.
Limitations for Spain Addresses
- Addresses starting with an abbreviation (for example, PL., Av., BL., C.) are not supported.
- Street names that include landmark information (for example, At, Near, Between) are not supported.
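Both Spain limitations depend on where or whether certain words appear, so they can be screened for with a short check. This is a sketch under stated assumptions: the function find_es_limitations, the patterns, and the sample addresses are hypothetical, and only the abbreviations and landmark words listed above are covered.

```python
import re

# Matches only when the string begins with one of the listed abbreviations.
LEADING_ABBREVIATION = re.compile(r"^\s*(PL|AV|BL|C)\.", re.IGNORECASE)
# Matches the listed landmark words anywhere in the string.
LANDMARK_WORD = re.compile(r"\b(at|near|between)\b", re.IGNORECASE)


def find_es_limitations(address: str) -> list[str]:
    """Return a label for each Spain limitation the string appears to hit."""
    issues = []
    if LEADING_ABBREVIATION.match(address):
        issues.append("starts with an abbreviation")
    if LANDMARK_WORD.search(address):
        issues.append("contains landmark wording")
    return issues


print(find_es_limitations("Av. Diagonal 640 08017 Barcelona"))
print(find_es_limitations("Calle Mayor 10 Near Plaza Mayor 28013 Madrid"))
```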
Guidelines for United Kingdom Addresses
- Avoid non-address components
- The presence of non-address components in the input string might lead to an incorrect prediction. Remove such components before submitting the string for prediction.
- Maintain a sequence in address components
- The address components should be placed in this order: .
- Remove redundant address components
- The input address string should not contain repeated address components, such as two different organization names or the same organization name repeated in one string.
- Follow single-token organization names with organization type
- A single-token organization name should be followed by the type of the organization, such as Ltd, Inc, or Reg. For example, if the single-token organization name Ardian is not followed by the type "Limited," the results may be inaccurate.
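When the organization name is available as a separate value before the address string is assembled, the single-token guideline can be checked with a few lines of code. The sketch below is illustrative only: needs_org_type is a hypothetical helper, and the set of organization types contains just the ones named in this section.

```python
# Organization types mentioned in this section, lowercased for comparison.
ORG_TYPES = {"ltd", "limited", "inc", "reg"}


def needs_org_type(org_name: str) -> bool:
    """Return True for a single-token name that has no organization type."""
    tokens = [token.strip(".,").lower() for token in org_name.split()]
    name_tokens = [t for t in tokens if t and t not in ORG_TYPES]
    has_type = any(t in ORG_TYPES for t in tokens)
    return len(name_tokens) == 1 and not has_type


print(needs_org_type("Ardian"))
print(needs_org_type("Ardian Limited"))
```

A True result suggests appending the organization type to the name before building the address string.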
Limitations for United Kingdom Addresses
Address strings of any of the following kinds are likely to be predicted inaccurately by the address parser. Watch out for them in your input.
- Another address component appears in the name of the organization
- If the name of the organization includes any other address component, such as Floor, Flat, or House, the prediction accuracy may be affected.
- Organization name contains numbers
- If an organization name contains numbers, it may be predicted incorrectly.
Guidelines for United States Addresses
- Avoid non-address components
- The presence of non-address components in the input string might lead to an incorrect prediction. Remove such components before submitting the string for prediction.
- Maintain a sequence in address components
- The address components should be placed in this order: .
- Remove redundant address components
- The input address string should not contain repeated address components, such as two different organization names or the same organization name repeated in one string.
- Do not merge components in address strings
- Merged address components result in incorrect predictions.
- Avoid addressee names in the string
- An addressee name in the string results in incorrect predictions for United States addresses.
- Do not enclose address components in brackets "()"
- Any address component enclosed in brackets "()" is left unparsed.
Limitations for United States Addresses
- PO Box addresses are not supported.
- In Care of (C/O) addresses are not supported.
- If AddressNumber is missing, StreetNumber may be returned as AddressNumber (only in cases of numeric digits without superscripts).
- Direction may be returned in StateProvince for certain addresses (especially where Direction consists of two letters).