Optimizing Geocoding
Geocoding stages perform best when the input records are sorted by postal code, because of the way the reference data is loaded into memory. Sorted input can sometimes be processed several times faster than unsorted input. Because some records will not contain data in the postal code field, the following sort order is recommended:
- PostalCode
- StateProvince
- City
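The recommended sort order can be sketched in Python as a pre-processing step applied before records reach the geocoding stage. This is a minimal illustration, not part of the product: it assumes each record is a dictionary whose keys match the field names above, and the function name sort_for_geocoding is hypothetical. Missing or empty values are treated as empty strings, so records without a postal code are grouped together and still ordered by state or province and then by city.

```python
def sort_for_geocoding(records):
    """Return a new list ordered by PostalCode, then StateProvince, then City.

    Assumes each record is a dict; missing or None fields sort as "".
    """
    fields = ("PostalCode", "StateProvince", "City")

    def sort_key(record):
        # Normalize case and whitespace so "il" and "IL " sort together.
        return tuple((record.get(name) or "").strip().upper() for name in fields)

    return sorted(records, key=sort_key)
```

Because a tuple key compares field by field, StateProvince and City only decide the order among records that share the same postal code, including those with none.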
You can also optimize geocoding stages by experimenting with different match modes. The match mode controls how the geocoding stage determines whether a geocoding result is a close match. Consider setting the match mode to Relaxed and checking whether the results meet your requirements. The Relaxed mode generally performs better than the other match modes.
Optimizing Geocode US Address
The Geocode US Address stage has several options that affect performance. These options are in this file:
SpectrumLocation\server\modules\geostan\java.properties
- egm.us.multimatch.max.records: Specifies the maximum number of matches to return. A smaller number results in better performance, but at the expense of matches.
- egm.us.multimatch.max.processing: Specifies the number of searches to perform. A smaller number results in better performance, but at the expense of matches.
- FileMemoryLimit: Controls how much of the reference data is initially loaded into memory.
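Before changing any of these options, it can help to record their current values. The following Python sketch extracts the three options from the contents of the properties file. It assumes the file uses ordinary Java properties syntax (key=value lines, with comments starting with # or !), which is not confirmed here. The function name parse_tuning_options is hypothetical, and the sketch does not suggest values for any option.

```python
# Option names taken from the list above.
TUNING_KEYS = (
    "egm.us.multimatch.max.records",
    "egm.us.multimatch.max.processing",
    "FileMemoryLimit",
)


def parse_tuning_options(text):
    """Return a dict of the tuning options found in properties-file text.

    Assumes simple key=value lines; continuation lines are not handled.
    """
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#!":
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip() in TUNING_KEYS:
            options[key.strip()] = value.strip()
    return options
```

To use it, read the file named above into a string (for example with pathlib.Path(path).read_text(encoding="latin-1")) and pass the result to parse_tuning_options. Options that are absent from the file are simply missing from the returned dictionary.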