Using geocoding features

This section contains information on geocoding concepts used by GeoStan.

Note: For users of Master Location Data, additional options are available. See Using Master Location Data.

Using geocode placement options

You can set the following options that affect how GeoStan calculates the geocode for an interpolated street match.

Option: Offset
Description: Ensures the point does not reside in the middle of a street. Moves the point perpendicular to the portion of the street segment in which it lands by the value you specify. The default is 50 feet for the best visual representation for mapping packages.
Search types: Address, Reverse Geocoding

Option: Centerline Offset
Description: Similar to Offset, but used only with centerline matching. Moves the point from the street centerline toward the parcel point by the value you specify. This is useful for routing applications. The default is 0 feet to return the street centerline geocode.
Search types: Address

Option: Squeeze
Description: Ensures the point does not reside in an intersection or too close to the end of a street. Using a squeeze distance moves both street end points closer to the center of the segment by the value you indicate.
Search types: Address, Reverse Geocoding
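The combined effect of Squeeze and Offset can be sketched geometrically. The following is a minimal Python illustration on a straight segment in a flat coordinate system measured in feet. The function name, its parameters, and the way the interpolation fraction is rescaled after squeezing are assumptions made for illustration; this is not GeoStan's implementation.

```python
import math

def place_geocode(start, end, fraction, offset_ft=50.0, squeeze_ft=0.0, side="L"):
    """Place a point along a straight segment, then apply squeeze and offset.

    start, end: (x, y) in feet (hypothetical planar coordinates).
    fraction:   interpolated position along the segment, 0.0 to 1.0.
    side:       "L" or "R" of the direction of travel from start to end.
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("segment has zero length")
    ux, uy = dx / length, dy / length  # unit vector along the segment

    # Squeeze: pull both end points toward the segment center, so the
    # usable range shrinks from [0, length] to [squeeze, length - squeeze].
    squeeze = min(squeeze_ft, length / 2.0)
    along = squeeze + fraction * (length - 2.0 * squeeze)

    # Offset: move perpendicular to the segment, to the left or right.
    nx, ny = (-uy, ux) if side == "L" else (uy, -ux)
    return (start[0] + ux * along + nx * offset_ft,
            start[1] + uy * along + ny * offset_ft)
```

With a 1,000-foot segment running along the x axis, a 100-foot squeeze, and a 50-foot offset, an address at the very start of the range is placed 100 feet in from the end point and 50 feet to the side of the street, rather than in the intersection.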

 

Using point-level matching

 

Point-level matching locates the center of the actual building footprint or parcel. This is the most accurate type of geocode and is used in industries such as internet mapping, flood hazard determination, property and casualty insurance, telecommunications, and utilities.

If you are licensed for the point-level data option, you do not need to execute any additional initialization or setup after you have installed the point-level data. GeoStan automatically processes your address lists through the point-level data.

When processing address lists, GeoStan first searches for a match in the point-level data. If it cannot find an exact match in the point-level data, GeoStan continues searching for a better match in the street network data. GeoStan returns the best match found, with preference given to matches from the point-level dataset.

Note: For users of Master Location Data, additional options are available. See Using Master Location Data.

Using centerline matching

Centerline matching is used with point-level matching to tie a point-level geocode with its parent street segment. This functionality is useful for routing applications.

This provides you with additional data about the parent street segment that is not retrievable using only the point-level match. The output information also includes the bearing and distance from the point data geocode to the centerline match.

Note: Centerline matching requires a license for point-level matching. For more information on point-level matching, see Using point-level matching.

To use centerline matching:

  • Make sure you have loaded point-level data.

If you have loaded point-level data, GeoStan automatically processes your address using the point-level data.

Note: If GeoStan cannot find a match within the point-level data, GeoStan returns data from the centerline of the segment without any offset.

  • Use the GS_FIND_CENTERLINE_OFFSET find property to return a geocode that is offset from the street centerline, such as at the curb or sidewalk. You cannot set the point farther back from the street than the associated point-level geocode.

  • Optionally, use the bearing enum to receive the compass direction of the point data match to the centerline match. Knowing the bearing is useful for routing applications.
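GeoStan returns the bearing and distance itself, so the following is only a sketch of how those two values relate to the point-level geocode and the centerline geocode. It uses the standard great-circle initial bearing and haversine distance formulas; the function name and the Earth radius constant are assumptions chosen for illustration, and GeoStan may compute these values differently.

```python
import math

# Mean Earth radius (about 6,371 km) converted to feet; an assumed constant.
EARTH_RADIUS_FT = 6_371_000 / 0.3048

def bearing_and_distance(lat1, lon1, lat2, lon2):
    """Return (bearing in degrees clockwise from north, distance in feet)
    from point 1 (the point-level geocode) to point 2 (the centerline match)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlon = math.radians(lon2 - lon1)

    # Initial great-circle bearing, normalized to the range [0, 360).
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    bearing = (math.degrees(math.atan2(y, x)) + 360.0) % 360.0

    # Haversine distance.
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2.0) ** 2)
    distance = 2.0 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))
    return bearing, distance
```

A centerline point that lies due north of the parcel point yields a bearing of 0 degrees, and one that lies due east yields 90 degrees.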

Understanding address point interpolation

Address point interpolation uses a patented process that improves upon regular street segment interpolation by inserting point data into the interpolation process. To utilize this feature, set GS_FIND_ADDRPOINT_INTERP to True.

Note: See Understanding street-level matching for more information on street segment interpolation.

When an address point User Dictionary or a point GSD is present, more precise address geometry is used for interpolation than what is available by the use of street segments alone.

GeoStan first attempts to find a match using the loaded point data, in priority order. For more information on priority order, see Specifying database search order. If an exact point match is found in the point data, searching ceases and the point match is returned. If an exact point match is not found, GeoStan attempts to find high and low boundary address points to use for address point interpolation.

Note: This feature does not work with point addresses in auxiliary files.

The following example illustrates this feature. The input house number is 71, and the point GSD contains address points for 67 and 77.

The street segment ranges from 11 to 501. The street segment contains shape lines describing the actual layout of the street.

GeoStan attempts to map the points for addresses 67 and 77 onto the closest shape line. After finding a point on the centerline of the street, GeoStan then performs the interpolation for the input house number 71 with the new street centerline points of 67 and 77.

Without this feature, GeoStan performs an interpolation with the street segment end points of 11 and 501. This creates a far less accurate result (labeled in the diagram) than using the centerline points of the closest surrounding high and low address points.
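The difference between the two approaches can be worked through numerically. The house numbers below (71, 67, 77, 11, and 501) come from the example; the segment length and the centerline distances for address points 67 and 77 are hypothetical values chosen only to make the arithmetic concrete, and the helper function is an illustrative sketch rather than GeoStan's algorithm.

```python
def interpolate(house, low_house, high_house, low_pos_ft, high_pos_ft):
    """Linearly interpolate a position, in feet along the street centerline."""
    fraction = (house - low_house) / (high_house - low_house)
    return low_pos_ft + fraction * (high_pos_ft - low_pos_ft)

# Hypothetical geometry: a 1,000-foot segment on which the address points
# for 67 and 77 project onto the centerline at 300 and 340 feet.
SEGMENT_LENGTH_FT = 1000.0
POS_67_FT, POS_77_FT = 300.0, 340.0

# Address point interpolation: bounded by the projected points for 67 and 77.
with_points = interpolate(71, 67, 77, POS_67_FT, POS_77_FT)

# Regular interpolation: bounded by the segment end points, 11 and 501.
without_points = interpolate(71, 11, 501, 0.0, SEGMENT_LENGTH_FT)
```

Under these assumed distances, house number 71 is placed about 316 feet along the segment when bounded by the two address points, but only about 122 feet along it when interpolated across the full 11 to 501 range.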

 

Occasionally, the boundary address points found have the same parity but are on opposite sides of the street. In that case, address point interpolation uses information from the matched street segment to determine on which side of the street the address should be.

Understanding street-level matching

 

Street matching identifies the approximate location of an address on a street segment. In street matching, the location is determined by calculating the approximate location of a house number based on the range of numbers in the location’s street, a process referred to as interpolation. For example, if the address is on a street segment with a range of addresses from 50 to 99, then it is assumed that the house number 75 would be in the middle of the street segment. This method assumes that the addresses are evenly spaced along the street segment. As a result, it is not as exact as point matching since addresses may not be evenly distributed along a street segment.
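The interpolation described above amounts to a simple proportion. The following sketch applies it to the 50 to 99 example; the function name and the segment end point coordinates are hypothetical, and a real street segment may follow shape lines rather than a single straight line.

```python
def interpolate_on_segment(house, range_low, range_high, start, end):
    """Return (fraction, (lat, lon)) for a house number on a straight segment.

    start, end: (lat, lon) of the segment end points (hypothetical values).
    """
    if range_high == range_low:
        fraction = 0.5  # single-address range: assume the midpoint
    else:
        fraction = (house - range_low) / (range_high - range_low)
    lat = start[0] + fraction * (end[0] - start[0])
    lon = start[1] + fraction * (end[1] - start[1])
    return fraction, (lat, lon)

# House number 75 on a segment whose address range is 50 to 99.
fraction, point = interpolate_on_segment(
    75, 50, 99, start=(40.0, -105.0), end=(40.0, -104.99))
```

Here the fraction is 25/49, or roughly 0.51, so the geocode falls just past the middle of the segment regardless of where the building actually sits.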

For example, the following diagram shows the results of street-level matching along a segment with unevenly spaced buildings. The first three buildings are fairly accurately geocoded because they are evenly spaced. The fourth building, however, resides on a slightly larger parcel than the others along this street. Since street-level matching assumes that the buildings are evenly spaced, the geocodes for the fourth, fifth, and sixth houses are not as precise as those for the first three. If you were to use point-level geocoding, or address point interpolation, the results would be more accurate.

 

Understanding segment parity and direction

The segment parity (GS_SEGMENT_PARITY) and segment direction (GS_SEGMENT_DIRECTION) are data that can be provided when an address is matched to a street segment. In the C API, the GsDataGet function, or in the case of a multimatch candidate, GsMultipleGet, returns the address lookup data including these two segment properties. The segment parity indicates on which side of the street the odd-numbered houses are located. The segment direction tells you whether the house numbers increase or decrease from the segment’s starting latitude/longitude coordinates.

The following diagram illustrates the concepts of segment parity and direction. Both of these segment properties are defined from the viewpoint of the segment’s starting latitude/longitude coordinates shown on the left side of the illustration. The definitions for segment parity and direction follow:

Segment parity

The segment parity describes which side of the street has the odd house numbers relative to the segment’s starting latitude/longitude coordinates, as follows:

  • L/Left/1 = odd house numbers are on the left side of the street and even house numbers are on the right.

  • R/Right/2 = odd house numbers are on the right side of the street and even house numbers are on the left.

  • B/Both/0 = odd and even house numbers are on both the left and the right sides of the street (interpolated point is always on the segment).

Segment direction

The segment direction describes if the house numbers increase or decrease from the starting latitude/longitude coordinates to the ending latitude/longitude coordinates of the segment.

  • F/Forward/1 = house numbers increase.

  • R/Reverse/0 = house numbers decrease.
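As an illustration of how the two properties can be combined, the sketch below derives the side of the street and the position along the segment for a house number. It assumes the single-letter forms of the values listed above (L, R, B for parity; F, R for direction); the function names are hypothetical and are not part of the GeoStan API.

```python
def side_of_street(house_number, parity):
    """Return 'L' or 'R' relative to the segment's starting coordinates,
    or 'B' when the side cannot be inferred from the house number."""
    if parity == "B":
        return "B"  # odd and even numbers appear on both sides
    is_odd = house_number % 2 == 1
    if parity == "L":   # odd numbers on the left, even on the right
        return "L" if is_odd else "R"
    if parity == "R":   # odd numbers on the right, even on the left
        return "R" if is_odd else "L"
    raise ValueError(f"unknown segment parity: {parity!r}")

def fraction_from_start(house_number, range_low, range_high, direction):
    """Fraction of the way from the segment's starting coordinates to its
    ending coordinates, assuming evenly spaced house numbers."""
    f = (house_number - range_low) / (range_high - range_low)
    if direction == "F":   # numbers increase away from the start
        return f
    if direction == "R":   # numbers decrease away from the start
        return 1.0 - f
    raise ValueError(f"unknown segment direction: {direction!r}")
```

For a segment with parity L, house number 71 falls on the left side and 72 on the right. With a range of 50 to 100, house number 60 is 20 percent of the way along a Forward segment but 80 percent of the way along a Reverse segment.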

Understanding street locator geocoding

Street locator geocoding is an optional feature, enabled using GS_FIND_STREET_CENTROID. When enabled, if an input street address cannot be found using the street number and name, GeoStan then searches the input ZIP Code or city/state for the closest match. If GeoStan is able to locate the street, it returns a geocode along the matched street segment rather than the geocode for the entered ZIP Code or ZIP + 4.

If street locator is enabled, and the input address is 5000 Walnut Street, Boulder, CO 80301, and there is no 5000 Walnut Street, GeoStan searches for the closest match to that address within the input ZIP Code. If there is no input ZIP Code, GeoStan searches for the closest match to the input address within Boulder, CO.

If the input address is Walnut Street, Boulder, CO 80301, since there is no street number, GeoStan then searches for that street within the input ZIP Code. As with the previous example, if there is no input ZIP Code, then GeoStan searches within Boulder, CO for the closest match.

When using street locator geocoding, if no exact matching house number is found, GsDataGet returns a match code of either E029 (no matching range, single street segment found) or E030 (no matching range, multiple street segments found). The GeoStan find call returns GS_ADDRESS_NOT_FOUND for E029 and GS_ADDRESS_NOT_RESOLVED for E030. For example, if you enter Main St and there are both an E Main St and a W Main St within the input ZIP Code, GeoStan returns E030 and the location code reflects the input ZIP Code. When the match is to a single street segment, indicated by E029, the location code returned begins with a 'C'.

GeoStan does not change the street name on the output address. In order to access the street name that corresponds to the geocode which was returned, the GsMultipleGet function must be called. Street range information for the matched street or streets is returned by GsMultipleGet, and the match codes returned indicate the changes made to the street name to make the match (as with regular match codes). For more information regarding the match and location codes associated with this feature, see Status codes overview and Street centroid location codes.

Note: This option is not available in CASS mode.

Understanding ZIP Code centroid matching

 

A ZIP Code centroid match returns the center point of an area defined by either a ZIP Code or a ZIP + 4, and is the least accurate type of geocode. A ZIP Code centroid is the center of a ZIP Code; a ZIP + 4 centroid is the center of a ZIP + 4. Since a ZIP + 4 represents a smaller area than a ZIP Code, a ZIP + 4 centroid is more accurate than a ZIP Code centroid.

The following diagram illustrates centroid matching. In this example, all six houses would have the same ZIP Code geocode because they all reside in the same ZIP Code. The four houses located in the dashed area of the diagram would have the same ZIP + 4 centroid returned, whereas the two houses outside the dashed area would not, since they do not reside in that ZIP + 4 area.

 

Matching to a geographic centroid

If the input address includes a valid combination of city, county, and state (but no further address information), you can still geocode to the city, county, or state centroid. Geographic centroid geocoding is less precise than street or postal geocoding, but may be suitable for certain applications. Geographic centroid geocoding can be accomplished using GsFindGeographicFirst and GsFindGeographicNext if GeoStan cannot match a record to the level of precision you originally requested (such as street level).

For geographic geocoding, GeoStan returns the most precise geographic centroid that it can, based on the user input. It is recommended that you use the Last-line lookup functions of GeoStan to standardize and parse the address elements before using GsFindGeographicFirst and GsFindGeographicNext. These two functions provide only basic fuzzy matching on the input values and do not perform the comprehensive matching and parsing achieved through the Last-line lookup functions.

The following table shows some examples of address input and the best possible geographic centroid candidate. For more information on the location codes listed, see Geographic centroid location codes.

Input address: Troy, Rensselaer, NY (Valid City, Valid County, Valid State)
Geocoded to (Location code): City (GM). City, County, and State are all valid.

Input address: San Diego, Albany, NY (Invalid City, Valid County, Valid State)
Geocoded to (Location code): County (GC). There is no city of San Diego in NY. However, there is an Albany County in NY, so County is the best possible match.

Input address: Phoenix, Maricopa, NY (Invalid City, Invalid County, Valid State)
Geocoded to (Location code): State (GS). There is no city of Phoenix in NY, nor is there a Maricopa County in NY, so State is the best possible match.

Input address: Albany, Saratoga, NY (Valid City, Invalid County, Valid State)
Geocoded to (Location code): City (GM). There is a city of Albany in NY. There is a Saratoga County in NY, but Albany is not in that county. However, the city match is still made.

Input address: Chicago (Valid Major City)
Geocoded to (Location code): City (GM). Approximately 300 U.S. cities are recognized as major cities and can be geocoded to the city centroid with no other information provided.

Input address: Albany, NY (Valid City, No County, Valid State)
Geocoded to (Location code): Seven matched candidates. The city of Albany, NY is a close match (GM). Five instances of cities in New York that "sound like" Albany (such as Albion) are non-close GM matches. The state centroid (GS) is also a non-close match.

Input address: Iberia, New Iberia (Valid City, Valid County, No State)
Geocoded to (Location code): No matched candidates. Since the state (LA) is missing, only a city match can be attempted. Since Iberia is not recognized as one of the Major U.S. Cities, no match is possible.

Input address: St. Louis, NY (Invalid City, No County, Valid State)
Geocoded to (Location code): Three matched candidates. The state centroid (GS) is a close match and two city centroids (GM) are non-close "sound like" matches in New York State. If the input contains a state, all matches must be within that state.

Input address: Otsego, NY (No City, Valid County, Valid State)
Geocoded to (Location code): Two matched candidates. The county centroid (GC) is a close match and the state centroid (GS) is a non-close match.

Input address: Otsego (No City, Valid County, No State)
Geocoded to (Location code): No matched candidates. A county name alone is not enough for a match.

Input address: Saco (Valid City, No County, No State)
Geocoded to (Location code): No matched candidates. A city name alone is not enough for a match, except if it is one of the recognized Major U.S. Cities.
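The pattern in these examples can be summarized as a small decision rule. The sketch below is inferred only from the examples above and models just the closest candidate; it ignores the non-close "sound like" candidates, treats an invalid state the same as a missing one (a case the examples do not cover), and uses a hypothetical function name and parameters rather than anything from the GeoStan API.

```python
def best_geographic_centroid(city_valid, county_valid, state_valid,
                             is_major_city=False):
    """Return the location code of the closest candidate: 'GM' (city),
    'GC' (county), 'GS' (state), or None when no match is possible.

    Each *_valid argument is True (valid), False (invalid), or None (not supplied).
    """
    if not state_valid:
        # Without a state, only a recognized major city can be matched.
        return "GM" if (city_valid and is_major_city) else None
    if city_valid:
        return "GM"   # the city match is made even if the county is wrong
    if county_valid:
        return "GC"
    return "GS"       # a valid state is the least precise fallback
```

For example, the input Albany, Saratoga, NY (valid city, invalid county, valid state) resolves to GM, while Otsego alone (valid county, no state) resolves to no match.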