Collapse duplicate features in the data

Reference data can be formatted to contain duplicate features that represent the same location, but with different attributes, as a way of creating a locator that supports alternate names. This is illustrated in the data below, in which 12725 Yosemite Blvd, Waterford and 12725 CA-132, Waterford have the same geometry but different values in the FullStreetName field.

PointAddress attribute table with duplicate features for the same location with different names

The recommended method for creating a locator that supports alternate names for features is to add the alternate values to a table and use an alternate name table role that corresponds to the primary locator role. However, if duplicate features exist in the reference data, the Create Locator tool can create the alternate values from them and exclude the duplicate geometries when the locator is built. For duplicate geometries to be removed, the primary reference data must contain a field with an ID that connects the duplicate features that share a location, and this ID field must be mapped to a Feature ID field from the locator role, such as POINT_ADDRESS_ID. Removing duplicate geometries reduces the size of the locator and removes excessive tied candidates from geocoding results.

PointAddress attribute table with POINT_ADDRESS_ID field to link duplicate features for the same location

The Create Locator tool uses the values mapped to the Feature ID field to skip all duplicate geometries, except the first geometry that is encountered, which is stored in the locator. The alternate attribute values are created based on the matching IDs of the duplicate features.
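The collapsing behavior described above can be pictured with a small, plain-Python sketch. This is not how the Create Locator tool is implemented; it only mimics the described outcome (first geometry kept per ID, other names stored as alternates). The ID values, coordinates, and the `collapse_by_feature_id` function name are hypothetical and chosen for illustration.

```python
# Hypothetical rows: (POINT_ADDRESS_ID, geometry as (x, y), FullStreetName).
# IDs and coordinates are made up for illustration only.
rows = [
    (1001, (-120.75, 37.64), "Yosemite Blvd"),
    (1001, (-120.75, 37.64), "CA-132"),
    (1002, (-120.80, 37.66), "Main St"),
]


def collapse_by_feature_id(rows):
    """Keep the first geometry seen per ID; collect other names as alternates."""
    collapsed = {}
    for feature_id, geometry, name in rows:
        if feature_id not in collapsed:
            # First feature encountered for this ID: its geometry is stored.
            collapsed[feature_id] = {
                "geometry": geometry,
                "primary_name": name,
                "alternate_names": [],
            }
        else:
            # Later features with the same ID: geometry is skipped,
            # and only a differing attribute value is kept as an alternate.
            entry = collapsed[feature_id]
            if name != entry["primary_name"] and name not in entry["alternate_names"]:
                entry["alternate_names"].append(name)
    return collapsed


collapsed = collapse_by_feature_id(rows)
for feature_id, entry in collapsed.items():
    print(feature_id, entry["primary_name"], entry["alternate_names"])
```

In this sketch, three input rows become two stored geometries, and the second name for ID 1001 survives only as an alternate value.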

POINT_ADDRESS_ID field assigned to the Feature ID locator role field in the Create Locator tool


If the reference data does not include the ID field, it can be added using the Find Identical tool. The Shape field can be used to find duplicates in the primary reference data based on the assumption that they have the same geometry. This procedure does not work in all cases: two separate addresses or places of interest (POIs) can share the same location, and they would be treated as duplicates of each other even though they are distinct features. If the Shape field is used with the Find Identical tool, the output table will contain identical IDs for the duplicate features. The output table can then be joined with the primary reference data, and the locator can be built by assigning the new ID field to the Feature ID locator role field in the Create Locator tool.
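The following plain-Python sketch imitates the idea of grouping features by identical geometry and joining the resulting ID back to the source rows. It does not call the Find Identical tool; the output field names `IN_FID` and `FEAT_SEQ` follow that tool's output table, while the sample features, coordinates, and the `assign_identical_ids` function name are hypothetical. The sketch matches coordinates exactly and ignores any tolerance setting.

```python
# Hypothetical features: (object ID, geometry as (x, y), address).
features = [
    (1, (-120.75, 37.64), "12725 Yosemite Blvd"),
    (2, (-120.75, 37.64), "12725 CA-132"),   # alternate name, same place
    (3, (-120.80, 37.66), "100 Main St"),
    (4, (-120.80, 37.66), "102 Main St"),    # different address, same point
]


def assign_identical_ids(features):
    """Give every feature with the same geometry the same sequence number."""
    seq_by_geometry = {}
    table = []
    for object_id, geometry, _address in features:
        # A new geometry gets the next sequence number; a repeat reuses it.
        seq = seq_by_geometry.setdefault(geometry, len(seq_by_geometry) + 1)
        table.append({"IN_FID": object_id, "FEAT_SEQ": seq})
    return table


identical_table = assign_identical_ids(features)

# Join the sequence number back to the source rows by object ID.
seq_by_fid = {row["IN_FID"]: row["FEAT_SEQ"] for row in identical_table}
joined = [(oid, address, seq_by_fid[oid]) for oid, _geom, address in features]

for row in joined:
    print(row)
```

Features 1 and 2 share an ID, which is the intended result. Features 3 and 4 also share an ID even though they are separate addresses, which is the case where a geometry-only comparison is problematic.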

For example, suppose the point feature class you want to use as primary reference data contains 13 million features, of which 10 million are unique. Mapping the Feature ID field activates the functionality in the Create Locator tool that removes duplicate geometries, and the resulting locator is reduced from 253 MB to 200 MB in size.