Find Identical (Data Management)

Summary

Reports any records in a feature class or table that have identical values in a list of fields, and generates a table listing these identical records. If the field Shape is selected, feature geometries are compared.

The Delete Identical tool can be used to find and delete identical records.

Illustration

Find Identical illustration
In this example, points with the OBJECTIDs of 1, 2, 3, 8, 9, and 10 are spatially coincident (blue highlight). The output table identifies those spatially coincident points that share the same CATEGORY.

Usage

  • Records are identical if values in the selected input fields are the same for those records. The values from multiple fields in the input dataset can be compared. If more than one field is specified, records are matched by the values in the first field, then by the values of the second field, and so on.

  • With feature class or feature layer input, select the field Shape in the Field(s) parameter to compare feature geometries to find identical features by location. The XY Tolerance and Z Tolerance parameters are only valid when Shape is selected as one of the input fields.

    If the Shape field is selected and the input features have M or Z values enabled, then the M or Z values are also used to determine identical features.

  • Check the Output only duplicated records parameter if you want only the duplicated records in the output table. If this parameter is unchecked (the default), the output will have the same number of records as the input dataset.

  • The output table will contain two fields: IN_FID and FEAT_SEQ.

    • The IN_FID field can be used to join the records of the output table back to the input dataset.
    • Identical records have the same FEAT_SEQ value, while nonidentical records have sequential values. FEAT_SEQ values have no relationship to the IDs of input records.
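
The matching rule and the two output fields can be illustrated without ArcGIS. The following is a minimal pure-Python sketch, not the tool's implementation: the helper name assign_feat_seq and the sample records are hypothetical, and numbering groups in order of first appearance is an assumption, so real FEAT_SEQ values may be assigned in a different order. It only shows that records with the same values in the compared fields share a FEAT_SEQ value, and that IN_FID points back to the input record.

```python
def assign_feat_seq(records, fields):
    """Return (IN_FID, FEAT_SEQ) pairs for a list of dict records.

    Illustrative only: records with equal values in `fields` share a
    FEAT_SEQ value; groups are numbered by first appearance (an assumption).
    """
    seq_by_key = {}
    output = []
    for rec in records:
        # The compared fields together form the identity of a record.
        key = tuple(rec[name] for name in fields)
        if key not in seq_by_key:
            seq_by_key[key] = len(seq_by_key) + 1
        output.append((rec["OBJECTID"], seq_by_key[key]))
    return output


# Hypothetical input rows standing in for a table.
records = [
    {"OBJECTID": 1, "ZONE": "A", "INTENSITY": 3},
    {"OBJECTID": 2, "ZONE": "B", "INTENSITY": 1},
    {"OBJECTID": 3, "ZONE": "A", "INTENSITY": 3},
    {"OBJECTID": 4, "ZONE": "A", "INTENSITY": 2},
]

pairs = assign_feat_seq(records, ["ZONE", "INTENSITY"])
print(pairs)
```

Here OBJECTIDs 1 and 3 match on both ZONE and INTENSITY, so they end up with the same FEAT_SEQ value, while OBJECTID 4 matches on ZONE only and forms its own group.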

Syntax

arcpy.management.FindIdentical(in_dataset, out_dataset, fields, {xy_tolerance}, {z_tolerance}, {output_record_option})
Parameter | Explanation | Data Type
in_dataset

The table or feature class for which identical records will be found.

Table View
out_dataset

The output table reporting identical records. The FEAT_SEQ field in the output table will have the same value for identical records.

Table
fields
[fields,...]

The field or fields whose values will be compared to find identical records.

Field
xy_tolerance
(Optional)

The xy tolerance that will be applied to each vertex when evaluating if there is an identical vertex in another feature. This parameter is enabled only when Shape is selected as one of the fields.

Linear Unit
z_tolerance
(Optional)

The Z tolerance that will be applied to each vertex when evaluating if there is an identical vertex in another feature. This parameter is enabled only when Shape is selected as one of the fields.

Double
output_record_option
(Optional)

Choose if you want only duplicated records in the output table.

  • ALL: All input records will have corresponding records in the output table. This is the default.
  • ONLY_DUPLICATES: Only duplicate records will have corresponding records in the output table. The output will be empty if no duplicate is found.
Boolean
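
The difference between the two output_record_option values can be sketched on plain (IN_FID, FEAT_SEQ) pairs. This is an illustration under stated assumptions, not the tool's code: filter_output is a hypothetical helper, the sample pairs are made up, and the sketch keeps the original FEAT_SEQ values because the text above does not say whether the tool renumbers them when only duplicates are written.

```python
from collections import Counter


def filter_output(pairs, output_record_option="ALL"):
    """Mimic the effect of output_record_option on (IN_FID, FEAT_SEQ) pairs."""
    if output_record_option == "ALL":
        # Every input record keeps a corresponding output record.
        return list(pairs)
    if output_record_option == "ONLY_DUPLICATES":
        # Keep only records whose FEAT_SEQ value occurs more than once.
        counts = Counter(seq for _, seq in pairs)
        return [(fid, seq) for fid, seq in pairs if counts[seq] > 1]
    raise ValueError("Unknown option: {}".format(output_record_option))


# Hypothetical output records: IN_FID 1 and 3 are identical.
sample_pairs = [(1, 1), (2, 2), (3, 1), (4, 3)]

all_records = filter_output(sample_pairs, "ALL")
only_dups = filter_output(sample_pairs, "ONLY_DUPLICATES")
print(len(all_records), only_dups)
```

With ALL, the result has as many records as the input. With ONLY_DUPLICATES, a dataset without duplicates yields an empty result.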

Code sample

FindIdentical example 1 (Python window)

The following Python window script demonstrates how to use the FindIdentical function in immediate mode.

import arcpy

# Find identical records based on a text field and a numeric field.
arcpy.FindIdentical_management("C:/data/fireincidents.shp", "C:/output/duplicate_incidents.dbf", ["ZONE", "INTENSITY"])
FindIdentical example 2 (stand-alone script)

The following stand-alone script demonstrates how to use the FindIdentical tool to identify duplicate records of a table or feature class.

# Name: FindIdentical_Example2.py
# Description: Finds duplicate features in a dataset based on location (Shape field) and fire intensity

import arcpy

arcpy.env.overwriteOutput = True

# Set workspace environment
arcpy.env.workspace = "C:/data/findidentical.gdb"

# Set input feature class
in_dataset = "fireincidents"

# Set the fields upon which the matches are found
fields = ["Shape", "INTENSITY"]

# Set xy tolerance
xy_tol = ".02 Meters"

out_table = "duplicate_incidents"

# Execute Find Identical 
arcpy.FindIdentical_management(in_dataset, out_table, fields, xy_tol)
print(arcpy.GetMessages())
FindIdentical example 3: Output only duplicate records (stand-alone script)

Demonstrates the use of the optional parameter Output only duplicated records. If this parameter is checked in the tool dialog box, or set to ONLY_DUPLICATES in a script, all unique records are removed from the output, keeping only the duplicates.

# Name: FindIdentical_Example3.py
# Description: Demonstrates the use of the optional parameter Output only duplicated records.

import arcpy

arcpy.env.overwriteOutput = True

# Set workspace environment
arcpy.env.workspace = "C:/data/redlands.gdb"

in_data = "crime"
out_data = "crime_dups"

# Note that the XY Tolerance and Z Tolerance parameters are not used here.
# When optional parameters are skipped, any optional parameter after them
# must be passed by name (as a keyword argument).
arcpy.FindIdentical_management(in_data, out_data, ["Shape"], output_record_option="ONLY_DUPLICATES")

print(arcpy.GetMessages())
FindIdentical example 4: Group identical records by FEAT_SEQ value (stand-alone script)

Reads the output of the FindIdentical tool and groups identical records by FEAT_SEQ value.

import arcpy

from itertools import groupby
from operator import itemgetter

# Set workspace environment
arcpy.env.workspace = r"C:\data\redlands.gdb"

# Run Find Identical on feature geometry only.
result = arcpy.FindIdentical_management("parcels", "parcels_dups", ["Shape"])
    
# List of all output records as [IN_FID, FEAT_SEQ] pairs - a list of lists
out_records = []
with arcpy.da.SearchCursor(result.getOutput(0), ["IN_FID", "FEAT_SEQ"]) as cursor:
    for in_fid, feat_seq in cursor:
        out_records.append([in_fid, feat_seq])

# Sort the output records by FEAT_SEQ values
# Example of out_records = [[3, 1], [5, 3], [1, 1], [4, 3], [2, 2]]
out_records.sort(key = itemgetter(1))
    
# records after sorted by FEAT_SEQ: [[3, 1], [1, 1], [2, 2], [5, 3], [4, 3]]
# records with same FEAT_SEQ value will be in the same group (i.e., identical)
identicals_iter = groupby(out_records, itemgetter(1))
    
# now, make a list of identical groups - each group in a list.
# example identical groups: [[3, 1], [2], [5, 4]]
# i.e., IN_FID 3, 1 are identical, and 5, 4 are identical.
identical_groups = [[item[0] for item in data] for (key, data) in identicals_iter]

print(identical_groups)
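
The same grouping can also be done without sorting, and the groups can then be used to decide which records are redundant. The sketch below is a hedged variation on the example above, not part of the tool: it reuses the sample out_records values from the comments, and the rule of keeping the lowest IN_FID in each group is an assumption chosen only for illustration.

```python
from collections import defaultdict

# Sample [IN_FID, FEAT_SEQ] pairs, as in the comments of the example above.
out_records = [[3, 1], [5, 3], [1, 1], [4, 3], [2, 2]]

# Collect the IN_FID values of each FEAT_SEQ group in a dictionary.
groups = defaultdict(list)
for in_fid, feat_seq in out_records:
    groups[feat_seq].append(in_fid)

# Assumed policy: keep the lowest IN_FID of each group and treat the
# remaining members as redundant.
redundant = sorted(
    fid for fids in groups.values() for fid in sorted(fids)[1:]
)

print(dict(groups))
print(redundant)
```

A dictionary keyed by FEAT_SEQ avoids the sort that groupby requires and makes it easy to look up all members of one group by its FEAT_SEQ value.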

Licensing information

  • Basic: No
  • Standard: No
  • Advanced: Yes
