Detect Objects Using Deep Learning (Image Analyst)—ArcGIS Pro

Available with Image Analyst license.

Summary

Runs a trained deep learning model on an input raster to produce a feature class containing the objects it finds. The features can be bounding boxes or polygons around the objects found or points at the centers of the objects.

This tool requires a model definition file containing trained model information. The model can be trained using the Train Deep Learning Model tool or with third-party training software such as TensorFlow, PyTorch, or Keras. The model definition file can be an Esri model definition JSON file (.emd) or a deep learning model package, and it must contain the path to the Python raster function to be called to process each object and the path to the trained binary deep learning model file.

Usage

You must install the proper deep learning framework Python API (such as TensorFlow, PyTorch, or Keras) in the ArcGIS Pro Python environment; otherwise, an error will occur when you add the Esri model definition file to the tool. Obtain the appropriate framework information from the creator of the Esri model definition file.
To set up your machine to use deep learning frameworks in ArcGIS Pro, see Install deep learning frameworks for ArcGIS.
This tool calls a third-party deep learning Python API (such as TensorFlow, PyTorch, or Keras) and uses the specified Python raster function to process each object.
Sample use cases for this tool are available on the Esri Python raster function GitHub page. You can also write custom Python modules by following examples and instructions in the GitHub repository.
The Esri model definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string, rather than upload the .emd file. The .dlpk file must be stored locally.

See the sample below for the .emd file.

{
    "Framework": "TensorFlow",
    "ModelConfiguration": "ObjectDetectionAPI",
    "ModelFile": ".\\CoconutTreeDetection.model",
    "ModelType": "ObjectDetection",
    "ImageHeight": 850,
    "ImageWidth": 850,
    "ExtractBands": [0, 1, 2],
    "ImageSpaceUsed": "MAP_SPACE",
    "Classes": [
        {
            "Value": 0,
            "Name": "CoconutTree",
            "Color": [0, 255, 0]
        }
    ]
}
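Because an .emd file is plain JSON, it can be checked with the Python standard library before it is passed to the tool. The following is a minimal sketch, not part of the tool itself: the helper names check_emd and emd_to_json_string are hypothetical, treating every key from the sample as required is an assumption made for illustration, and SAMPLE_EMD is a shortened stand-in for a real file. The second helper shows one way to produce a single-line JSON string for cases where pasting a string is preferred over uploading the file.

```python
import json

# Keys taken from the sample .emd; treating all of them as required is an
# assumption made for this sketch, not a documented rule.
EXPECTED_KEYS = {"Framework", "ModelConfiguration", "ModelFile", "ModelType",
                 "ImageHeight", "ImageWidth", "ExtractBands", "Classes"}
# The two values the documentation lists for ImageSpaceUsed.
ALLOWED_IMAGE_SPACES = {"MAP_SPACE", "PIXEL_SPACE"}

# Raw string so that the JSON escape sequence \\ reaches the parser unchanged.
SAMPLE_EMD = r"""
{
    "Framework": "TensorFlow",
    "ModelConfiguration": "ObjectDetectionAPI",
    "ModelFile": ".\\CoconutTreeDetection.model",
    "ModelType": "ObjectDetection",
    "ImageHeight": 850,
    "ImageWidth": 850,
    "ExtractBands": [0, 1, 2],
    "ImageSpaceUsed": "MAP_SPACE",
    "Classes": [{"Value": 0, "Name": "CoconutTree", "Color": [0, 255, 0]}]
}
"""


def check_emd(emd_text):
    """Parse .emd text and return a list of problems (empty if none found)."""
    try:
        emd = json.loads(emd_text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg} (line {err.lineno})"]
    if not isinstance(emd, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing key: {key}"
                for key in sorted(EXPECTED_KEYS - set(emd))]
    space = emd.get("ImageSpaceUsed")
    if space is not None and space not in ALLOWED_IMAGE_SPACES:
        problems.append(f"unexpected ImageSpaceUsed: {space}")
    return problems


def emd_to_json_string(emd_text):
    """Return the .emd content as a compact, single-line JSON string."""
    return json.dumps(json.loads(emd_text), separators=(",", ":"))


if __name__ == "__main__":
    print(check_emd(SAMPLE_EMD))
    print(emd_to_json_string(SAMPLE_EMD))
```

A syntax slip such as a missing comma between two members is reported as an invalid JSON problem with the line number where parsing stopped.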

Use the Non Maximum Suppression parameter to identify and remove duplicate features from the object detection.
The tool can process input imagery that is in map space or in pixel space. Imagery in map space is in a map-based coordinate system. Imagery in pixel space is in raw image space with no rotation and no distortion. The reference system can be specified when generating the training data in the Export Training Data For Deep Learning tool using the Reference System parameter. If the model is trained in third-party training software, the reference system must be specified in the .emd file using the ImageSpaceUsed parameter, which can be set to MAP_SPACE or PIXEL_SPACE.
The input raster can be a single raster, multiple rasters, or a feature class with images attached. For more information on attachments, see Add or remove file attachments.
For information about requirements for running this tool and issues you may encounter, see the Deep Learning Frequently Asked Questions.
For more information about deep learning, see Deep learning in ArcGIS Pro.

Syntax

DetectObjectsUsingDeepLearning(in_raster, out_detected_objects, in_model_definition, {arguments}, {run_nms}, {confidence_score_field}, {class_value_field}, {max_overlap_ratio}, {processing_mode})

Parameter	Explanation	Data Type
in_raster	The input image used to detect objects. The input can be a single raster or multiple rasters in a mosaic dataset, image service, or folder of images. A feature class with image attachments is also supported.	Raster Dataset; Raster Layer; Mosaic Layer; Image Service; MapServer; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class
out_detected_objects	The output feature class that will contain geometries marking the object or objects detected in the input image.	Feature Class
in_model_definition	The in_model_definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string, rather than upload the .emd file. The .dlpk file must be stored locally. The model definition contains the path to the deep learning binary model file, the path to the Python raster function to be used, and other parameters such as preferred tile size or padding.	File; String
arguments [arguments,...] (Optional)	The function arguments defined in the Python raster function class. This is where you list additional deep learning parameters and arguments for experiments and refinement, such as a confidence threshold for adjusting the sensitivity. The names of the arguments are populated by the tool from reading the Python module.	Value Table
run_nms (Optional)	Specifies whether nonmaximum suppression will be performed, in which duplicate objects are identified and the duplicate features with lower confidence value are removed. NO_NMS —Nonmaximum suppression is not performed. All objects that are detected will be in the output feature class. This is the default. NMS —Nonmaximum suppression is performed, and duplicate objects that are detected will be removed.	Boolean
confidence_score_field (Optional)	The name of the field in the feature class that contains the confidence scores as output by the object detection method. This parameter is required when the NMS keyword for the run_nms parameter is used.	String
class_value_field (Optional)	The name of the class value field in the input feature class. If a field name is not specified, a Classvalue or Value field will be used. If these fields do not exist, all records will be identified as belonging to one class.	String
max_overlap_ratio (Optional)	The maximum overlap ratio for two overlapping features, which is defined as the ratio of intersection area over union area. The default is 0.	Double
processing_mode (Optional)	Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service. PROCESS_AS_MOSAICKED_IMAGE —All raster items in the mosaic dataset or image service will be mosaicked together and processed. This is the default. PROCESS_ITEMS_SEPARATELY —All raster items in the mosaic dataset or image service will be processed as separate images.	String
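The max_overlap_ratio parameter is defined above as intersection area over union area. To make that ratio concrete, the following sketch computes it for axis-aligned boxes and applies a simple greedy suppression pass. It illustrates the general technique only: the function names iou and greedy_nms are hypothetical, the box format (xmin, ymin, xmax, ymax) is an assumption, and the tool's internal implementation (for example, how ties or class values are handled) may differ.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(detections, max_overlap_ratio=0.0):
    """Keep the highest-confidence boxes first.

    detections is a list of (box, confidence) pairs. A box is dropped when its
    IoU with any already-kept box exceeds max_overlap_ratio (an assumed rule
    for this sketch).
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= max_overlap_ratio for kept_box, _ in kept):
            kept.append((box, score))
    return kept


if __name__ == "__main__":
    detections = [
        ((0, 0, 2, 2), 0.9),      # overlaps the next box
        ((1, 0, 3, 2), 0.6),      # lower-confidence duplicate candidate
        ((10, 10, 12, 12), 0.8),  # separate object
    ]
    print(iou(detections[0][0], detections[1][0]))
    print(greedy_nms(detections, max_overlap_ratio=0.0))
    print(greedy_nms(detections, max_overlap_ratio=0.5))
```

In this sketch, a ratio of 0 removes any lower-confidence box that overlaps a kept box at all, while a larger ratio tolerates more overlap before a box is treated as a duplicate.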

Code sample

DetectObjectsUsingDeepLearning example 1 (Python window)

This example creates a feature class based on object detection.

# Import system modules
import arcpy
from arcpy.ia import *

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

DetectObjectsUsingDeepLearning("c:/detectobjects/moncton_seg.tif", 
     "c:/detectobjects/moncton_seg.shp", "c:/detectobjects/moncton.emd", 
     "padding 0; threshold 0.5; batch_size 4", "NO_NMS", "Confidence", 
     "Class", 0, "PROCESS_AS_MOSAICKED_IMAGE")

DetectObjectsUsingDeepLearning example 2 (stand-alone script)

This example creates a feature class based on object detection.

# Import system modules
import arcpy
from arcpy.ia import *

"""
Usage: DetectObjectsUsingDeepLearning( in_raster, out_detected_objects, 
       in_model_definition, {arguments}, {run_nms}, {confidence_score_field}, 
       {class_value_field}, {max_overlap_ratio}, {processing_mode})
"""

# Set local variables
in_raster = "c:/classifydata/moncton_seg.tif"
out_detected_objects = "c:/detectobjects/moncton.shp"
in_model_definition = "c:/detectobjects/moncton_sig.emd"
model_arguments = "padding 0; threshold 0.5; batch_size 4"
run_nms = "NO_NMS"
confidence_score_field = "Confidence"
class_value_field = "Class"
max_overlap_ratio = 0
processing_mode = "PROCESS_AS_MOSAICKED_IMAGE"

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

# Execute 
DetectObjectsUsingDeepLearning( in_raster, out_detected_objects, 
   in_model_definition, model_arguments, run_nms, confidence_score_field, 
   class_value_field, max_overlap_ratio, processing_mode)
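Both samples pass the arguments value as a string of name and value pairs separated by semicolons. When the values come from variables, it can be less error-prone to assemble that string from a dictionary. The sketch below assumes the same "name value; name value" layout used in the samples; build_arguments is a hypothetical helper, not an arcpy function, and the valid argument names depend on the Python raster function referenced by the model definition.

```python
def build_arguments(params):
    """Join name/value pairs into a 'name value; name value' string."""
    return "; ".join(f"{name} {value}" for name, value in params.items())


if __name__ == "__main__":
    # Argument names copied from the samples; other models may expose others.
    model_arguments = build_arguments(
        {"padding": 0, "threshold": 0.5, "batch_size": 4}
    )
    print(model_arguments)  # padding 0; threshold 0.5; batch_size 4
```

The resulting string can be passed as the arguments parameter in place of the hard-coded literal.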

Environments

Cell Size, Current Workspace, Extent, Geographic Transformations, GPU ID, Output Coordinate System, Parallel Processing Factor, Processor Type, Scratch Workspace

Licensing information

Basic: Requires Image Analyst
Standard: Requires Image Analyst
Advanced: Requires Image Analyst