Detect Objects Using Deep Learning (Image Analyst)—ArcGIS Pro

Available with Image Analyst license.

Summary

Runs a trained deep learning model on an input raster to produce a feature class containing the objects it finds. The features can be bounding boxes or polygons around the objects found or points at the centers of the objects.

This tool requires a model definition file containing trained model information. The model can be trained using the Train Deep Learning Model tool or with third-party training software such as TensorFlow, PyTorch, or Keras. The model definition file can be an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk), and it must contain the path to the Python raster function to be called to process each object and the path to the trained binary deep learning model file.

Usage

You must install the proper deep learning framework Python API (such as TensorFlow, PyTorch, or Keras) in the ArcGIS Pro Python environment; otherwise, an error will occur when you add the Esri model definition file to the tool. Obtain the appropriate framework information from the creator of the Esri model definition file.
To set up your machine to use deep learning frameworks in ArcGIS Pro, see Install deep learning frameworks for ArcGIS.
This tool calls a third-party deep learning Python API (such as TensorFlow, PyTorch, or Keras) and uses the specified Python raster function to process each object.
Sample use cases for this tool are available on the Esri Python raster function GitHub page. You can also write custom Python modules by following examples and instructions in the GitHub repository.
The Model Definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally.

See the sample below for the .emd file.

{
    "Framework": "TensorFlow",
    "ModelConfiguration": "ObjectDetectionAPI",
    "ModelFile": ".\\CoconutTreeDetection.model",
    "ModelType": "ObjectDetection",
    "ImageHeight": 850,
    "ImageWidth": 850,
    "ExtractBands": [0, 1, 2],
    "ImageSpaceUsed": "MAP_SPACE",
    "Classes": [
        {
            "Value": 0,
            "Name": "CoconutTree",
            "Color": [0, 255, 0]
        }
    ]
}
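Because the Model Definition parameter also accepts a JSON string, the sample above can be held as a Python dictionary and serialized when needed. The following is a minimal sketch that uses only the standard json module. The variable names are illustrative, and the field values simply repeat the sample.

```python
import json

# Field values copied from the sample .emd file; variable names are illustrative.
emd = {
    "Framework": "TensorFlow",
    "ModelConfiguration": "ObjectDetectionAPI",
    "ModelFile": ".\\CoconutTreeDetection.model",
    "ModelType": "ObjectDetection",
    "ImageHeight": 850,
    "ImageWidth": 850,
    "ExtractBands": [0, 1, 2],
    "ImageSpaceUsed": "MAP_SPACE",
    "Classes": [{"Value": 0, "Name": "CoconutTree", "Color": [0, 255, 0]}],
}

# Serialize to a JSON string, for example to paste into the
# Model Definition parameter instead of uploading the .emd file.
emd_json_string = json.dumps(emd)

# Round-trip check: the string parses back to the same content.
assert json.loads(emd_json_string) == emd
```

Serializing with json.dumps also guards against hand-editing mistakes such as a missing comma or an unescaped backslash in the ModelFile path.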

The tool can process input imagery that is in map space or in pixel space. Imagery in map space is in a map-based coordinate system. Imagery in pixel space is in raw image space with no rotation and no distortion. The reference system can be specified when generating the training data in the Export Training Data For Deep Learning tool using the Reference System parameter. If the model is trained with third-party training software, the reference system must be specified in the .emd file using the ImageSpaceUsed parameter, which can be set to MAP_SPACE or PIXEL_SPACE.
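For a model trained outside ArcGIS, it can help to confirm that the .emd content declares one of the two allowed values before running the tool. The following is a minimal sketch. The function name image_space_from_emd is hypothetical and not part of arcpy, and how the tool itself reacts to a missing or invalid value is not covered here.

```python
import json

# The two values named in the text for the ImageSpaceUsed parameter.
VALID_IMAGE_SPACES = {"MAP_SPACE", "PIXEL_SPACE"}


def image_space_from_emd(emd_text):
    """Return the ImageSpaceUsed value from .emd JSON text, or raise ValueError.

    Hypothetical helper for illustration; not an arcpy function.
    """
    emd = json.loads(emd_text)
    space = emd.get("ImageSpaceUsed")
    if space not in VALID_IMAGE_SPACES:
        raise ValueError(
            f"ImageSpaceUsed must be one of {sorted(VALID_IMAGE_SPACES)}, got {space!r}"
        )
    return space


print(image_space_from_emd('{"ImageSpaceUsed": "PIXEL_SPACE"}'))
```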
Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used. If an out of memory error occurs, use a smaller batch size. The batch_size value can be adjusted using the Arguments parameter.
Batch sizes are perfect squares, such as 1, 4, 9, 16, 25, 64, and so on. If the input value is not a perfect square, the largest perfect square that does not exceed it is used. For example, if a value of 6 is specified, the batch size is set to 4.
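The rounding rule for batch sizes can be reproduced with integer square roots. The following is a sketch of the rule as described, not the tool's internal code. The function name effective_batch_size is hypothetical, and rejecting values below 1 is an assumption.

```python
import math


def effective_batch_size(requested):
    """Largest perfect square that does not exceed the requested value.

    Hypothetical helper; rejecting values below 1 is an assumption.
    """
    if requested < 1:
        raise ValueError("batch size must be at least 1")
    return math.isqrt(requested) ** 2


for requested in (1, 6, 16, 30):
    print(requested, "->", effective_batch_size(requested))
```

For the values above, this prints 1 -> 1, 6 -> 4, 16 -> 16, and 30 -> 25.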
Use the Non Maximum Suppression parameter to identify and remove duplicate features from the object detection.
The input raster can be a single raster, multiple rasters, or a feature class with images attached. For more information about attachments, see Add or remove file attachments.
For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.
For more information about deep learning, see Deep learning in ArcGIS Pro.

Parameters

Label	Explanation	Data Type
Input Raster	The input image used to detect objects. The input can be a single raster or multiple rasters in a mosaic dataset, image service, or folder of images. A feature class with image attachments is also supported.	Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class
Output Detected Objects	The output feature class that will contain geometries circling the object or objects detected in the input image.	Feature Class
Model Definition	This parameter can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally. It contains the path to the deep learning binary model file, the path to the Python raster function to be used, and other parameters such as preferred tile size or padding.	File; String
Arguments (Optional)	The function arguments defined in the Python raster function class. This is where additional deep learning parameters and arguments for experiments and refinement are listed, such as a confidence threshold for adjusting the sensitivity. The names of the arguments are populated from the Python module.	Value Table
Non Maximum Suppression (Optional)	Specifies whether nonmaximum suppression will be performed in which duplicate objects are identified and the duplicate features with lower confidence value are removed. Unchecked—Nonmaximum suppression will not be performed. All objects that are detected will be in the output feature class. This is the default. Checked—Nonmaximum suppression will be performed and duplicate objects that are detected will be removed.	Boolean
Confidence Score Field (Optional)	The name of the field in the feature class that will contain the confidence scores as output by the object detection method. This parameter is required when the Non Maximum Suppression parameter is checked.	String
Class Value Field (Optional)	The name of the class value field in the input feature class. If a field name is not specified, a Classvalue or Value field will be used. If these fields do not exist, all records will be identified as belonging to one class.	String
Max Overlap Ratio (Optional)	The maximum overlap ratio for two overlapping features, which is defined as the ratio of intersection area over union area. The default is 0.	Double
Processing Mode (Optional)	Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service. Process as mosaicked image—All raster items in the mosaic dataset or image service will be mosaicked together and processed. This is the default. Process all raster items separately—All raster items in the mosaic dataset or image service will be processed as separate images.	String

Derived Output

Label	Explanation	Data Type
Output Classified Raster	The output classified raster for pixel classification. The name of the raster dataset will be the same as the Output Detected Objects parameter value. This parameter is only applicable when the model type is Panoptic Segmentation.	Raster Dataset

DetectObjectsUsingDeepLearning(in_raster, out_detected_objects, in_model_definition, {arguments}, {run_nms}, {confidence_score_field}, {class_value_field}, {max_overlap_ratio}, {processing_mode})

Name	Explanation	Data Type
in_raster	The input image used to detect objects. The input can be a single raster or multiple rasters in a mosaic dataset, image service, or folder of images. A feature class with image attachments is also supported.	Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class
out_detected_objects	The output feature class that will contain geometries circling the object or objects detected in the input image.	Feature Class
in_model_definition	The in_model_definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally. It contains the path to the deep learning binary model file, the path to the Python raster function to be used, and other parameters such as preferred tile size or padding.	File; String
arguments [arguments,...] (Optional)	The function arguments defined in the Python raster function class. This is where additional deep learning parameters and arguments for experiments and refinement are listed, such as a confidence threshold for adjusting the sensitivity. The names of the arguments are populated from the Python module.	Value Table
run_nms (Optional)	Specifies whether nonmaximum suppression will be performed in which duplicate objects are identified and the duplicate features with lower confidence value are removed. NO_NMS—Nonmaximum suppression will not be performed. All objects that are detected will be in the output feature class. This is the default. NMS—Nonmaximum suppression will be performed and duplicate objects that are detected will be removed.	Boolean
confidence_score_field (Optional)	The name of the field in the feature class that will contain the confidence scores as output by the object detection method. This parameter is required when the run_nms parameter is set to NMS.	String
class_value_field (Optional)	The name of the class value field in the input feature class. If a field name is not specified, a Classvalue or Value field will be used. If these fields do not exist, all records will be identified as belonging to one class.	String
max_overlap_ratio (Optional)	The maximum overlap ratio for two overlapping features, which is defined as the ratio of intersection area over union area. The default is 0.	Double
processing_mode (Optional)	Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service. PROCESS_AS_MOSAICKED_IMAGE—All raster items in the mosaic dataset or image service will be mosaicked together and processed. This is the default. PROCESS_ITEMS_SEPARATELY—All raster items in the mosaic dataset or image service will be processed as separate images.	String
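To make the run_nms and max_overlap_ratio parameters more concrete, the following sketch computes the intersection-over-union ratio for two axis-aligned boxes and applies a generic greedy nonmaximum suppression. It illustrates the general technique only. The function names are hypothetical, the sketch ignores class values, and whether the tool compares with <= or < at the threshold is an assumption.

```python
def iou(a, b):
    """Intersection area over union area for boxes (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, max_overlap_ratio=0.0):
    """Greedy NMS over a list of (box, confidence) pairs.

    Visits detections from highest to lowest confidence and keeps one only
    if its IoU with every already-kept detection is <= max_overlap_ratio
    (the <= comparison is an assumption).
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) <= max_overlap_ratio for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((0, 0, 10, 10), 0.9),
    ((5, 5, 15, 15), 0.6),    # overlaps the first box, IoU = 25 / 175
    ((20, 20, 30, 30), 0.8),  # overlaps nothing
]
print(len(non_max_suppression(detections, 0.0)))  # 2
print(len(non_max_suppression(detections, 0.2)))  # 3
```

With the default ratio of 0, any overlap at all causes the lower-confidence feature to be dropped. Raising the ratio tolerates partially overlapping detections.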

Derived Output

Name	Explanation	Data Type
out_classified_raster	The output classified raster for pixel classification. The name of the raster dataset will be the same as the out_detected_objects parameter value. This parameter is only applicable when the model type is Panoptic Segmentation.	Raster Dataset

Code sample

DetectObjectsUsingDeepLearning example 1 (Python window)

This example creates a feature class based on object detection.

# Import system modules
import arcpy
from arcpy.ia import *

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

DetectObjectsUsingDeepLearning("c:/detectobjects/moncton_seg.tif", 
     "c:/detectobjects/moncton_seg.shp", "c:/detectobjects/moncton.emd", 
     "padding 0; threshold 0.5; batch_size 4", "NO_NMS", "Confidence", 
     "Class", 0, "PROCESS_AS_MOSAICKED_IMAGE")

DetectObjectsUsingDeepLearning example 2 (stand-alone script)

This example creates a feature class based on object detection.

# Import system modules
import arcpy
from arcpy.ia import *

"""
Usage: DetectObjectsUsingDeepLearning( in_raster, out_detected_objects, 
       in_model_definition, {arguments}, {run_nms}, {confidence_score_field}, 
       {class_value_field}, {max_overlap_ratio}, {processing_mode})
"""

# Set local variables
in_raster = "c:/classifydata/moncton_seg.tif"
out_detected_objects = "c:/detectobjects/moncton.shp"
in_model_definition = "c:/detectobjects/moncton_sig.emd"
model_arguments = "padding 0; threshold 0.5; batch_size 4"
run_nms = "NO_NMS"
confidence_score_field = "Confidence"
class_value_field = "Class"
max_overlap_ratio = 0
processing_mode = "PROCESS_AS_MOSAICKED_IMAGE"

# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")

# Execute
DetectObjectsUsingDeepLearning(in_raster, out_detected_objects,
    in_model_definition, model_arguments, run_nms, confidence_score_field,
    class_value_field, max_overlap_ratio, processing_mode)
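Both examples pass the arguments value as a single string of name and value pairs separated by semicolons. The following sketch builds that string from a dictionary so individual settings are easier to adjust. The helper format_arguments is hypothetical, the format is inferred from the sample string, and which argument names are valid depends on the Python raster function referenced by the model definition.

```python
def format_arguments(arguments):
    """Join name/value pairs as 'name value; name value'.

    Hypothetical helper; the format mirrors the string used in the samples.
    """
    return "; ".join(f"{name} {value}" for name, value in arguments.items())


model_arguments = format_arguments({"padding": 0, "threshold": 0.5, "batch_size": 4})
print(model_arguments)  # padding 0; threshold 0.5; batch_size 4
```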

Environments

Cell Size, Current Workspace, Extent, Geographic Transformations, GPU ID, Mask, Output Coordinate System, Parallel Processing Factor, Processor Type, Scratch Workspace

Licensing information

Basic: Requires Image Analyst
Standard: Requires Image Analyst
Advanced: Requires Image Analyst
