Interactive object detection basics—ArcGIS Pro

Available with Advanced license.

Interactive object detection is used to find objects of interest in imagery displayed in a scene.

Object detection relies on a deep learning model that has been trained to detect specific objects in an image, such as the windows and doors of buildings in a scene. Detection results are automatically saved to a point feature class, with the confidence score, bounding box dimensions, and label name stored as attributes.

The Deep Learning Libraries must be installed before you run this tool.

License:

An ArcGIS Pro Advanced license level is required to perform object detection.

The images below illustrate the object detection result returned with the different symbology options.

Interactive object detection using box symbology

Interactive object detection using location point symbology

Detect objects in a 3D view

The Object Detection tool is available in the Exploratory 3D Analysis drop-down menu in the Workflows group on the Analysis tab. After you select the Object Detection tool, the Exploratory Analysis pane appears. Use the Exploratory Analysis pane to modify or accept the object detection parameters and to choose the camera method used for detection. The first time the tool is run, the model is loaded before the detections are calculated. Additional runs reuse the loaded model and take less time. If you change the model selection, the initial loading time is required again.
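
The loading behavior described above resembles a single-slot cache: the model stays in memory between runs and is reloaded only when a different model is selected. The following is a minimal sketch of that idea, not the tool's actual implementation. The `ModelCache` class and the `loader` callable are hypothetical names introduced for illustration.

```python
class ModelCache:
    """Hypothetical single-slot cache that keeps the last loaded model in memory."""

    def __init__(self, loader):
        self._loader = loader   # callable that takes a model path and returns a model
        self._path = None       # path of the model currently held
        self._model = None
        self.load_count = 0     # number of (slow) loads performed so far

    def get(self, path):
        # Reload only on the first request or when the selected model changes.
        if self._model is None or path != self._path:
            self._model = self._loader(path)
            self._path = path
            self.load_count += 1
        return self._model
```

Under this sketch, repeated calls to `get` with the same path return the cached model without another load, while passing a different path triggers one new load.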

Object detection properties

The properties for object detection are described in the following table:


Option	Description
Model	The deep learning package (.dlpk) to use for detecting objects. The trained model must be a FasterRCNN model. Expand the Model input drop-down arrow and click Download to automatically get the pretrained Esri Windows and Doors model. Optionally, click Browse to choose a local deep learning package or download from ArcGIS Online.
Classes	The list of real-world objects to detect. This list is populated from the .dlpk file. The default is set to All.
Minimum Confidence Level	The minimum detection score a detection must meet. Detections with scores lower than this level are discarded. The default value is 0.5.
Maximum Overlap Threshold	The intersection over union threshold with other detections. If detection results overlap, the one with the highest score is considered a true positive. The default value is 0.
Process using GPU	Use the graphics processing unit (GPU) processing power instead of the central processing unit (CPU) processing power. Recommended if you have a very good graphics card with at least 8 GB of dedicated GPU memory.
Feature layer	The name of the output feature layer. If the layer does not exist, a feature class is created in the project's default geodatabase and added to the current map or scene. If the layer is already in the view and has the required schema, newly detected objects are appended to the existing feature class. If you rerun the tool when the layer is not in the current map or scene, a new uniquely-named feature class is created in the default geodatabase and added to the view.
Description	The description to be included in the attribute table. Multiple detection results can be saved to the same feature layer and a description can be used to differentiate between these multiple detections.
Symbology	Set the returned shape of the output feature layer using the default color of electron gold. The symbology choices are: Location Point—an X marking the centerpoint of the feature. This is the default. Vertical Bounding Box (3D only)—a vertical semi-transparent filled bounding box. Use the vertical bounding box symbology in scenes for deep-learning models that detect vertical objects, such as windows and doors. Horizontal Bounding Box (3D only)—a horizontal semi-transparent filled bounding box. Use the horizontal bounding box symbology in scenes for deep-learning models that detect horizontal objects, such as swimming pools. If the output layer is already in the view and has custom symbology, its symbology is not changed when the tool is run.
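
To illustrate how the Minimum Confidence Level and Maximum Overlap Threshold properties interact, the following is a minimal sketch of confidence filtering followed by intersection over union (IoU) suppression. It is an assumption-based illustration, not the tool's internal code. The `Detection` class, the `(xmin, ymin, xmax, ymax)` box layout, and the choice to compare detections regardless of class are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: tuple  # assumed layout: (xmin, ymin, xmax, ymax) in image pixels


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, min_confidence=0.5, max_overlap=0.0):
    """Drop low-score detections, then keep the highest-scoring one among overlaps."""
    # Discard detections scoring below the minimum confidence level.
    candidates = sorted(
        (d for d in detections if d.score >= min_confidence),
        key=lambda d: d.score,
        reverse=True,
    )
    kept = []
    for d in candidates:
        # Keep a detection only if its IoU with every higher-scoring kept
        # detection does not exceed the overlap threshold.
        if all(iou(d.box, k.box) <= max_overlap for k in kept):
            kept.append(d)
    return kept
```

With the default values shown in the table (0.5 and 0), any detection that overlaps a higher-scoring one at all is discarded in this sketch. Raising the overlap threshold allows partially overlapping detections to remain.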

Object detection methods

The methods for object detection are described in the following table:


Method	Description
Current Camera	This is the default creation method. It uses the current camera position to detect objects.
Reposition Camera (3D only)	Repositions the camera to a horizontal or vertical viewpoint before detecting objects. Set up the area of interest viewpoint and use this to fine-tune the alignment. It is not recommended for positioning the camera on objects in the distance to bring them closer in the view.

Update detection results

To change the output results (for example, to use a different confidence value or choose another area of interest), change those properties and run the Object Detection tool again. Newly detected objects are appended to the same layer. Alternatively, provide a new name to create another output feature layer for comparison. It is not recommended that you manually update the attribute values of object detection results.

Tip:

Before rerunning the tool, turn off layer visibility for the previous detection results. Otherwise, those results may overlap the objects being detected and could affect the new detection results.

Delete detection results

Detection results are added as point features. As such, you can delete individual features using the standard editing workflows. Alternatively, delete the entire feature class from the project's default geodatabase. Removing the layer from the Contents pane does not automatically delete your results, as they still exist in the geodatabase. If you rerun the tool when the layer is not in the current map or scene, a new uniquely-named feature class is created in the default geodatabase and added to the view.