Interactive object detection

Available with Advanced license.

Available with Image Analyst license.

Object detection is used to find objects of interest from imagery displayed in a map or scene.

Object detection relies on a deep learning model that has been trained to detect specific objects, such as windows and doors in buildings in a scene. Detection results are saved to a point feature class with a confidence score, bounding-box dimensions, and the label name as attributes. You can also interactively detect other objects, for example, parked aircraft or airport structures, using a generic model: click a location in the view to detect objects there.

You must install Deep Learning Libraries to use object detection.

License:

The interactive object detection tool requires either an ArcGIS Pro Advanced license or the ArcGIS Image Analyst extension.

The Object Detection tool is on the Exploratory 3D Analysis drop-down menu in the Workflows group on the Analysis tab. After selecting the Object Detection tool, the Exploratory Analysis pane appears.

Use the Exploratory Analysis pane to modify the object detection parameters and choose the camera method used to generate detection results. The first time the tool is run, the Esri Windows and Doors model is used; the model is loaded and the detections are calculated. Additional runs do not reload the model and take less time. If you change the model selection, the new model must be loaded. The Generic Object model does not require a download.

The images below illustrate the object detection result returned with the symbology options available: a box symbology or a location center point X symbol.

Interactive object detection using box symbology

Interactive object detection using location point symbology

Detect objects in a scene

The Object Detection tool can work with any supported model that is trained to detect particular objects. The tool includes a model specific to detecting windows and doors, as well as a generic model for detecting other objects interactively.

The Esri Windows and Doors deep learning model detects windows and doors as point features. The object detection parameters for using the Esri Windows and Doors model are described in the following table:

Object detection parameters for the Windows and Doors model

Model

The deep learning package (.dlpk) to use for detecting objects. The model types supported include FasterRCNN, YOLOv3, Single Shot Detector (SSD), and RetinaNet.

Expand the Model input drop-down arrow and click Download Model to access the pretrained Esri Windows and Doors model. Optionally, click Browse to choose a local deep learning package or download one from ArcGIS Online.

Classes

The list of real-world objects to detect. This list is populated from the .dlpk file. The default is set to All, but you can specify only windows or only doors.

Minimum Confidence Level

The minimum confidence score a detection must meet. Detections with scores below this confidence level are discarded. The default value is 0.5.

Maximum Overlap Threshold

The intersection over union threshold with other detections. If detection results overlap, the one with the highest score is considered a true positive. The default value is 0.
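The Minimum Confidence Level and Maximum Overlap Threshold parameters can be sketched in plain Python. This is an illustration of the standard confidence-filtering and intersection-over-union suppression technique, not the tool's actual implementation: detections below the confidence level are dropped, and when two results overlap beyond the threshold, only the higher-scoring one is kept.

```python
def iou(a, b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def filter_detections(detections, min_confidence=0.5, max_overlap=0.0):
    # detections: list of (box, score) tuples. Drop low-confidence results,
    # then suppress any box whose IoU with a higher-scoring kept box exceeds
    # the overlap threshold; the highest-scoring box wins as the true positive.
    candidates = [d for d in detections if d[1] >= min_confidence]
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        if all(iou(box, k[0]) <= max_overlap for k in kept):
            kept.append((box, score))
    return kept
```

With the defaults (0.5 confidence, 0 overlap), any overlap at all between two surviving detections suppresses the lower-scoring one.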

Process using GPU

Use the graphics processing unit (GPU) instead of the central processing unit (CPU) for processing. This is recommended if you have a graphics card with at least 8 GB of dedicated GPU memory.

Feature layer

The name of the output feature layer.

  • If the layer does not exist, a feature class is created in the project's default geodatabase and added to the current map or scene.
  • If the layer is already in the map or scene and has the required schema, newly detected objects are appended to the existing feature class.
  • If you rerun the tool when the layer is not in the current map or scene, a new uniquely named feature class is created in the default geodatabase and added to the map or scene.
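The three output rules above amount to a simple lookup. The sketch below is illustrative only; the function and argument names (resolve_output_layer, layers_in_view, default_gdb) are hypothetical, not ArcGIS Pro API calls, and the required-schema check is omitted.

```python
def resolve_output_layer(name, layers_in_view, default_gdb):
    # layers_in_view: set of layer names in the current map or scene.
    # default_gdb: set of feature class names in the default geodatabase.
    # Returns (action, target_name) following the documented output rules.
    if name in layers_in_view:
        return ("append", name)           # layer present: append detections
    target = name
    suffix = 1
    while target in default_gdb:          # layer absent but class exists:
        target = f"{name}_{suffix}"       # create a new uniquely named class
        suffix += 1
    return ("create", target)             # otherwise create it fresh
```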

Description

The description to be included in the attribute table. Multiple detection results can be saved to the same feature layer and a description can be used to differentiate between these multiple detections.

Symbology

Set the shape of the returned output feature layer. Symbols use the default color, electron gold. The following are the symbology choices:

  • Location Point—An X marking the center point of the feature. This is the default.
  • Vertical Bounding Box (3D only)—A vertical semitransparent filled bounding box. Use the vertical bounding box symbology in scenes for deep-learning models that detect vertical objects, such as windows and doors.
  • Horizontal Bounding Box (3D only)—A horizontal semitransparent filled bounding box. Use the horizontal bounding box symbology in scenes for deep-learning models that detect horizontal objects, such as swimming pools.

If the output layer is already in the map or scene and has custom symbology, the symbology is not changed when the tool is run.

Maximum Distance

Available in 3D only.

Under the Filter Results heading, set the maximum distance from the camera to which results will be retained. Anything beyond the set depth is ignored.

Width

Under the Filter Results heading, set the minimum and maximum width values for the size of the expected returned result.

Height

Under the Filter Results heading, set the minimum and maximum height values for the size of the expected returned result.
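Taken together, the Filter Results settings discard any detection outside the configured envelope. A plain-Python sketch of that test (not the tool's implementation; the field names are illustrative, and None means a bound is not applied):

```python
def passes_filters(detection, max_distance=None,
                   width_range=(None, None), height_range=(None, None)):
    # detection: dict with 'distance' (from the camera), 'width', and
    # 'height' of the detected object.
    if max_distance is not None and detection["distance"] > max_distance:
        return False  # beyond the set depth: ignored
    lo, hi = width_range
    if lo is not None and detection["width"] < lo:
        return False
    if hi is not None and detection["width"] > hi:
        return False
    lo, hi = height_range
    if lo is not None and detection["height"] < lo:
        return False
    if hi is not None and detection["height"] > hi:
        return False
    return True
```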

The creation methods for object detection are described in the following table:


Current Camera

This is the default creation method. It uses the current camera position to detect objects in the view.

Reposition Camera (Available in 3D scenes only)

Repositions the camera to a horizontal or vertical viewpoint before detecting objects. Click to set the viewpoint on the area of interest and fine-tune the camera alignment. Do not use this method to position the camera on distant objects to bring them closer in the view.

Detect objects using the current camera position

This is the default detection creation method for the Esri Windows and Doors model. Objects are detected based on the additional parameters defined in the Exploratory Analysis pane.

After the Current Camera method is used, it remains active so you can continue detecting objects. You can navigate to a different area and detect objects again. Because the model remains loaded, results are returned faster. If you choose a different deep learning package (.dlpk) model, it is reloaded.

Reposition the camera

Detect windows and doors in the current scene by setting a viewpoint and repositioning the camera to look at that viewpoint. Window and door objects are detected based on the parameters defined in the Exploratory Analysis pane. This method is available in scenes only.

This method allows you to set the camera view direction before performing the object detection. For example, set a horizontal view direction if you click a building facade where you want to detect windows. A vertical view direction is useful for a top-down camera angle, such as detecting swimming pools. The camera automatically adjusts.

Tip:

This method is not intended to reposition the view closer so that faraway objects of interest are more easily detectable. Manually navigate close to the object of interest first. The camera then orients vertically or horizontally on the clicked target to detect objects.

The Reposition Camera method remains active to continue detecting objects. Click to define another viewpoint and detect objects again.

Generic object detection

Use the Esri Generic Object deep learning model to interactively detect individual objects such as vehicles, structures, and people in a map or scene. Instead of using the camera, you can click directly in the view to detect results. Some detection options such as classes, confidence level, overlap threshold, and processing power are not available. Results are stored as point features using the symbology option set for the tool.

The parameters for object detection using the Esri Generic Object model are described in the following table:

Object detection parameters for the Generic Object model

Model

Expand the Model drop-down list and choose Esri Generic Object to define the object detection process.

Feature Layer

The name of the output feature layer.

  • If the layer does not exist, a feature class is created in the project's default geodatabase and added to the current map or scene.
  • If the layer is already in the map or scene and has the required schema, newly detected objects are appended to the existing feature class.
  • If you rerun the tool when the layer is not in the current map or scene, a new uniquely named feature class is created in the default geodatabase and added to the view.

Description

The description to be included in the attribute table as a field. Multiple detection results can be saved to the same feature layer and a description can be used to differentiate between these multiple detections.

Symbology

Set the shape of the returned output feature layer. Symbols use the default color, electron gold. The following are the symbology choices:

  • Location Point—An X marking the center point of the feature. This is the default.
  • Vertical Bounding Box (3D only)—A vertical semitransparent filled bounding box.
  • Horizontal Bounding Box (3D only)—A horizontal semitransparent filled bounding box.

If the output layer is already in the map or scene and has custom symbology, the symbology is not changed when the tool is run.

Creation Method

Interactive Detection—Click to detect individual objects at that location.

Update object detection results

To change the output results—for example, use a different confidence value or choose another area of interest—change those properties and run the Object Detection tool again. Newly discovered objects are appended to the same output layer.

Note:

If the results layer is not in the current map or scene when the tool is rerun, a new uniquely named feature class is created in the default geodatabase and added as a layer to the map or scene.

Alternatively, provide a new name and create another output feature layer for comparison. It is recommended that you do not manually update the attribute values of object detection results. Expand the Filter Results section to specify size and distance values to fine-tune returned results.

Tip:

Before rerunning the tool, turn the layer visibility off for the previous detection results. Otherwise, those results may overlap objects being detected and could affect detection results.

Related topics