Classify Objects Using Deep Learning (Image Analyst)

ArcGIS Pro 3.4 | | Help archive

Available with Image Analyst license.

Summary

Runs a trained deep learning model on an input raster and an optional feature class to produce a feature class or table in which each input object or feature has an assigned class or category label.

This tool requires a model definition file containing trained model information. The model can be trained using the Train Deep Learning Model tool or by a third-party training software such as TensorFlow, PyTorch, or Keras. The model definition file can be an Esri model definition JSON file (.emd) or a deep learning model package, and it must contain the path to the Python raster function to be called to process each object and the path to the trained binary deep learning model file.

Usage

  • You must install the proper deep learning framework Python API (PyTorch or Keras) in the ArcGIS Pro Python environment; otherwise, an error will occur when you add the Esri model definition file to the tool. The appropriate framework information should be provided by the person who created the Esri model definition file.

    To set up your machine to use deep learning frameworks in ArcGIS Pro, see Install deep learning frameworks for ArcGIS.

  • This tool calls a third-party deep learning Python API (such as PyTorch or Keras) and uses the specified Python raster function to process each object.

  • Sample use cases for this tool are available on the Esri Python raster function GitHub page. You can also write custom Python modules by following the examples and instructions.

  • The Model Definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally.

  • The following code sample is for an .emd file:

    {
        "Framework": "Keras",
        "ModelConfiguration":"KerasClassifier",
        "ModelFile":"C:\\DeepLearning\\Damage_Classification_Model_V7.h5",
        "ModelType":"ObjectClassification",
        "ImageHeight":256,
        "ImageWidth":256,
        "ExtractBands":[0,1,2],
        "CropSizeFixed": 1,
        "BlackenAroundFeature": 1,
        "ImageSpaceUsed": "MAP_SPACE", 
        "Classes": [
        {
           "Value": 0,
           "Name": "Damaged",
           "Color": [255, 0, 0]
        },
        {
           "Value": 1,
           "Name": "Undamaged",
           "Color": [76, 230, 0]
        }
        ]
    }
  • The CropSizeFixed property defines the crop mode of the raster tile around each object. A value of 1 means a fixed raster tile will be used, defined by the ImageHeight and ImageWidth properties in the .emd file. The object is centered within the fixed tile size. A value of 0 means a variable tile size will be used in which the raster tile is cropped using the smallest bounding box around the object.

  • The BlackenAroundFeature property specifies whether the pixels that are outside each object will be masked. A value of 0 means the pixels outside of the object will not be masked. A value of 1 means the pixels outside of the object will be masked.

  • The tool can process input imagery that is in map space or in pixel space. Imagery in map space is in a map-based coordinate system. Imagery in pixel space is in raw image space with no rotation and no distortion. The reference system can be specified when generating the training data in the Export Training Data For Deep Learning tool using the Reference System parameter. If the model is trained in a third-party training software, the reference system must be specified in the .emd file using the ImageSpaceUsed parameter, which can be set to MAP_SPACE or PIXEL_SPACE.

  • The input raster can be a single raster, multiple rasters, or a feature class with images attached. For more information about attachments, see Add or remove file attachments.

  • Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used. If an out of memory error occurs, use a smaller batch size. The batch_size value can be adjusted using the Arguments parameter.

  • Batch sizes are square numbers, such as 1, 4, 9, 16, 25, 64 and so on. If the input value is not a perfect square, the highest possible square value is used. For example, if a value of 6 is specified, the batch size is set to 4.

  • This tool supports and uses multiple GPUs if available. To use a specific GPU, specify the GPU ID environment. When the GPU ID is not set, the tool uses all available GPUs. This is default.

  • For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.

  • For more information about deep learning, see Deep learning using the ArcGIS Image Analyst extension.

Parameters

LabelExplanationData Type
Input Raster

The input image that will be used to classify objects.

The input can be a single raster, multiple rasters in a mosaic dataset, an image service, a folder of images, or a feature class with image attachments.

Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder; Feature Layer; Feature Class
Output Classified Objects Feature Class

The output feature class that will contain geometries surrounding the objects or feature from the input feature class, as well as a field to store the categorization label.

If the feature class already exists, the results will be appended to the existing feature class.

Feature Class
Model Definition

The Model Definition parameter value can be an Esri model definition JSON file (.emd), a JSON string, or a deep learning model package (.dlpk). A JSON string is useful when this tool is used on the server so you can paste the JSON string rather than upload the .emd file. The .dlpk file must be stored locally.

It contains the path to the deep learning binary model file, the path to the Python raster function to be used, and other parameters such as preferred tile size or padding.

File; String
Input Features
(Optional)

The point, line, or polygon input feature class that identifies the location of each object or feature to be classified and labelled. Each row in the input feature class represents a single object or feature.

If no input feature class is provided, it is assumed that each input image contains a single object to be classified. If the input image or images use a spatial reference, the output from the tool is a feature class in which the extent of each image is used as the bounding geometry for each labelled feature class. If the input image or images are not spatially referenced, the output from the tool is a table containing the image ID values and the class labels for each image.

Feature Class; Feature Layer
Class Label Field
(Optional)

The name of the field that will contain the class or category label in the output feature class.

If no field name is provided, a ClassLabel field will be generated in the output feature class.

String
Processing Mode
(Optional)

Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service.

  • Process as mosaicked imageAll raster items in the mosaic dataset or image service will be mosaicked together and processed. This is the default.
  • Process all raster items separatelyAll raster items in the mosaic dataset or image service will be processed as separate images.
String
Arguments
(Optional)

The information from the Model Definition parameter will be used to populate this parameter. These arguments vary, depending on the model architecture. The following are supported model arguments for models trained in ArcGIS. ArcGIS pretrained models and custom deep learning models may have additional arguments that the tool supports.

  • batch_size—The number of image tiles processed in each step of the model inference. This depends on the memory of your graphics card. The argument is available for all model architectures.
  • test_time_augmentation—Performs test time augmentation while predicting. If true, predictions of flipped and rotated variants of the input image will be merged into the final output. The argument is available for all model architectures.
  • score_threshold—The predictions above this confidence score are included in the result. The allowed values range from 0 to 1.0. The argument is available for all model architectures.

Value Table
Caption
(Optional)

The name of the field that will contain the text or caption in the output feature class. This parameter is only supported when an Image Captioner model is used.

If no field name is specified, a Caption field will be generated in the output feature class.

Note:

This parameter will not appear in the Geoprocessing pane. To change the default field name, use the Class Label Field parameter.

String

Licensing information

  • Basic: Requires Image Analyst
  • Standard: Requires Image Analyst
  • Advanced: Requires Image Analyst

Related topics