
Train Deep Learning Model (Image Analyst)

Available with Image Analyst license.

Summary

Trains a deep learning model using the output from the Export Training Data For Deep Learning tool.

Usage

  • To set up your machine to use deep learning frameworks in ArcGIS Pro, see Install deep learning frameworks.

  • This tool trains using the PyTorch deep learning framework. PyTorch must be installed on your machine to successfully run this tool.

  • The input training data for this tool must include the images and labels folders that are generated from the Export Training Data For Deep Learning tool.

  • For more information about deep learning, see Deep learning in ArcGIS Pro.
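
Two of the prerequisites above (PyTorch being installed, and the input folder containing the images and labels folders) can be checked before the tool is run. The following is a minimal sketch using only the Python standard library. The helper names pytorch_available and has_training_layout are hypothetical and are not part of arcpy.

```python
import importlib.util
import os


def pytorch_available():
    """Return True if the torch package can be found in this Python environment."""
    return importlib.util.find_spec("torch") is not None


def has_training_layout(in_folder):
    """Return True if in_folder contains both the images and labels subfolders."""
    return all(
        os.path.isdir(os.path.join(in_folder, name))
        for name in ("images", "labels")
    )
```

For example, has_training_layout(r"C:\DeepLearning\TrainingData\Cars") returns False if either subfolder is missing, which is a quick way to catch a wrong path before a long training run starts.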

Syntax

TrainDeepLearningModel(in_folder, out_folder, {max_epochs}, {model_type}, {batch_size}, {arguments}, {learning_rate}, {backbone_model}, {pretrained_model}, {validation_percentage}, {stop_training})
Parameter    Explanation    Data Type
in_folder

The folder containing the image chips, labels, and statistics required to train the model. This is the output from the Export Training Data For Deep Learning tool.

To train a model, the input images must be 8-bit rasters with 3 bands.

Folder
out_folder

The output folder location that will store the trained model.

Folder
max_epochs
(optional)

The maximum number of epochs for which the model will be trained. A value of 1 means the dataset will be passed forward and backward through the neural network one time. The default value is 20.

Long
model_type
(optional)

Specifies the model type to use for training the deep learning model.

  • SSD: The Single Shot Detector (SSD) approach will be used to train the model. SSD is used for object detection. The input training data for this model type uses the Pascal Visual Object Classes metadata format.
  • UNET: The U-Net approach will be used to train the model. U-Net is used for pixel classification.
  • FEATURE_CLASSIFIER: The Feature Classifier approach will be used to train the model. This is used for object or image classification.
  • PSPNET: The Pyramid Scene Parsing Network (PSPNET) approach will be used to train the model. PSPNET is used for pixel classification.
  • RETINANET: The RetinaNet approach will be used to train the model. RetinaNet is used for object detection. The input training data for this model type uses the Pascal Visual Object Classes metadata format.
  • MASKRCNN: The MaskRCNN approach will be used to train the model. MaskRCNN is used for object detection, in particular instance segmentation, which is the automatic delineation of objects in an image, such as detecting building footprints or swimming pools. The input training data for this model type uses the MaskRCNN metadata format. Class values for input training data must start at 1. This model type can only be trained using a CUDA-enabled GPU.
String
batch_size
(optional)

The number of training samples to be processed for training at one time. The default value is 2.

If you have a powerful GPU, this number can be increased to 8, 16, 32, or 64.

Long
arguments
[arguments,...]
(optional)

The function arguments are defined in the Python raster function class. This is where you list additional deep learning parameters and arguments for experiments and refinement, such as a confidence threshold for adjusting the sensitivity. The names of the arguments are populated by the tool from reading the Python module.

When you choose SSD as the model_type parameter value, the arguments parameter will be populated with the following arguments:

  • grids—The number of grids the image will be divided into for processing. Setting this argument to [4] means the image will be divided into 4 x 4 or 16 grid cells. The default is [4, 2, 1], meaning there will be 21 grid cells ([4 x 4] + [2 x 2] + [1 x 1] = 21).
  • zooms—The number of zoom levels each grid cell will be scaled up or down. Setting this argument to [1] means all the grid cells will remain at the same size or zoom level. A zoom level of [2] means all the grid cells will become twice as large (zoomed in 100 percent). Providing a list of zoom levels means all the grid cells will be scaled using all the numbers in the list. The default is [0.7, 1.0, 1.3].
  • ratios—The list of aspect ratios to use for the anchor boxes. In object detection, an anchor box represents the ideal location, shape, and size of the object being predicted. Setting this argument to [[1.0,1.0], [1.0, 0.5]] means the anchor box is a square (1:1) or a rectangle in which the horizontal side is half the size of the vertical side (1:0.5). The default is [[1, 1], [1, 0.5], [0.5, 1]].

When you choose PSPNET as the model_type parameter value, the arguments parameter will be populated with the following arguments:

  • USE_UNET_DECODER—The U-Net decoder will be used to recover data once the pyramid pooling is complete. The default is True.
  • PYRAMID_SIZE—The number and size of convolution layers to be applied to the different subregions. The default is [1,2,3,6].

When you choose RETINANET as the model_type parameter value, the arguments parameter will be populated with the following arguments:

  • SCALES—The number of scale levels each cell will be scaled up or down. The default is [1, 0.8, 0.63].
  • RATIOS—The aspect ratio of the anchor box. The default is [0.5,1,2].

Value Table
learning_rate
(optional)

The rate at which existing information will be overwritten with newly acquired information throughout the training process. If nothing is specified, the optimal learning rate will be extracted from the learning curve during the training process.

Double
backbone_model
(optional)

Specifies the preconfigured neural network to be used as an architecture for training the new model. This method is known as Transfer Learning. The default is ResNet 34 (RESNET34 in Python).

  • RESNET34: The preconfigured model will be a residual network trained on the ImageNet dataset, which contains more than a million images, and is 34 layers deep. This is the default.
  • RESNET50: The preconfigured model will be a residual network trained on the ImageNet dataset, which contains more than a million images, and is 50 layers deep.
String
pretrained_model
(optional)

The pretrained model that will be used to fine-tune the new model. The input is an Esri Model Definition file (.emd).

A pretrained model with similar classes can be fine-tuned to fit the new model. For example, an existing model that has been trained for cars can be fine-tuned to train a model that identifies trucks.

The pretrained model must have been trained with the same model type and backbone model that will be used to train the new model.

File
validation_percentage
(optional)

The percentage of training samples that will be used for validating the model. The default value is 10.

Double
stop_training
(optional)

Specifies whether early stopping will be implemented.

  • STOP_TRAINING: The model training will stop when the model is no longer improving, regardless of the max_epochs value specified. This is the default.
  • CONTINUE_TRAINING: The model training will continue until the max_epochs value is reached.
Boolean
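
The grid and zoom arithmetic described for the SSD grids and zooms arguments can be checked with a few lines of Python. This is a sketch of the arithmetic only, and it does not call the tool. The function names count_grid_cells and scaled_cell_side are hypothetical, and expressing the cell side as a fraction of the image side is an assumption made for illustration.

```python
def count_grid_cells(grids):
    """Total cells when the image is divided into an n x n grid for each n in grids."""
    return sum(n * n for n in grids)


def scaled_cell_side(grid, zoom):
    """Side of one zoomed grid cell, as a fraction of the image side (assumed unit)."""
    return zoom / grid


print(count_grid_cells([4]))        # 16
print(count_grid_cells([4, 2, 1]))  # 21
print(scaled_cell_side(4, 2))       # 0.5, twice the unzoomed side of 0.25
```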

Derived Output

Name    Explanation    Data Type
out_model_file

The output trained model file.

File

Code sample

TrainDeepLearningModel example 1 (Python window)

This example trains a tree classification model using the U-Net approach.

# Import system modules  
import arcpy  
from arcpy.ia import *  
 
# Check out the ArcGIS Image Analyst extension license 
arcpy.CheckOutExtension("ImageAnalyst") 
 
# Execute 
TrainDeepLearningModel(r"C:\DeepLearning\TrainingData\Roads_FC", 
     r"C:\DeepLearning\Models\Fire", 40, "UNET", 16, "# #", None, 
     "RESNET34", None, 10, "STOP_TRAINING")
TrainDeepLearningModel example 2 (stand-alone script)

This example trains an object detection model using the SSD approach.

# Import system modules  
import arcpy  
from arcpy.ia import *  
 
# Check out the ArcGIS Image Analyst extension license 
arcpy.CheckOutExtension("ImageAnalyst") 
 
# Define input parameters
in_folder = "C:\\DeepLearning\\TrainingData\\Cars" 
out_folder = "C:\\Models\\Cars"
max_epochs = 100
model_type = "SSD"
batch_size = 2
arg = "grids '[4, 2, 1]';zooms '[0.7, 1.0, 1.3]';ratios '[[1, 1], [1, 0.5], [0.5, 1]]'"
learning_rate = 0.003
backbone_model = "RESNET34" 
pretrained_model = "C:\\Models\\Pretrained\\vehicles.emd"
validation_percent = 10
stop_training = "STOP_TRAINING"


# Execute
TrainDeepLearningModel(in_folder, out_folder, max_epochs, model_type, 
     batch_size, arg, learning_rate, backbone_model, pretrained_model, 
     validation_percent, stop_training)
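
In example 2, the arguments value is passed as a single string of name and quoted-value pairs separated by semicolons. The helper below is a sketch that assembles such a string from a dictionary, which can be easier to edit than a hand-written string. The name format_arguments is hypothetical, and the string layout is inferred only from example 2, so treat it as an assumption rather than a documented format.

```python
def format_arguments(args):
    """Join name/value pairs as "name 'value'" separated by semicolons.

    The layout mirrors the arg string in example 2 (an inferred format).
    """
    return ";".join(f"{name} '{value}'" for name, value in args.items())


# SSD defaults as listed for the arguments parameter
ssd_args = {
    "grids": [4, 2, 1],
    "zooms": [0.7, 1.0, 1.3],
    "ratios": [[1, 1], [1, 0.5], [0.5, 1]],
}
arg = format_arguments(ssd_args)
print(arg)
```

The resulting arg value can then be passed in the arguments position of TrainDeepLearningModel in place of the literal string.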

Licensing information

  • Basic: Requires Image Analyst
  • Standard: Requires Image Analyst
  • Advanced: Requires Image Analyst
