Available with Image Analyst license.
Arguments are one of the many ways to control how deep learning models are trained and used. In this topic, the first table lists the supported model arguments for training deep learning models. The second table lists the arguments to control how deep learning models are used for inferencing.
Training arguments
The Train Deep Learning Model tool includes arguments for training deep learning models. These arguments vary depending on the model architecture. You can change the values of these arguments to train a model, as shown in the example script after this list. The arguments are as follows:
- attention_type—Specifies the attention module type. The options are PAM (Pyramid Attention Module) and BAM (Basic Attention Module). The default is PAM.
- attn_res—The number of attentions in residual blocks. This is an optional integer value. This argument is only supported when the Backbone Model parameter value is SR3. The default is 16.
- backend—Specifies the backend framework that will be used for this model. To use TensorFlow, set the processor type to CPU. The default is pytorch.
- bias—The bias for the Single Shot Detector (SSD) head. The default is -0.4.
- box_batch_size_per_image—The number of proposals that will be sampled during training of the classification head. The default is 512.
- box_bg_iou_thresh—The maximum intersection over union (IoU) between the proposals and the ground truth (GT) box for them to be considered negative during training of the classification head. The default is 0.5.
- box_detections_per_img—The maximum number of detections per image for all classes. The default is 100.
- box_fg_iou_thresh—The minimum IoU between the proposals and the GT box for them to be considered positive during training of the classification head. The default is 0.5.
- box_nms_thresh—The nonmaximum suppression (NMS) threshold for the prediction head used during inferencing. The default is 0.5.
- box_positive_fraction—The proportion of positive proposals in a mini-batch during training of the classification head. The default is 0.25.
- box_score_thresh—The classification score threshold that must be met to return proposals during inferencing. The default is 0.05.
- channel_mults—The optional depth multipliers for subsequent resolutions in U-Net. This argument is only supported when the Backbone Model parameter value is SR3. The default is 1, 2, 4, 4, 8, 8.
- channels_of_interest—A list of spectral bands (channels) of interest. Bands that are not in the list will be filtered out of the multitemporal time-series rasters. For example, if a dataset contains bands 0 through 4 but only bands 0, 1, and 2 will be used in training, set this argument to [0,1,2].
- chip_size—The size of the image that will be used to train the model. Images will be cropped to the specified chip size.
- class_balancing—Specifies whether the cross-entropy loss will be balanced inversely proportional to the frequency of pixels per class. The default is False.
- d_k—The dimension of the key and query vectors. The default is 32.
- decode_params—A dictionary that controls how the Image captioner will run. It is composed of the following parameters: embed_size, hidden_size, attention_size, teacher_forcing, dropout, and pretrained_emb. The teacher_forcing parameter is the probability of teacher forcing. Teacher forcing is a strategy for training recurrent neural networks in which ground truth output from a prior time step is used as input instead of the model's previous output. The pretrained_emb parameter specifies whether pretrained text embedding will be used. If True, fast text embedding will be used. If False, the pretrained text embedding will not be used.
- depth—The depth of the model. The default is 17.
- depths—The number of blocks at each stage. The default is [3, 3, 9, 3].
- dice_loss_average—Specifies whether micro averaging or macro averaging will be used. A macro average will compute the metric independently for each class and take the average, treating all classes equally. A micro average will aggregate the contributions of all classes to compute the average metric. In a multiclass classification setup, micro average is preferable if there may be a class imbalance in which there are many more samples of one class than of other classes. The default is micro.
- dice_loss_fraction—The weight of dice loss relative to the default loss (or focal loss, if focal_loss is set to true) in the total loss that guides training. The default is 0. If dice_loss_fraction is set to 0, the training will use the default loss (or focal loss) alone as the total loss. If dice_loss_fraction is greater than 0, the training will use the following formula as the total loss: total_loss = (1 - dice_loss_fraction) * default_loss + dice_loss_fraction * dice_loss
- dims—The feature dimension at each stage. The default is [96, 192, 384, 768].
- downsample_factor—The factor that will be used to downsample the images. The default is 4.
- drop—The dropout probability. To reduce overfitting, increase the value. The default is 0.3.
- dropout—The dropout probability. To reduce overfitting, increase the value. The default depends on the Model Type parameter value.
- drop_path_rate—The stochastic depth rate. The default is 0.0.
- embed_dim—The dimension of embeddings. The default is 768.
- feat_loss—Specifies whether discriminator feature matching loss will be used. The default is True.
- focal_loss—Specifies whether focal loss will be used. Focal loss can handle the class imbalance problem with the single-stage object detection model. The default is False.
- gaussian_thresh—The Gaussian threshold, which sets the required road width. The valid range is 0.0 to 1.0. The default is 0.76.
- gen_blocks—The number of ResNet blocks that will be used in the generator. The default is 9.
- gen_network—Specifies the model that will be used for the generator. Use global if the machine's GPU memory is low. The default is local.
- grids—The number of grids the image will be divided into for processing. For example, setting this argument to 4 means the image will be divided into 4 x 4 or 16 grid cells. If no value is specified, the optimal grid value will be calculated based on the input imagery.
- head_init_scale—The initial scaling value for classifier weights and biases. The default is 1.0.
- ignore_classes—The list of class values on which the model will not incur loss.
- inner_channel—The dimension of the first U-net layer. This is an optional integer value. This argument is only supported when the Backbone Model parameter value is SR3. The default is 64.
- keep_dilation—Specifies whether dilation will be used. When set to True and the pointrend architecture is used, it can potentially improve the accuracy at the expense of memory consumption. The default is False.
- lambda_feat—The weight for feature matching loss. The default is 10.
- lambda_l1—The weight for L1 loss. This is not supported for 3-band imagery. The default is 100.
- layer_scale_init_value—The initial value for layer scale. The default is 1e-6.
- linear_end—The end value of the schedule. This is an optional value. This argument is only supported when the Backbone Model parameter value is SR3. The default is 1e-06.
- linear_start—The start value of the schedule. This is an optional value. This argument is only supported when the Backbone Model parameter value is SR3. The default is 1e-02.
- location_loss_factor—The weight of the bounding box loss. This factor adjusts the focus of the model on the location of the bounding box. When set to None, equal weight is given to both location and classification loss.
- lsgan—Specifies whether mean squared error will be used in the training. If False, binary cross entropy will be used instead. The default is True.
- min_points—The number of pixels that will be sampled from each masked region of the training data. This value must be a multiple of 64.
- mixup—Specifies whether new training images will be created by randomly mixing images from the training set. The default is False.
- mlp_ratio—The ratio of the multilayer perceptron (MLP) hidden dimension to the embedding dimension. The default is 4.
- mlp1—The dimensions of the successive feature spaces of MLP1. The default is 32,64.
- mlp2—The dimensions of the successive feature spaces of MLP2. The default is 128,128.
- mlp4—The dimensions of the decoder MLP. The default is 64,32.
- model—The backbone model that will be used to train the model. The available backbones depend on the Model Type parameter value. This argument is only supported for the MMDetection and MMSegmentation model types. The default for MMDetection is cascade_rcnn. The default for MMSegmentation is mask2former.
- model_weight—Specifies whether pretrained model weights will be used. The value can also be a path to a configuration file containing the weights of a model from the MMDetection repository or the MMSegmentation repository. The default is False.
- monitor—Specifies the metric that will be monitored while checkpointing and early stopping. The available metrics depend on the Model Type parameter value. The default is valid_loss.
- mtl_model—Specifies the architecture type that will be used to create the model. The options are linknet and hourglass, for LinkNet-based and hourglass-based neural architectures, respectively. The default is hourglass.
- n_blocks_global—The number of residual blocks in the global generator network. The default is 9.
- n_blocks_local—The number of residual blocks in the local enhancer network. The default is 3.
- n_downsample_global—The number of downsampling layers in the global generator network. The default is 4.
- n_dscr—The number of discriminators that will be used. The default is 2.
- n_dscr_filters—The number of discriminator filters in the first convolution layer. The default is 64.
- n_gen_filters—The number of generator filters in the first convolution layer. The default is 64.
- n_head—The number of attention heads. The default is 4.
- n_layers_dscr—The number of layers in the discriminator network that will be used in Pix2PixHD. The default is 3.
- n_local_enhancers—The number of local enhancers that will be used. The default is 1.
- n_masks—The maximum number of class labels and instances any image can contain. The default is 30.
- n_timestep—The number of diffusion time steps. This is an optional value. This argument is only supported when the Backbone Model parameter value is SR3. The default is 1000.
- norm—Specifies whether instance normalization or batch normalization will be used. The default is instance.
- norm_groups—The number of groups for group normalization. This is an optional integer value. This argument is only supported when the Backbone Model parameter value is SR3. The default is 32.
- num_heads—The number of attention heads. The default is 12.
- orient_bin_size—The bin size for orientation angles. The default is 20.
- orient_theta—The width of the orientation mask. The default is 8.
- oversample—Specifies whether oversampling will be used. If set to True, unbalanced classes of the dataset will be oversampled during training. This is not supported with MultiLabel datasets. The default is False.
- patch_size—The patch size that will be used for generating patch embeddings. The default is 16.
- perceptual_loss—Specifies whether perceptual loss will be used in the training. The default is False.
- pointrend—Specifies whether the PointRend architecture will be used on top of the segmentation head. For more information about the PointRend architecture, see the PointRend paper. The default is False.
- pooling—The pixel-embedding pooling strategy that will be used. The default is mean.
- pyramid_sizes—The number and size of convolution layers that will be applied to the different subregions. This argument is specific to the Pyramid Scene Parsing Network model. The default is [1,2,3,6].
- qkv_bias—Specifies whether bias will be used for the query, key, and value (QKV) projections. The default is False.
- ratios—The list of aspect ratios that will be used for the anchor boxes. In object detection, an anchor box represents the ideal location, shape, and size of the object being predicted. For example, setting this argument to [1.0,1.0], [1.0, 0.5] means the anchor box is a square (1:1) or a rectangle in which the horizontal side is half the size of the vertical side (1:0.5). The default for RetinaNet is [0.5,1,2]. The default for Single Shot Detector is [1.0, 1.0].
- res_blocks—The number of residual blocks. This is an optional integer value. This argument is only supported when the Backbone Model parameter value is SR3. The default is 3.
- rpn_batch_size_per_image—The number of anchors that will be sampled during training of the region proposal network (RPN) for computing the loss. The default is 256.
- rpn_bg_iou_thresh—The maximum IoU between the anchor and the GT box for them to be considered negative during training of the RPN. The default is 0.3.
- rpn_fg_iou_thresh—The minimum IoU between the anchor and the GT box for them to be considered positive during training of the RPN. The default is 0.7.
- rpn_nms_thresh—The NMS threshold that will be used for postprocessing the RPN proposals. The default is 0.7.
- rpn_positive_fraction—The proportion of positive anchors in a mini-batch during training of the RPN. The default is 0.5.
- rpn_post_nms_top_n_test—The number of proposals that will be kept after applying NMS during testing. The default is 1000.
- rpn_post_nms_top_n_train—The number of proposals that will be kept after applying NMS during training. The default is 2000.
- rpn_pre_nms_top_n_test—The number of proposals that will be kept before applying NMS during testing. The default is 1000.
- rpn_pre_nms_top_n_train—The number of proposals that will be kept before applying NMS during training. The default is 2000.
- scales—The number of scale levels each cell will be scaled up or down. The default is [1, 0.8, 0.63].
- schedule—The type of schedule that will be used. This is an optional value. The options are linear, warmup10, warmup50, const, jsd, and cosine. This argument is only supported when the Backbone Model parameter value is SR3. The default is linear.
- T—The period that will be used for the positional encoding. The default is 1000.
- timesteps_of_interest—A list of the time steps of interest. Multitemporal time series will be filtered based on the specified time steps. For example, if the dataset has time steps 0, 1, 2, and 3, but only time steps 0, 1, and 2 will be used in training, set this argument to [0,1,2]; the remaining time steps will be filtered out.
- use_net—Specifies whether the U-Net decoder will be used to recover data once the pyramid pooling is complete. This argument is specific to the Pyramid Scene Parsing Network model. The default is True.
- vgg_loss—Specifies whether VGG feature matching loss will be used. This is only supported for 3-band imagery. The default is True.
- zooms—The number of zoom levels each grid cell will be scaled up or down. Setting this argument to 1 means all the grid cells will remain at the same size or zoom level. A zoom level of 2 means all the grid cells will become twice as large (zoomed in 100 percent). Providing a list of zoom levels means all the grid cells will be scaled using all the numbers in the list. The default is 1.
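For example, the following Python sketch passes several of these arguments to the Train Deep Learning Model tool. The paths, epoch count, and argument values are placeholders, and the semicolon-delimited "name value" form of the arguments string is an assumption; adapt them to your data and model type.

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

# Train a Single Shot Detector, overriding a few of the training
# arguments listed above. All paths and values are placeholders.
arcpy.ia.TrainDeepLearningModel(
    in_folder=r"C:\data\training_chips",      # exported training samples
    out_folder=r"C:\data\models\ssd_model",   # output model folder
    max_epochs=20,
    model_type="SSD",
    batch_size=8,
    # Training arguments as semicolon-delimited "name value" pairs
    # (assumed form); unspecified arguments keep their defaults.
    arguments="grids 4;zooms 1.0;ratios [1.0,1.0];monitor valid_loss",
)
```

Arguments that are not set explicitly keep the defaults described above.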
| Model type | Argument | Valid values |
|---|---|---|
| 3D-RCNet (pixel classification) | depths | A list of positive integers. The default is [3, 3, 9, 3]. |
| | dims | A list of positive integers. The default is [96, 192, 384, 768]. |
| | drop_path_rate | Floating point values. The default is 0.0. |
| | head_init_scale | Floating point values. The default is 1.0. |
| | layer_scale_init_value | Floating point values. The default is 1e-6. |
| Change detector (pixel classification) | attention_type | PAM (Pyramid Attention Module) or BAM (Basic Attention Module). The default is PAM. |
| | chip_size | Integers between 0 and the image size. |
| | monitor | valid_loss, precision, recall, and f1. The default is valid_loss. |
| ConnectNet (pixel classification) | chip_size | Integers between 0 and the image size. |
| | gaussian_thresh | 0.0 to 1.0. The default is 0.76. |
| | monitor | valid_loss, accuracy, miou, and dice. The default is valid_loss. |
| | mtl_model | linknet or hourglass. The default is hourglass. |
| | orient_bin_size | A positive number. The default is 20. |
| | orient_theta | A positive number. The default is 8. |
| CycleGAN (image translation) | gen_blocks | A positive integer. The default is 9. |
| | lsgan | true or false. The default is true. |
| DeepLabv3 (pixel classification) | chip_size | Integers between 0 and the image size. |
| | class_balancing | true or false. The default is false. |
| | dice_loss_average | micro or macro. The default is micro. |
| | dice_loss_fraction | Floating point values between 0 and 1. The default is 0. |
| | focal_loss | true or false. The default is false. |
| | ignore_classes | Valid class values. |
| | keep_dilation | true or false. The default is false. |
| | mixup | true or false. The default is false. |
| | monitor | valid_loss or accuracy. The default is valid_loss. |
| | pointrend | true or false. The default is false. |
| FasterRCNN (object detection) | box_batch_size_per_image | Positive integers. The default is 512. |
| | box_bg_iou_thresh | Floating point values between 0 and 1. The default is 0.5. |
| | box_detections_per_img | Positive integers. The default is 100. |
| | box_fg_iou_thresh | Floating point values between 0 and 1. The default is 0.5. |
| | box_nms_thresh | Floating point values between 0 and 1. The default is 0.5. |
| | box_positive_fraction | Floating point values between 0 and 1. The default is 0.25. |
| | box_score_thresh | Floating point values between 0 and 1. The default is 0.05. |
| | rpn_batch_size_per_image | Positive integers. The default is 256. |
| | rpn_bg_iou_thresh | Floating point values between 0 and 1. The default is 0.3. |
| | rpn_fg_iou_thresh | Floating point values between 0 and 1. The default is 0.7. |
| | rpn_nms_thresh | Floating point values between 0 and 1. The default is 0.7. |
| | rpn_positive_fraction | Floating point values between 0 and 1. The default is 0.5. |
| | rpn_post_nms_top_n_test | Positive integers. The default is 1000. |
| | rpn_post_nms_top_n_train | Positive integers. The default is 2000. |
| | rpn_pre_nms_top_n_test | Positive integers. The default is 1000. |
| | rpn_pre_nms_top_n_train | Positive integers. The default is 2000. |
| Feature Classifier (object classification) | backend | pytorch or tensorflow. The default is pytorch. |
| | mixup | true or false. The default is false. |
| | oversample | true or false. The default is false. |
| Image captioner (image translation) | chip_size | Integers between 0 and the image size. |
| | decode_params | A dictionary composed of the embed_size, hidden_size, attention_size, teacher_forcing, dropout, and pretrained_emb parameters. The default is {'embed_size':100, 'hidden_size':100, 'attention_size':100, 'teacher_forcing':1, 'dropout':0.1, 'pretrained_emb':False}. |
| | monitor | valid_loss, accuracy, corpus_bleu, and multi_label_fbeta. The default is valid_loss. |
| MaskRCNN (object detection) | box_batch_size_per_image | Positive integers. The default is 512. |
| | box_bg_iou_thresh | Floating point values between 0 and 1. The default is 0.5. |
| | box_detections_per_img | Positive integers. The default is 100. |
| | box_fg_iou_thresh | Floating point values between 0 and 1. The default is 0.5. |
| | box_nms_thresh | Floating point values between 0 and 1. The default is 0.5. |
| | box_positive_fraction | Floating point values between 0 and 1. The default is 0.25. |
| | box_score_thresh | Floating point values between 0 and 1. The default is 0.05. |
| | rpn_batch_size_per_image | Positive integers. The default is 256. |
| | rpn_bg_iou_thresh | Floating point values between 0 and 1. The default is 0.3. |
| | rpn_fg_iou_thresh | Floating point values between 0 and 1. The default is 0.7. |
| | rpn_nms_thresh | Floating point values between 0 and 1. The default is 0.7. |
| | rpn_positive_fraction | Floating point values between 0 and 1. The default is 0.5. |
| | rpn_post_nms_top_n_test | Positive integers. The default is 1000. |
| | rpn_post_nms_top_n_train | Positive integers. The default is 2000. |
| | rpn_pre_nms_top_n_test | Positive integers. The default is 1000. |
| | rpn_pre_nms_top_n_train | Positive integers. The default is 2000. |
| MaXDeepLab (panoptic segmentation) | n_masks | Positive integers. The default is 30. |
| MMDetection (object detection) | chip_size | Integers between 0 and the image size. |
| | model | atss, carafe, cascade_rcnn, cascade_rpn, dcn, detectors, dino, double_heads, dynamic_rcnn, empirical_attention, fcos, foveabox, fsaf, ghm, hrnet, libra_rcnn, nas_fcos, pafpn, pisa, regnet, reppoints, res2net, sabl, and vfnet. The default is cascade_rcnn. |
| | model_weight | true or false. The default is false. |
| MMSegmentation (pixel classification) | chip_size | Integers between 0 and the image size. |
| | model | ann, apcnet, ccnet, cgnet, deeplabv3, deeplabv3plus, dmnet, dnlnet, emanet, fastscnn, fcn, gcnet, hrnet, mask2former, mobilenet_v2, nonlocal_net, ocrnet, prithvi100m, psanet, pspnet, resnest, sem_fpn, unet, and upernet. The default is mask2former. |
| | model_weight | true or false. The default is false. |
| Multi Task Road Extractor (pixel classification) | chip_size | Integers between 0 and the image size. |
| | gaussian_thresh | 0.0 to 1.0. The default is 0.76. |
| | monitor | valid_loss, accuracy, miou, and dice. The default is valid_loss. |
| | mtl_model | linknet or hourglass. The default is hourglass. |
| | orient_bin_size | A positive number. The default is 20. |
| | orient_theta | A positive number. The default is 8. |
| Pix2Pix (image translation) | perceptual_loss | true or false. The default is false. |
| Pix2PixHD (image translation) | feat_loss | true or false. The default is true. |
| | gen_network | local or global. The default is local. |
| | lambda_feat | Positive integer values. The default is 10. |
| | lambda_l1 | Positive integer values. The default is 100. |
| | lsgan | true or false. The default is true. |
| | n_blocks_global | Positive integer values. The default is 9. |
| | n_blocks_local | Positive integer values. The default is 3. |
| | n_downsample_global | Positive integer values. The default is 4. |
| | n_dscr | Positive integer values. The default is 2. |
| | n_dscr_filters | Positive integer values. The default is 64. |
| | n_gen_filters | Positive integer values. The default is 64. |
| | n_layers_dscr | Positive integer values. The default is 3. |
| | n_local_enhancers | Positive integer values. The default is 1. |
| | norm | instance or batch. The default is instance. |
| | vgg_loss | true or false. The default is true. |
| PSETAE (pixel classification) | channels_of_interest | A list of band numbers (positive integers). |
| | d_k | Positive integer values. The default is 32. |
| | dropout | Floating point values between 0 and 1. The default is 0.2. |
| | min_points | Integer multiples of 64. |
| | mlp1 | A list of positive integers. The default is 32,64. |
| | mlp2 | A list of positive integers. The default is 128,128. |
| | mlp4 | A list of positive integers. The default is 64,32. |
| | n_head | Positive integer values. The default is 4. |
| | pooling | mean, std, max, or min. The default is mean. |
| | T | Positive integer values. The default is 1000. |
| | timesteps_of_interest | A list of positive integers. |
| Pyramid Scene Parsing Network (pixel classification) | chip_size | Integers between 0 and the image size. |
| | class_balancing | true or false. The default is false. |
| | dice_loss_average | micro or macro. The default is micro. |
| | dice_loss_fraction | Floating point values between 0 and 1. The default is 0. |
| | focal_loss | true or false. The default is false. |
| | ignore_classes | Valid class values. |
| | keep_dilation | true or false. The default is false. |
| | mixup | true or false. The default is false. |
| | monitor | valid_loss or accuracy. The default is valid_loss. |
| | pointrend | true or false. The default is false. |
| | pyramid_sizes | [convolution layer 1, convolution layer 2, ..., convolution layer n]. The default is [1,2,3,6]. |
| | use_net | true or false. The default is true. |
| RetinaNet (object detection) | chip_size | Integers between 0 and the image size. |
| | monitor | valid_loss or average_precision. The default is valid_loss. |
| | ratios | [ratio value 1, ratio value 2, ratio value 3]. The default is [0.5,1,2]. |
| | scales | [scale value 1, scale value 2, scale value 3]. The default is [1, 0.8, 0.63]. |
| SAMLoRA (pixel classification) | class_balancing | true or false. The default is false. |
| | ignore_classes | Valid class values. |
| Single Shot Detector (object detection) | backend | pytorch or tensorflow. The default is pytorch. |
| | bias | Floating point values. The default is -0.4. |
| | chip_size | Integers between 0 and the image size. |
| | drop | Floating point values between 0 and 1. The default is 0.3. |
| | focal_loss | true or false. The default is false. |
| | grids | Integer values greater than 0. |
| | location_loss_factor | Floating point values between 0 and 1. |
| | monitor | valid_loss or average_precision. The default is valid_loss. |
| | ratios | [horizontal value, vertical value]. The default is [1.0, 1.0]. |
| | zooms | The zoom value, in which 1.0 is normal zoom. The default is 1. |
| Super Resolution with SR3 backbone (image translation) | attn_res | Integers greater than 0. The default is 16. |
| | channel_mults | Integer multiplier sets. The default is [1, 2, 4, 4, 8, 8]. |
| | downsample_factor | Positive integer values. The default is 4. |
| | dropout | Floating point values. The default is 0. |
| | inner_channel | Integer values greater than 0. The default is 64. |
| | linear_start | Floating point values. The default is 1e-02. |
| | linear_end | Floating point values. The default is 1e-06. |
| | n_timestep | Integer values greater than 0. The default is 1000. |
| | norm_groups | Integer values greater than 0. The default is 32. |
| | res_blocks | Integer values greater than 0. The default is 3. |
| | schedule | linear, warmup10, warmup50, const, jsd, or cosine. The default is linear. |
| Super Resolution with SR3_UViT backbone (image translation) | depth | Positive integer values. The default is 17. |
| | embed_dim | Positive integer values. The default is 768. |
| | mlp_ratio | Positive floating point values. The default is 4.0. |
| | num_heads | Positive integer values. The default is 12. |
| | patch_size | Positive integer values. The default is 16. |
| | qkv_bias | true or false. The default is false. |
| U-Net (pixel classification) | chip_size | Integers between 0 and the image size. |
| | class_balancing | true or false. The default is false. |
| | dice_loss_average | micro or macro. The default is micro. |
| | dice_loss_fraction | Floating point values between 0 and 1. The default is 0. |
| | focal_loss | true or false. The default is false. |
| | ignore_classes | Valid class values. |
| | mixup | true or false. The default is false. |
| | monitor | valid_loss or accuracy. The default is valid_loss. |
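The same pattern applies to the other model types in the table. The following sketch trains a U-Net pixel classifier using arguments from the U-Net rows above; the paths are hypothetical and the semicolon-delimited "name value" argument string form is again an assumption.

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

# Train a U-Net pixel classifier with class balancing and focal loss
# enabled, per the U-Net rows of the table above. Paths are placeholders.
arcpy.ia.TrainDeepLearningModel(
    in_folder=r"C:\data\landcover_chips",
    out_folder=r"C:\data\models\unet_landcover",
    max_epochs=25,
    model_type="UNET",
    batch_size=4,
    arguments="class_balancing True;focal_loss True;monitor valid_loss",
)
```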
Inferencing arguments
The following arguments are available to control how deep learning models are used for inferencing. The information from the Model Definition parameter will be used to populate the Arguments parameter in the inferencing tools. These arguments vary, depending on the model architecture. ArcGIS pretrained models and custom deep learning models may have additional arguments that the tools support. An example script follows the table.
| Argument | Description | Inference type | Valid values |
|---|---|---|---|
| batch_size | The number of image tiles that will be processed in each step of the model inference. This depends on the memory of your graphics card. The argument is available for all model architectures. | Classify Objects, Classify Pixels, Detect Change, Detect Objects | Integer values greater than 0; usually a power of 2 (for example, 2, 4, 8, or 16). |
| direction | The direction the image will be translated, from one domain to another. For more information about this argument, see How CycleGAN works. The argument is only available for the CycleGAN architecture. | Classify Pixels | Available options are AtoB and BtoA. |
| exclude_pad_detections | Specifies whether truncated detections near the edges that are in the padded region of image chips will be filtered. The argument is available for SSD, RetinaNet, YOLOv3, DETReg, MMDetection, and Faster RCNN only. | Detect Objects | true or false. |
| merge_policy | The policy that will be used for merging augmented predictions. This is only applicable when test time augmentation is used. For the Classify Pixels Using Deep Learning tool, the argument is available for the MultiTaskRoadExtractor and ConnectNet architectures. If IsEdgeDetection is present in the model's .emd file, the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures are also available. For the Detect Objects Using Deep Learning tool, the argument is only available for MaskRCNN. | Classify Pixels, Detect Objects | Available options are mean, max, and min. |
| nms_overlap | The maximum overlap ratio for two overlapping features, defined as the ratio of intersection area over union area. The argument is available for all model architectures. | Detect Objects | Floating point values from 0.0 to 1.0. The default is 0.1. |
| output_classified_raster | The path to the output raster. The argument is only available for MaXDeepLab. | Detect Objects | The file path and name for the output classified raster. |
| padding | The number of pixels at the border of image tiles from which predictions will be blended for adjacent tiles. To smooth the output while reducing artifacts, increase the value. The maximum value of the padding can be half the tile size value. The argument is available for all model architectures. | Classify Pixels, Detect Change, Detect Objects | Integer values greater than 0 and less than half the tile size value. |
| predict_background | Specifies whether the background class will be classified. The argument is available for UNET, PSPNET, DeepLab, and MMSegmentation. | Classify Pixels | true or false. |
| return_probability_raster | Specifies whether a probability raster will be output. A probability raster is a raster whose pixels specify the probability that the variable of interest is above or below a specified threshold value. If ArcGISLearnVersion is 1.8.4 or later in the model's .emd file, the MultiTaskRoadExtractor and ConnectNet architectures are available. If ArcGISLearnVersion is 1.8.4 or later and IsEdgeDetection is present in the model's .emd file, the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures are also available. | Classify Pixels | true or false. |
| score_threshold | Predictions with a confidence score above this value are included in the result. The argument is available for all model architectures. This argument tends to be used in models trained earlier than ArcGIS Pro 3.5. | Classify Objects | 0 to 1.0. |
| test_time_augmentation | Specifies whether test time augmentation will be performed while predicting. If true, predictions of flipped and rotated variants of the input image will be merged into the final output. The argument is available for most model architectures. | Classify Objects, Classify Pixels | true or false. |
| threshold | Predictions with a confidence score higher than this threshold are included in the result. This argument tends to be used in models trained in ArcGIS Pro 3.5 and later. For the Classify Objects Using Deep Learning tool, the argument is available for all model architectures. For the Classify Pixels Using Deep Learning tool, if ArcGISLearnVersion is 1.8.4 or later in the model's .emd file, the MultiTaskRoadExtractor and ConnectNet architectures are available. If ArcGISLearnVersion is 1.8.4 or later and IsEdgeDetection is present in the model's .emd file, the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures are also available. For the Detect Objects Using Deep Learning tool, the argument is available for all model architectures. | Classify Objects, Classify Pixels, Detect Objects | 0 to 1.0. |
| thinning | Specifies whether the predicted edges will be thinned or skeletonized. If IsEdgeDetection is present in the model's .emd file, the BDCNEdgeDetector, HEDEdgeDetector, and MMSegmentation architectures are available. | Classify Pixels | true or false. |
| tile_size | The width and height of image tiles into which the imagery will be split for prediction. For the Classify Pixels Using Deep Learning tool, the argument is only available for the CycleGAN architecture. For the Detect Objects Using Deep Learning tool, the argument is only available for MaskRCNN. | Classify Pixels, Detect Objects | Integer values greater than 0 and less than the size of the image. |
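As an illustration, the sketch below passes inferencing arguments to the Detect Objects Using Deep Learning tool. The paths are placeholders, and the semicolon-delimited "name value" argument string form is an assumption; the argument names a given model accepts come from the table above and from the model's .emd file.

```python
import arcpy

arcpy.CheckOutExtension("ImageAnalyst")

# Detect objects with a trained model, overriding a few inferencing
# arguments from the table above. All paths are placeholders.
arcpy.ia.DetectObjectsUsingDeepLearning(
    in_raster=r"C:\data\imagery\scene.tif",
    out_detected_objects=r"C:\data\results.gdb\detected_objects",
    in_model_definition=r"C:\data\models\ssd_model\ssd_model.emd",
    arguments="padding 64;threshold 0.5;batch_size 4;exclude_pad_detections True",
)
```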