Train Point Cloud Object Detection Model (3D Analyst)

Summary

Trains an object detection model for point clouds using deep learning.

Usage

  • This tool requires the installation of Deep Learning Essentials, which provides multiple neural network solutions that include neural architectures for classifying point clouds.

    To set up your machine to use deep learning frameworks in ArcGIS Pro, see Install deep learning frameworks for ArcGIS.

  • Using a pretrained model in the training process is helpful, especially when you have limitations in data, time, or computational resources. Pretrained models reduce the need for extensive training and provide a reliable starting point for quickly creating a useful model. To use a pretrained model, the new training data must be compatible with the pretrained model. This means that the new training data must have the same attributes and object codes as the training data that was used to create the pretrained model. If object codes in the training data do not match the codes in the pretrained model, the training data's object codes must be remapped accordingly.
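The remapping requirement above can be sketched as follows. This is a minimal illustration, assuming you know which object code each description uses in both the new training data and the pretrained model. The helper name build_remap_table and the example codes are hypothetical and are not part of the tool.

```python
def build_remap_table(training_codes, pretrained_codes):
    """Return [[current_code, remapped_code], ...] for codes that differ.

    Both arguments map an object description (for example "Car") to the
    object code used for it. Hypothetical helper, not part of arcpy.
    """
    remap = []
    for description, current in sorted(training_codes.items()):
        if description not in pretrained_codes:
            raise ValueError(f"No pretrained code for {description!r}")
        target = pretrained_codes[description]
        # Only codes that differ from the pretrained model need remapping.
        if current != target:
            remap.append([current, target])
    return remap


# Example codes are made up for illustration.
training = {"Car": 31, "Pole": 40}
pretrained = {"Car": 1, "Pole": 40}
remap_objects = build_remap_table(training, pretrained)
print(remap_objects)
```

The resulting list of current and remapped code pairs has the shape described for the Remap Object Codes parameter.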

  • The point cloud object detection model can only be trained using a CUDA-capable NVIDIA graphics card. When a specific graphics card is not assigned, the CUDA-capable card with the best available hardware will be used for training. To use a specific graphics card, assign it in the GPU ID environment setting.
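As a rough sketch of assigning a graphics card from Python, the function below sets the two environment values on an object that behaves like arcpy.env. It assumes the Processor Type and GPU ID environments are exposed as the processorType and gpuId properties and that an integer ID is accepted. Confirm both against the environment settings documentation for your release. A SimpleNamespace stands in for arcpy.env so the sketch runs without ArcGIS Pro.

```python
from types import SimpleNamespace


def select_gpu(env, gpu_id=None):
    """Set GPU-related environment values on an arcpy.env-like object.

    Assumption: the property names are processorType and gpuId.
    """
    env.processorType = "GPU"
    if gpu_id is not None:
        # Leave gpuId unset to let the tool choose a card itself.
        env.gpuId = gpu_id
    return env


# Stand-in for arcpy.env; in ArcGIS Pro you would pass arcpy.env instead.
fake_env = select_gpu(SimpleNamespace(), gpu_id=0)
print(fake_env.processorType, fake_env.gpuId)
```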

  • The following metrics will be reported during the training process:

    • Epoch—The epoch number with which the result is associated
    • Training Loss—The result of the entropy loss function that was averaged for the training data
    • Validation Loss—The result of the entropy loss function that was determined when applying the model trained in the epoch on the validation data
    • Average Precision—The ratio of points in the validation data that were correctly classified by the model trained in the epoch (true positives) over all the points in the validation data

    A model that achieves low training loss but high validation loss is considered to be overfitting the training data: it detects patterns from artifacts in the training data, and those patterns do not carry over to the validation data. A model that achieves high training loss and high validation loss is considered to be underfitting the training data: no patterns are learned effectively enough to produce a usable model.

    Learn more about assessing point cloud training results
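The overfitting and underfitting guidance above can be turned into a simple check on one epoch's losses. This is only a sketch: the function name and both thresholds (high_loss and gap) are arbitrary assumptions chosen for illustration, since the tool does not define numeric cutoffs.

```python
def diagnose_fit(training_loss, validation_loss, high_loss=1.0, gap=0.5):
    """Label an epoch as 'underfitting', 'overfitting', or 'ok'.

    high_loss and gap are illustrative thresholds, not values used by the tool.
    """
    # Both losses high: no useful patterns were learned.
    if training_loss >= high_loss and validation_loss >= high_loss:
        return "underfitting"
    # Validation loss well above training loss: patterns do not generalize.
    if validation_loss - training_loss >= gap:
        return "overfitting"
    return "ok"


print(diagnose_fit(0.1, 0.9))
print(diagnose_fit(1.5, 1.6))
print(diagnose_fit(0.2, 0.25))
```

Suitable thresholds depend on the data and the loss scale, so treat the result as a prompt to inspect the epoch statistics rather than a verdict.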

  • A folder is created to store the checkpoint models, which are models that are created at the end of each epoch. The checkpoints folder name begins with the same name as the model and ends with a suffix of .checkpoints. It is stored in the folder specified by the Output Model Location parameter.
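Given the naming rule above, a checkpoints folder could be located with a pattern match on the output location. The sketch below assumes only what is stated: the folder name begins with the model name and ends with .checkpoints. The helper name is hypothetical, and a temporary directory stands in for a real output location.

```python
from pathlib import Path
import tempfile


def find_checkpoint_folders(out_model_location, out_model_name):
    """Return folders named '<model name>...checkpoints' in the output location."""
    root = Path(out_model_location)
    pattern = f"{out_model_name}*.checkpoints"
    return sorted(p for p in root.glob(pattern) if p.is_dir())


# Simulate an output location that holds a model folder and its checkpoints.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Cars").mkdir()
    (Path(tmp) / "Cars.checkpoints").mkdir()
    found = [p.name for p in find_checkpoint_folders(tmp, "Cars")]

print(found)
```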

Parameters

Label | Explanation | Data Type
Input Training Data

The point cloud object detection training data (*.pcotd file) that will be used to train the model.

File
Output Model Location

An existing folder that will store the new directory containing the deep learning model.

Folder
Output Model Name

The name of the output Esri model definition file (*.emd), deep learning package (*.dlpk), and the directory that will be created to store them.

String
Pre-trained Model Definition File
(Optional)

The pretrained object detection model that will be refined. When a pretrained model is provided, the input training data must have the same attributes and maximum number of points that were used by the training data that generated the model.

File
Architecture
(Optional)

Specifies the architecture that will be used to train the model.

  • Sparsely Embedded Convolutional Detection—The Sparsely Embedded CONvolutional Detection (SECOND) architecture will be used. This is the default.
String
Attribute Selection
(Optional)

Specifies the point attributes that will be used with the classification code when training the model. Only the attributes that are present in the point cloud training data will be available. No additional attributes are included by default.

  • Intensity—The measure of the magnitude of the lidar pulse return will be used.
  • Return Number—The ordinal position of the point obtained from a given lidar pulse will be used.
  • Number of Returns—The total number of lidar returns that were identified as points from the pulse associated with a given point will be used.
  • Red Band—The red band's value from a point cloud with color information will be used.
  • Green Band—The green band's value from a point cloud with color information will be used.
  • Blue Band—The blue band's value from a point cloud with color information will be used.
  • Near Infrared Band—The near infrared band's value from a point cloud with near infrared information will be used.
  • Relative Height—The relative height of each point in relation to a reference surface, which would typically be a bare earth DEM, will be used.
String
Minimum Points Per Block
(Optional)

The minimum number of points that must be present in a given block for it to be used when training the model. The default is 0.

Long
Remap Object Codes
(Optional)

Defines how object codes will be remapped to new values before training the deep learning model.

  • Current Code—The object code value in the training data
  • Remapped Code—The object code value that the existing code will be changed to

Value Table
Object Codes of Interest
(Optional)

The object codes that will be used to filter the objects in the training data. When object codes are provided, the objects that are not included will be ignored.

Long
Only train blocks that contain objects
(Optional)

Specifies whether the model will be trained using only blocks that contain objects or all blocks, including those that do not contain objects.

  • Checked—The model will be trained using only blocks that contain objects. The data used for validation will not be modified.
  • Unchecked—The model will be trained using all blocks, including those that do not contain objects. This is the default.
Boolean
Object Descriptions
(Optional)

The descriptions for each object code in the training data.

  • Object Code—The object code value that was learned by the model
  • Description—The object described by the object code

Value Table
Model Selection Criteria
(Optional)

Specifies the statistical basis that will be used to determine the final model.

  • Validation Loss—The model that achieves the lowest result when the entropy loss function is applied to the validation data will be used.
  • Average Precision—The model that achieves the highest ratio of points in the validation data that were correctly classified by the model trained in the epoch (true positives) over all the points in the validation data will be used. This is the default.
String
Maximum Number of Epochs
(Optional)

The number of times each block of data will be passed forward and backward through the neural network. The default is 25.

Long
Learning Rate Strategy
(Optional)

Specifies how the learning rate will be modified during training.

  • One Cycle Learning Rate—The learning rate will be cycled throughout each epoch using Fast.AI's implementation of the 1cycle technique for training neural networks to help improve the training of a convolutional neural network. This is the default.
  • Fixed Learning Rate—The same learning rate will be used throughout the training process.
String
Learning Rate
(Optional)

The rate at which existing information will be overwritten with new information. If no value is provided, the optimal learning rate will be extracted from the learning curve during the training process. This is the default.

Double
Batch Size
(Optional)

The number of training data blocks that will be processed at any given time. The default is 2.

Long
Stop training when model no longer improves
(Optional)

Specifies whether the model training will stop when the metric specified in the Model Selection Criteria parameter does not register any improvement after five consecutive epochs.

  • Checked—The model training will stop when the model is no longer improving.
  • Unchecked—The model training will continue until the maximum number of epochs has been reached. This is the default.
Boolean
Architecture Settings
(Optional)

The architecture settings that can be modified to improve training results.

  • Option—The architecture-specific options that can be modified.
    • Voxel Width—The x- and y-dimensions of the voxel used during training. The corresponding value is in linear units of meters and can be expressed as a double value.
    • Voxel Height—The z-dimension of the voxel used during training. The corresponding value is in linear units of meters and can be expressed as a double value.
    • Voxel Point Limit—The number of points in a given voxel. The corresponding value must be a positive integer. When no value is provided, this limit is calculated during the training process based on the block size and block point limit of the training data.
    • Maximum Training Voxels—The maximum number of voxels that can be used in the training data. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during training.
    • Maximum Validation Voxels—The maximum number of voxels that can be used in the validation data. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during training.
  • Value—The value that corresponds with the option being modified.

Value Table

Derived Output

Label | Explanation | Data Type
Output Model

The output object detection model that is produced.

File
Output Epoch Statistics

The output ASCII table that contains the epoch statistics that were obtained during the training process.

Text File
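The early stopping rule described for the Stop training when model no longer improves parameter can be illustrated with per-epoch metric values such as those reported in the epoch statistics. This sketch is one interpretation of "no improvement after five consecutive epochs". The function name is hypothetical, the metric values are made up, and the tool's exact counting may differ.

```python
def early_stop_epoch(metric_values, higher_is_better=True, patience=5):
    """Return the epoch index at which training would stop, or None.

    Training stops once `patience` consecutive epochs fail to improve on
    the best value seen so far.
    """
    best = None
    stale = 0  # consecutive epochs without improvement
    for epoch, value in enumerate(metric_values):
        if best is None:
            improved = True
        elif higher_is_better:
            improved = value > best
        else:
            improved = value < best
        if improved:
            best = value
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return None


# Illustrative average precision values, one per epoch.
precision = [0.40, 0.55, 0.60, 0.59, 0.60, 0.58, 0.57, 0.60, 0.61]
print(early_stop_epoch(precision))
```

Use higher_is_better=False when the selection metric is validation loss, where lower values are better.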

arcpy.ddd.TrainPointCloudObjectDetectionModel(in_training_data, out_model_location, out_model_name, {pretrained_model}, {architecture}, {attributes}, {min_points}, {remap_objects}, {target_objects}, {train_blocks}, {object_descriptions}, {model_selection_criteria}, {max_epochs}, {learning_rate_strategy}, {learning_rate}, {batch_size}, {early_stop}, {architecture_settings})
Name | Explanation | Data Type
in_training_data

The point cloud object detection training data (*.pcotd file) that will be used to train the model.

File
out_model_location

An existing folder that will store the new directory containing the deep learning model.

Folder
out_model_name

The name of the output Esri model definition file (*.emd), deep learning package (*.dlpk), and the directory that will be created to store them.

String
pretrained_model
(Optional)

The pretrained object detection model that will be refined. When a pretrained model is provided, the input training data must have the same attributes and maximum number of points that were used by the training data that generated the model.

File
architecture
(Optional)

Specifies the architecture that will be used to train the model.

  • SECD—The Sparsely Embedded CONvolutional Detection (SECOND) architecture will be used. This is the default.
String
attributes
[attributes,...]
(Optional)

Specifies the point attributes that will be used with the classification code when training the model. Only the attributes that are present in the point cloud training data will be available. No additional attributes are included by default.

  • INTENSITY—The measure of the magnitude of the lidar pulse return will be used.
  • RETURN_NUMBER—The ordinal position of the point obtained from a given lidar pulse will be used.
  • NUMBER_OF_RETURNS—The total number of lidar returns that were identified as points from the pulse associated with a given point will be used.
  • RED—The red band's value from a point cloud with color information will be used.
  • GREEN—The green band's value from a point cloud with color information will be used.
  • BLUE—The blue band's value from a point cloud with color information will be used.
  • NEAR_INFRARED—The near infrared band's value from a point cloud with near infrared information will be used.
  • RELATIVE_HEIGHT—The relative height of each point in relation to a reference surface, which would typically be a bare earth DEM, will be used.
String
min_points
(Optional)

The minimum number of points that must be present in a given block for it to be used when training the model. The default is 0.

Long
remap_objects
[remap_objects,...]
(Optional)

Defines how object codes will be remapped to new values before training the deep learning model.

  • Current Code—The object code value in the training data
  • Remapped Code—The object code value that the existing code will be changed to

Value Table
target_objects
[target_objects,...]
(Optional)

The object codes that will be used to filter the objects in the training data. When object codes are provided, the objects that are not included will be ignored.

Long
train_blocks
(Optional)

Specifies whether the model will be trained using only blocks that contain objects or all blocks, including those that do not contain objects.

  • OBJECT_BLOCKS—The model will be trained using only blocks that contain objects. The data used for validation will not be modified.
  • ALL_BLOCKS—The model will be trained using all blocks, including those that do not contain objects. This is the default.
Boolean
object_descriptions
[object_descriptions,...]
(Optional)

The descriptions for each object code in the training data.

  • Object Code—The object code value that was learned by the model
  • Description—The object described by the object code

Value Table
model_selection_criteria
(Optional)

Specifies the statistical basis that will be used to determine the final model.

  • VALIDATION_LOSS—The model that achieves the lowest result when the entropy loss function is applied to the validation data will be used.
  • AVERAGE_PRECISION—The model that achieves the highest ratio of points in the validation data that were correctly classified by the model trained in the epoch (true positives) over all the points in the validation data will be used. This is the default.
String
max_epochs
(Optional)

The number of times each block of data will be passed forward and backward through the neural network. The default is 25.

Long
learning_rate_strategy
(Optional)

Specifies how the learning rate will be modified during training.

  • ONE_CYCLE—The learning rate will be cycled throughout each epoch using Fast.AI's implementation of the 1cycle technique for training neural networks to help improve the training of a convolutional neural network. This is the default.
  • FIXED—The same learning rate will be used throughout the training process.
String
learning_rate
(Optional)

The rate at which existing information will be overwritten with new information. If no value is provided, the optimal learning rate will be extracted from the learning curve during the training process. This is the default.

Double
batch_size
(Optional)

The number of training data blocks that will be processed at any given time. The default is 2.

Long
early_stop
(Optional)

Specifies whether the model training will stop when the metric specified in the model_selection_criteria parameter does not register any improvement after five consecutive epochs.

  • EARLY_STOP—The model training will stop when the model is no longer improving.
  • NO_EARLY_STOP—The model training will continue until the maximum number of epochs has been reached. This is the default.
Boolean
architecture_settings
[architecture_settings,...]
(Optional)

The architecture settings that can be modified to improve training results.

  • Option—The architecture-specific options that can be modified.
    • VOXEL_WIDTH—The x- and y-dimensions of the voxel used during training. The corresponding value is in linear units of meters and can be expressed as a double value.
    • VOXEL_HEIGHT—The z-dimension of the voxel used during training. The corresponding value is in linear units of meters and can be expressed as a double value.
    • VOXEL_POINT_LIMIT—The number of points in a given voxel. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during the training process based on the block size and block point limit of the training data.
    • MAX_TRAINING_VOXELS—The maximum number of voxels that can be used in the training data. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during training.
    • MAX_VALIDATION_VOXELS—The maximum number of voxels that can be used in the validation data. The corresponding value should be a positive integer. When no value is provided, this limit is calculated during training.
  • Value—The value that corresponds with the option being modified.

Value Table
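The architecture_settings value table pairs an option keyword with a value. The sketch below builds such a list and applies the positive-integer constraint stated for the voxel limits. The helper name is hypothetical, the example values are illustrative only, and the check that voxel dimensions are greater than zero is an assumption rather than a documented rule.

```python
def build_architecture_settings(voxel_width=None, voxel_height=None,
                                voxel_point_limit=None,
                                max_training_voxels=None,
                                max_validation_voxels=None):
    """Return [[option, value], ...] for the settings that were provided."""
    rows = []
    # Voxel dimensions are doubles expressed in meters.
    for option, value in (("VOXEL_WIDTH", voxel_width),
                          ("VOXEL_HEIGHT", voxel_height)):
        if value is not None:
            if value <= 0:  # assumption: dimensions must be positive
                raise ValueError(f"{option} must be greater than zero")
            rows.append([option, float(value)])
    # Voxel limits are positive integers.
    for option, value in (("VOXEL_POINT_LIMIT", voxel_point_limit),
                          ("MAX_TRAINING_VOXELS", max_training_voxels),
                          ("MAX_VALIDATION_VOXELS", max_validation_voxels)):
        if value is not None:
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{option} must be a positive integer")
            rows.append([option, value])
    return rows


# Values below are placeholders, not recommended settings.
settings = build_architecture_settings(voxel_width=0.16, voxel_height=4,
                                       max_training_voxels=20000)
print(settings)
```

Options that are omitted are left out of the list, so the tool can calculate them during training as described above.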

Derived Output

Name | Explanation | Data Type
out_model

The output object detection model that is produced.

File
out_epoch_stats

The output ASCII table that contains the epoch statistics that were obtained during the training process.

Text File
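The two model_selection_criteria keywords can be illustrated on epoch statistics held in memory. This is a sketch only: the layout of the out_epoch_stats file is not described here, so the rows below are made-up dictionaries with hypothetical key names, and select_best_epoch is not part of arcpy.

```python
def select_best_epoch(epoch_stats, criteria="AVERAGE_PRECISION"):
    """Return the row for the epoch preferred by the given criteria."""
    if criteria == "VALIDATION_LOSS":
        # Lowest validation loss wins.
        return min(epoch_stats, key=lambda row: row["validation_loss"])
    if criteria == "AVERAGE_PRECISION":
        # Highest average precision wins.
        return max(epoch_stats, key=lambda row: row["average_precision"])
    raise ValueError(f"Unknown criteria: {criteria!r}")


# Made-up statistics for three epochs.
stats = [
    {"epoch": 0, "validation_loss": 1.10, "average_precision": 0.41},
    {"epoch": 1, "validation_loss": 0.70, "average_precision": 0.63},
    {"epoch": 2, "validation_loss": 0.75, "average_precision": 0.66},
]
print(select_best_epoch(stats)["epoch"])
print(select_best_epoch(stats, "VALIDATION_LOSS")["epoch"])
```

As the example shows, the two criteria can prefer different epochs, which is why the choice affects which checkpoint becomes the final model.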

Code sample

TrainPointCloudObjectDetectionModel example (stand-alone script)

The following sample demonstrates the use of this tool in a stand-alone Python script.

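import arcpy

# Relative inputs such as "Cars.pcotd" are resolved against this workspace.
arcpy.env.workspace = "D:/Deep_Learning_Workspace"

# Train a car detection model with four point attributes, using only blocks
# that contain objects, for a maximum of 10 epochs.
arcpy.ddd.TrainPointCloudObjectDetectionModel(
    "Cars.pcotd", "D:/DL_Models", "Cars",
    attributes=["INTENSITY", "RETURN_NUMBER", "NUMBER_OF_RETURNS", "RELATIVE_HEIGHT"],
    object_descriptions=[[31, "Cars"]], train_blocks="OBJECT_BLOCKS",
    model_selection_criteria="AVERAGE_PRECISION", max_epochs=10)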
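A second, hedged variation shows how the optional training controls might be combined by keyword. The parameter names come from the syntax above, but every value (paths, model name, learning rate, batch size) is an illustrative assumption, and the helper functions are hypothetical. arcpy is imported inside run_training so the argument dictionary can be built and inspected on a machine without ArcGIS Pro.

```python
def training_arguments():
    """Collect keyword arguments for the tool; all values are illustrative."""
    return {
        "in_training_data": "Cars.pcotd",
        "out_model_location": "D:/DL_Models",
        "out_model_name": "Cars_FixedRate",
        "attributes": ["INTENSITY", "RELATIVE_HEIGHT"],
        "target_objects": [31],
        "model_selection_criteria": "VALIDATION_LOSS",
        "max_epochs": 50,
        "learning_rate_strategy": "FIXED",
        "learning_rate": 0.001,
        "batch_size": 4,
        "early_stop": "EARLY_STOP",
    }


def run_training(kwargs):
    """Run the tool; requires ArcGIS Pro with 3D Analyst and a CUDA-capable GPU."""
    import arcpy  # imported lazily so the sketch loads without ArcGIS Pro
    return arcpy.ddd.TrainPointCloudObjectDetectionModel(**kwargs)


kwargs = training_arguments()
print(sorted(kwargs))
```

Calling run_training(kwargs) would start training. With early_stop set to EARLY_STOP, the run may finish before max_epochs is reached.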

Environments

Licensing information

  • Basic: Requires 3D Analyst
  • Standard: Requires 3D Analyst
  • Advanced: Requires 3D Analyst