Classify a point cloud with deep learning

You can use deep learning to classify LAS format point clouds and identify many kinds of features. The approach doesn't use predefined rules to detect specific things like buildings or ground. Rather, you provide examples of the features of interest, and those examples are used to train a neural network that can then recognize and classify the same kinds of features in other data.

You can use deep learning models created elsewhere or create your own. Most users will likely opt for models made by expert data scientists, since creating one takes considerable time and effort. Check ArcGIS Living Atlas to see whether a model appropriate for your project is already available. If not, consider creating your own; see Train a deep learning model for point cloud classification for more information. You can use the Evaluate Point Cloud Classification Model tool to see, from a statistical perspective, how well a trained model will perform on your specific data.

Whether you're using someone else's deep learning model or your own, make sure the data you intend to classify is similar to the data used to train the model. Ideally, it comes from the same data collection project. If not, it should at least share key characteristics. For example, a model trained with airborne lidar is appropriate for classifying other airborne lidar, not photogrammetric or SfM-derived point clouds. The nominal point spacing should be similar, and if other attributes, such as intensity or return number, were used in training, those should be similar as well.
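As a rough illustration of the point spacing guideline, a simple ratio check can flag data that is too different from the training data. This is a hypothetical sketch, not a feature of the software, and the tolerance value is an assumption:

```python
# Hypothetical compatibility check: compare the nominal point spacing of the
# training data and the target data. The tolerance is an assumed rule of thumb,
# not a value used by the classification tool.

def spacing_compatible(train_spacing_m, target_spacing_m, tolerance=2.0):
    """Return True if the two nominal point spacings are within a factor
    of `tolerance` of each other."""
    ratio = max(train_spacing_m, target_spacing_m) / min(train_spacing_m, target_spacing_m)
    return ratio <= tolerance

print(spacing_compatible(0.5, 0.6))  # → True (similar spacing)
print(spacing_compatible(0.5, 2.5))  # → False (5x coarser than training data)
```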

Using the Classify Point Cloud Using Trained Model tool

The Classify Point Cloud Using Trained Model geoprocessing tool takes a LAS dataset and a deep learning model as input. The LAS dataset references one or more LAS files, and it is those files that the tool edits. The model can be either an Esri Model Definition file (*.emd) or a Deep Learning Package (*.dlpk). Both are output from the training tool. The difference is that *.dlpk files are self-contained and can be published and shared online, whereas *.emd files reference other data, specifically *.pth files, which must be present for the model to work.
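The tool can also be scripted with arcpy, ArcGIS Pro's Python site package. The sketch below shows the general shape of such a call; the parameter names are taken as assumptions and should be checked against the tool's reference page, and the guard around the import is only there so the sketch degrades gracefully outside an ArcGIS Pro environment:

```python
# Sketch of scripting the tool with arcpy (ArcGIS Pro's Python site package).
# Parameter names below are illustrative assumptions; consult the tool's
# reference page for the exact signature.
try:
    import arcpy  # available only inside an ArcGIS Pro Python environment
except ImportError:
    arcpy = None

def classify_with_model(lasd_path, model_path, classes_of_interest):
    """Run Classify Point Cloud Using Trained Model on a LAS dataset."""
    if arcpy is None:
        raise RuntimeError("arcpy is required; run inside ArcGIS Pro")
    arcpy.CheckOutExtension("3D")
    arcpy.ddd.ClassifyPointCloudUsingTrainedModel(
        in_point_cloud=lasd_path,           # LAS dataset referencing the LAS files
        in_trained_model=model_path,        # *.emd or *.dlpk
        output_classes=classes_of_interest  # e.g. ["6"] for buildings
    )
```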

Once the model is added as input to the tool, the list of classes it was trained to classify will be shown on the tool dialog box. By default, all classes are selected. You can uncheck any that are not of interest.

Another parameter, Existing Class Code Handling, controls which points in the target LAS point cloud are allowed to be modified. By default, all points are editable. Alternatively, you can specify that only points with certain class codes may be changed; all others remain intact regardless of what the deep learning model predicts them to be. You can also choose the inverse, if that's more convenient, and state that points with certain codes are not allowed to be changed. For example, if the target point cloud was already classified for ground and you want those points left as-is, opt to preserve points with class code 2, which represents ground.
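The logic behind these three options can be illustrated in plain Python. This is a conceptual sketch of the behavior described above, not the tool's actual implementation, and the mode names here are illustrative labels:

```python
# Plain-Python illustration (not the tool's implementation) of how
# Existing Class Code Handling limits which points a model may relabel.
# The mode names are illustrative labels for the three behaviors.

def apply_predictions(current_codes, predicted_codes, mode="EDIT_ALL", codes=()):
    """Return updated class codes.

    mode: "EDIT_ALL"          - every point may be relabeled (the default)
          "EDIT_SELECTED"     - only points whose current code is in `codes` change
          "PRESERVE_SELECTED" - points whose current code is in `codes` stay intact
    """
    updated = []
    for cur, pred in zip(current_codes, predicted_codes):
        if mode == "EDIT_SELECTED" and cur not in codes:
            updated.append(cur)   # not editable; keep existing code
        elif mode == "PRESERVE_SELECTED" and cur in codes:
            updated.append(cur)   # explicitly preserved (e.g. ground = 2)
        else:
            updated.append(pred)  # model's prediction is applied
    return updated

# Keep existing ground points (class 2) intact while accepting other predictions
print(apply_predictions([2, 1, 1, 2], [6, 6, 2, 5],
                        mode="PRESERVE_SELECTED", codes={2}))  # → [2, 6, 2, 2]
```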

The Batch Size parameter influences the performance of the classification process. It represents the number of data blocks given to the GPU at a time. The higher the value, the faster the process, because the GPU operates on the blocks in parallel. The cost is memory: you can only process as many blocks as available GPU memory permits. By default, when the batch size is unspecified, the tool attempts to find a reasonable value on its own and writes the value it uses to the output messages. A larger value may be possible, so you can override the default by specifying one. During a test run, monitor the GPU's memory use; if a large amount remains available during classification, you can safely increase the batch size to process more blocks at a time.
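The trade-off above can be sketched as simple arithmetic: divide the free GPU memory, minus some headroom, by the approximate memory footprint of one block. The per-block figure and headroom below are assumed values for illustration, not numbers reported by the tool:

```python
# Rough, hypothetical arithmetic for choosing a batch size: divide free GPU
# memory (leaving some headroom) by the approximate memory cost of one block.
# Both figures are assumptions for illustration, not values from the tool.

def estimate_batch_size(free_gpu_mb, per_block_mb, headroom=0.2):
    """Number of blocks that fit in free GPU memory after reserving a margin."""
    usable_mb = free_gpu_mb * (1.0 - headroom)
    return max(1, int(usable_mb // per_block_mb))

# e.g. 8 GB free, ~600 MB per block, 20% headroom
print(estimate_batch_size(8192, 600))  # → 10
```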

Related topics