Label | Explanation | Data Type |
Input Raster | The raster dataset to classify. You can use any Esri-supported raster dataset. One option is a 3-band, 8-bit segmented raster dataset in which all the pixels in the same segment have the same color. The input can also be a single band, 8-bit, grayscale segmented raster. | Raster Layer; Mosaic Layer; Image Service; String |
Input Training Sample File | The training sample file or layer that delineates the training sites. These can be either shapefiles or feature classes that contain the training samples. The following field names are required in the training sample file:
| Feature Layer |
Output Classifier Definition File | A JSON file that contains attribute information, statistics, or other information for the classifier. An .ecd file is created. | File |
Additional Input Raster (Optional) | Ancillary raster datasets, such as a multispectral image or a DEM, will be incorporated to generate attributes and other required information for classification. This parameter is optional. | Raster Layer; Mosaic Layer; Image Service; String |
Max Number of Trees (Optional) | The maximum number of trees in the forest. Increasing the number of trees generally improves accuracy, although the improvement eventually levels off. Processing time increases linearly with the number of trees. | Long |
Max Tree Depth (Optional) | The maximum depth of each tree in the forest. Depth is the number of rules each tree is allowed to create to reach a decision. Trees will not grow any deeper than this setting. | Long |
Max Number of Samples Per Class (Optional) | The maximum number of samples that will be used to define each class. The default value of 1000 is recommended when the inputs are nonsegmented rasters. A value that is less than or equal to 0 means that the system will use all the samples from the training sites to train the classifier. | Long |
Segment Attributes (Optional) | Specifies the attributes that will be included in the attribute table associated with the output raster. This parameter is only active if the Segmented key property is set to true on the input raster. If the only input to the tool is a segmented image, the default attributes are Converged color, Count of pixels, Compactness, and Rectangularity. If an Additional Input Raster value is included as an input with a segmented image, Mean digital number and Standard deviation are also available attributes. | String |
Dimension Value Field (Optional) | Contains dimension values in the input training sample feature class. This parameter is required to classify a time series of raster data using the change analysis raster output from the Analyze Changes Using CCDC tool. | Field |
Available with Spatial Analyst license.
Available with Image Analyst license.
Summary
Generates an Esri classifier definition file (.ecd) using the Random Trees classification method.
The random trees classifier is an image classification technique that is resistant to overfitting and can work with segmented images and other ancillary raster datasets. For standard image inputs, the tool accepts multiband imagery with any bit depth, and it performs the Random Trees classification on a pixel or segment basis, depending on the input training feature file.
Usage
The Random Trees classification method is a supervised machine-learning classifier that constructs a collection of individual decision trees, each generated from different samples and subsets of the training data, with random subsets of variables chosen for each tree. The most frequent tree output is used as the overall classification.
These are called decision trees because, for every pixel that is classified, a number of decisions are made in rank order of importance. When you graph these decisions for a pixel, the result looks like a branch. When you classify the entire dataset, the branches form a tree. The method is called random trees because the dataset is classified a number of times, each time based on a random subselection of training pixels, resulting in many decision trees. To make a final decision, each tree has a vote.
This voting process corrects for the propensity of individual decision trees to overfit their training sample data. A number of trees are grown (by analogy, a forest), and variation among the trees is introduced by projecting the training data into a randomly chosen subspace before fitting each tree. The decision at each node is optimized by a randomized procedure.
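The voting idea described above can be illustrated with a small, self-contained sketch. This is not the tool's implementation: it is a toy forest of one-rule trees (stumps) in plain Python, and the function names, the two-band pixel values, and the class labels are all made up for illustration. Each tree is trained on a bootstrap resample of the training pixels and on one randomly chosen band, and the final class is the most frequent vote.

```python
import random
from collections import Counter


def majority(labels):
    """Return the most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(samples, band):
    """Fit a depth-1 tree: one threshold on one band, chosen to minimize errors."""
    values = sorted({pixel[band] for pixel, _ in samples})
    overall = majority([label for _, label in samples])
    # Fallback when the resample has a single distinct value: always predict overall.
    best = (len(samples) + 1, values[0], overall, overall)
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [label for pixel, label in samples if pixel[band] < threshold]
        right = [label for pixel, label in samples if pixel[band] >= threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(l != left_label for l in left)
                  + sum(l != right_label for l in right))
        if errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return band, best[1], best[2], best[3]


def train_forest(samples, num_trees, seed=0):
    """Grow num_trees stumps, each on a bootstrap resample and a random band."""
    rng = random.Random(seed)
    num_bands = len(samples[0][0])
    forest = []
    for _ in range(num_trees):
        bootstrap = [rng.choice(samples) for _ in samples]
        band = rng.randrange(num_bands)
        forest.append(train_stump(bootstrap, band))
    return forest


def classify(forest, pixel):
    """Let every tree vote and return the most frequent class."""
    votes = [left if pixel[band] < threshold else right
             for band, threshold, left, right in forest]
    return majority(votes)


# Hypothetical two-band training pixels: (band values, class label).
training = [
    ((10, 20), "water"), ((12, 22), "water"), ((15, 18), "water"),
    ((11, 25), "water"), ((14, 21), "water"),
    ((80, 90), "forest"), ((85, 95), "forest"), ((78, 88), "forest"),
    ((90, 85), "forest"), ((82, 92), "forest"),
]
forest = train_forest(training, num_trees=25)
print(classify(forest, (13, 20)))
print(classify(forest, (84, 90)))
```

A single stump trained on an unlucky resample can misclassify a pixel, but the vote across many trees smooths out those individual mistakes, which is the mechanism the description above credits with mitigating overfitting.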
For segmented rasters that have their key property set to Segmented, the tool computes the index image and associated segment attributes from the RGB segmented raster. The attributes are computed to generate the classifier definition file to be used in a separate classification tool. The attributes for each segment can be computed from any Esri-supported image.
Any Esri-supported raster is accepted as input, including raster products, segmented rasters, mosaics, image services, and generic raster datasets. Segmented rasters must be 8-bit rasters with 3 bands.
To create the training sample file, use the Training Samples Manager pane from the Classification Tools drop-down menu.
The Segment Attributes parameter is only active if one of the raster layer inputs is a segmented image.
To classify time series raster data using the Continuous Change Detection and Classification (CCDC) algorithm, first run the Analyze Changes Using CCDC tool and use the output change analysis raster as the input raster for this training tool.
The training sample data must have been collected at multiple times using the Training Samples Manager. The dimension value for each sample is listed in a field in the training sample feature class, which is specified in the Dimension Value Field parameter.
Parameters
TrainRandomTreesClassifier(in_raster, in_training_features, out_classifier_definition, {in_additional_raster}, {max_num_trees}, {max_tree_depth}, {max_samples_per_class}, {used_attributes}, {dimension_value_field})
Name | Explanation | Data Type |
in_raster | The raster dataset to classify. You can use any Esri-supported raster dataset. One option is a 3-band, 8-bit segmented raster dataset in which all the pixels in the same segment have the same color. The input can also be a single band, 8-bit, grayscale segmented raster. | Raster Layer; Mosaic Layer; Image Service; String |
in_training_features | The training sample file or layer that delineates the training sites. These can be either shapefiles or feature classes that contain the training samples. The following field names are required in the training sample file:
| Feature Layer |
out_classifier_definition | A JSON file that contains attribute information, statistics, or other information for the classifier. An .ecd file is created. | File |
in_additional_raster (Optional) | Ancillary raster datasets, such as a multispectral image or a DEM, will be incorporated to generate attributes and other required information for classification. This parameter is optional. | Raster Layer; Mosaic Layer; Image Service; String |
max_num_trees (Optional) | The maximum number of trees in the forest. Increasing the number of trees generally improves accuracy, although the improvement eventually levels off. Processing time increases linearly with the number of trees. | Long |
max_tree_depth (Optional) | The maximum depth of each tree in the forest. Depth is the number of rules each tree is allowed to create to reach a decision. Trees will not grow any deeper than this setting. | Long |
max_samples_per_class (Optional) | The maximum number of samples that will be used to define each class. The default value of 1000 is recommended when the inputs are nonsegmented rasters. A value that is less than or equal to 0 means that the system will use all the samples from the training sites to train the classifier. | Long |
used_attributes [used_attributes;used_attributes,...] (Optional) | Specifies the attributes that will be included in the attribute table associated with the output raster. This parameter is only enabled if the Segmented key property is set to true on the input raster. If the only input to the tool is a segmented image, the default attributes are COLOR, COUNT, COMPACTNESS, and RECTANGULARITY. If an in_additional_raster value is included as an input with a segmented image, MEAN and STD are also available attributes. | String |
dimension_value_field (Optional) | Contains dimension values in the input training sample feature class. This parameter is required to classify a time series of raster data using the change analysis raster output from the Analyze Changes Using CCDC tool. | Field |
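The max_samples_per_class rule in the table above (a positive value caps the samples per class, and a value less than or equal to 0 uses every sample) can be sketched in plain Python. The helper name `cap_samples_per_class` and the sample data are hypothetical, and the random subsampling is an assumption of this sketch: the description does not say how the tool chooses which samples to keep.

```python
import random
from collections import defaultdict


def cap_samples_per_class(samples, max_samples_per_class=1000, seed=0):
    """Group (value, label) samples by label and cap each group.

    A cap <= 0 keeps every sample, mirroring the parameter description.
    Random subsampling is an assumption made for this illustration.
    """
    by_class = defaultdict(list)
    for value, label in samples:
        by_class[label].append((value, label))
    if max_samples_per_class <= 0:
        return dict(by_class)
    rng = random.Random(seed)
    capped = {}
    for label, items in by_class.items():
        if len(items) > max_samples_per_class:
            items = rng.sample(items, max_samples_per_class)
        capped[label] = items
    return capped


# Hypothetical training samples: five "water" and two "forest".
samples = [(i, "water") for i in range(5)] + [(i, "forest") for i in range(2)]
capped = cap_samples_per_class(samples, max_samples_per_class=3)
everything = cap_samples_per_class(samples, max_samples_per_class=0)
print({label: len(items) for label, items in capped.items()})
print({label: len(items) for label, items in everything.items()})
```

Classes that already have fewer samples than the cap are left untouched, so a cap only affects over-represented classes.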
Code sample
This is a Python sample for the TrainRandomTreesClassifier tool.
import arcpy
from arcpy.ia import *
# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")
TrainRandomTreesClassifier("c:/test/moncton_seg.tif",
                           "c:/test/train.gdb/train_features",
                           "c:/output/moncton_sig_SVM.ecd",
                           "c:/test/moncton.tif", "50", "30", "1000",
                           "COLOR;MEAN;STD;COUNT;COMPACTNESS;RECTANGULARITY")
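The used_attributes argument in the sample above is a semicolon-delimited string. The following sketch shows one way such a string could be assembled and checked before calling the tool. The helper `build_used_attributes` is hypothetical and not part of arcpy, and treating MEAN and STD as invalid without an additional raster is this sketch's reading of the parameter description, not documented tool behavior.

```python
# Keywords taken from the used_attributes parameter description.
BASE_ATTRIBUTES = ("COLOR", "COUNT", "COMPACTNESS", "RECTANGULARITY")
ADDITIONAL_RASTER_ATTRIBUTES = ("MEAN", "STD")


def build_used_attributes(attributes, has_additional_raster):
    """Join attribute keywords with semicolons after a basic sanity check."""
    allowed = set(BASE_ATTRIBUTES)
    if has_additional_raster:
        allowed.update(ADDITIONAL_RASTER_ATTRIBUTES)
    cleaned = [name.strip().upper() for name in attributes]
    unknown = [name for name in cleaned if name not in allowed]
    if unknown:
        raise ValueError("Unsupported attributes: " + ", ".join(unknown))
    return ";".join(cleaned)


print(build_used_attributes(["COLOR", "MEAN", "STD", "COUNT"], True))
```

The returned string could then be passed as the used_attributes argument in place of a hand-typed literal.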
This is a Python script sample for the TrainRandomTreesClassifier tool.
# Import system modules
import arcpy
from arcpy.ia import *
# Set local variables
inSegRaster = "c:/test/cities_seg.tif"
train_features = "c:/test/train.gdb/train_features"
out_definition = "c:/output/cities_sig.ecd"
in_additional_raster = "c:/cities.tif"
maxNumTrees = "50"
maxTreeDepth = "30"
maxSampleClass = "1000"
attributes = "COLOR;MEAN;STD;COUNT;COMPACTNESS;RECTANGULARITY"
# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")
# Execute
TrainRandomTreesClassifier(inSegRaster, train_features,
                           out_definition, in_additional_raster, maxNumTrees,
                           maxTreeDepth, maxSampleClass, attributes)
This example shows how to train a random trees classifier using a change analysis raster from the Analyze Changes Using CCDC tool.
# Import system modules
import arcpy
from arcpy.ia import *
# Check out the ArcGIS Image Analyst extension license
arcpy.CheckOutExtension("ImageAnalyst")
# Define input parameters
in_changeAnalysisRaster = "c:/test/LandsatCCDC.crf"
train_features = "c:/test/train.gdb/train_features"
out_definition = "c:/output/change_detection.ecd"
in_additional_raster = ""
maxNumTrees = 50
maxTreeDepth = 30
maxSampleClass = 1000
attributes = None
dimension_field = "DateTime"
# Execute
arcpy.ia.TrainRandomTreesClassifier(
    in_changeAnalysisRaster, train_features, out_definition,
    in_additional_raster, maxNumTrees, maxTreeDepth, maxSampleClass,
    attributes, dimension_field)
Environments
Licensing information
- Basic: Requires Image Analyst or Spatial Analyst
- Standard: Requires Image Analyst or Spatial Analyst
- Advanced: Requires Image Analyst or Spatial Analyst