Available with Image Analyst license.
Available with Spatial Analyst license.
Classify an image
The Classify tool allows you to choose from either unsupervised or supervised classification techniques to classify pixels or objects in a raster dataset. To display the Classify tool, select the raster that is to be classified in the Contents pane, then on the Imagery tab, click the Classification Tools drop-down arrow.
For supervised classification, you need to provide a training samples file. You can create training samples using the Training Samples Manager in the Classification Tools drop-down list, or you can provide an existing training samples file. This can be either a shapefile or feature class that contains your training samples. The following field names are required in the training sample file:
- classname—A text field indicating the name of the class category
- classvalue—A long integer field containing the integer value for each class category
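As a quick pre-check before running a supervised classifier, the two required fields can be verified programmatically. The sketch below is illustrative only: it works on a plain list of (name, type) pairs rather than on a real shapefile or feature class, and the function name `missing_training_fields`, the type labels, and the case-insensitive name comparison are assumptions, not part of the software.

```python
# Required training sample fields, taken from the list above.
REQUIRED_FIELDS = {"classname": "text", "classvalue": "long"}


def missing_training_fields(fields):
    """Return a sorted list of problems found in a field listing.

    `fields` is a hypothetical list of (field_name, field_type) pairs,
    for example as read from a shapefile or feature class schema.
    """
    # Compare names and types case-insensitively (an assumption of this sketch).
    found = {name.lower(): ftype.lower() for name, ftype in fields}
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in found:
            problems.append(f"missing field: {name}")
        elif found[name] != expected_type:
            problems.append(f"{name} should be {expected_type}, found {found[name]}")
    return sorted(problems)
```

An empty list means both fields are present with the expected types.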
For object-based image analysis, you need to provide a segmented image. You can create a segmented image using the Segmentation tool in the Classification Tools drop-down list.
There are five classifiers available to classify your data:
- ISO Cluster—The ISO Cluster classifier performs an unsupervised classification. This classifier can process very large segmented images, whose attribute table can be large. Also, the tool can accept a segmented RGB raster from a third-party application. The tool works on standard Esri-supported raster files, as well as segmented raster datasets.
- Maximum Likelihood—The maximum likelihood classifier is a traditional technique for image classification. It is based on two principles: the assumption that the pixels in each class sample are normally distributed in multidimensional space, and Bayes' theorem of decision-making.
- Random Trees—The random trees classifier is a powerful technique for image classification that is resistant to overfitting, and can work with segmented images and other ancillary raster datasets. For standard image inputs, the tool accepts multiband imagery with any bit depth, and it will perform the random trees classification on a pixel basis, based on the input training feature file.
- Support Vector Machine (SVM)—The SVM classifier provides a powerful, supervised classification method that can process a segmented raster input or a standard image. It is less susceptible to noise, correlated bands, and an unbalanced number or size of training sites within each class. This is a classification method that is widely used among researchers.
- K-Nearest Neighbor—The K-Nearest Neighbor classifier is a nonparametric classification method that classifies a pixel or segment by a plurality vote of its neighbors. K is the defined number of neighbors used in voting.
The classifiers available in the Classify tool are described below.
ISO Cluster
Perform an unsupervised classification using the ISO Cluster algorithm, which determines the characteristics of the natural groupings of cells in a multidimensional attribute space.
Parameter name | Description |
---|---|
Maximum Number of Classes | Maximum number of desired classes to group pixels or segments. This should be set based on the number of classes in your legend. It is possible that you will get fewer classes than what you specified for this parameter. If you need more, increase this value and aggregate classes after the training process is complete. |
Maximum Number of Iterations | The maximum number of iterations for the clustering process to run. The recommended range is between 10 and 20 iterations. Increasing this value will linearly increase the processing time. |
Maximum Number of Cluster Merges per Iteration | The maximum number of times that a cluster can be merged. Increasing the number of merges will reduce the number of classes that are created. A lower value will result in more classes. |
Maximum Merge Distance | Increasing the distance will allow more clusters to merge, resulting in fewer classes. A lower value will result in more classes. The distance is spectral in nature and is based on RGB color. For example, a pixel with an RGB value of 100, 100, 100 is at a distance of 50 from a pixel with an RGB value of 100, 130, 120. Although you can set this to any value, values from 0 to 5 tend to give the best results. |
Minimum Samples Per Cluster | The minimum number of pixels or segments in a valid cluster or class. The default value of 20 has been shown to be effective in creating statistically significant classes. You can increase this number to have more robust classes; however, it may limit the overall number of classes that are created. |
Skip Factor | Number of pixels to skip for a pixel image input. If a segmented image is an input, specify the number of segments to skip. |
Segmented Image | Optionally incorporate a segmented image to perform object-based classification. |
Segment Attributes | When you use a segmented image, you can choose which attributes to use from the segmented image. |
Output Classified Dataset | Choose the name and output location for your classified output. |
Output Classified Definition File | This is a JSON file that contains attribute information, statistics, hyperplane vectors and other information needed for the classifier. A file with an .ecd extension is created. |
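The worked example in the Maximum Merge Distance row is consistent with summing the absolute per-band differences (0 + 30 + 20 = 50). The sketch below reproduces that arithmetic. Whether the tool uses exactly this metric internally is an assumption, and `band_distance` is a hypothetical helper introduced only for illustration.

```python
def band_distance(pixel_a, pixel_b):
    """Sum of absolute per-band differences between two pixels."""
    if len(pixel_a) != len(pixel_b):
        raise ValueError("pixels must have the same number of bands")
    return sum(abs(a - b) for a, b in zip(pixel_a, pixel_b))


# The two RGB values from the table's example.
print(band_distance((100, 100, 100), (100, 130, 120)))  # prints 50
```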
Maximum Likelihood
Perform a maximum likelihood classification, which is based on two principles: the assumption that the pixels in each class sample are normally distributed in multidimensional space, and Bayes' theorem of decision-making.
Parameter name | Description |
---|---|
Training Samples | Select the training sample file or layer that delineates your training sites. These can be either shapefiles or feature classes that contain your training samples. |
Segmented Image | Optionally incorporate a segmented image to perform object-based classification. |
Segment Attributes | When you use a segmented image, you can choose which attributes to use from the segmented image. |
Output Classified Dataset | Choose the name and output location for your classified output. |
Output Classified Definition File | This is a JSON file that contains attribute information, statistics, hyperplane vectors and other information needed for the classifier. A file with an .ecd extension is created. |
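The two principles behind maximum likelihood classification (normally distributed class samples and Bayes' theorem) can be illustrated with a small sketch. This is a simplified version, not the tool's implementation: it assumes the bands are independent (a diagonal covariance), whereas a full maximum likelihood classifier typically uses the complete covariance matrix, and it takes class priors from the sample proportions. The function names are hypothetical.

```python
import math
from statistics import mean, pvariance


def train_gaussian_classes(samples):
    """Estimate per-band mean and variance plus a prior for each class.

    `samples` maps a class name to a list of pixel tuples (one value per band).
    """
    total = sum(len(pixels) for pixels in samples.values())
    model = {}
    for name, pixels in samples.items():
        bands = list(zip(*pixels))
        model[name] = {
            "mean": [mean(b) for b in bands],
            # A small floor keeps the variance strictly positive.
            "var": [max(pvariance(b), 1e-6) for b in bands],
            "prior": len(pixels) / total,
        }
    return model


def classify_pixel(model, pixel):
    """Return the class with the highest log posterior for one pixel."""
    def log_posterior(stats):
        # Bayes' theorem in log form: log prior + log Gaussian likelihood.
        score = math.log(stats["prior"])
        for x, mu, var in zip(pixel, stats["mean"], stats["var"]):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        return score

    return max(model, key=lambda name: log_posterior(model[name]))
```

Each pixel is assigned to the class whose fitted distribution, weighted by its prior, makes the observed band values most probable.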
Random Trees
Perform a random trees classification, which uses multiple decision trees that are trained using small variations of the same training data. When classifying a sample, the majority vote of these trained trees decides on the output class. This set of trees is less vulnerable to overfitting than a single tree.
Parameter name | Description |
---|---|
Training Samples | Select the training sample file or layer that delineates your training sites. These can be either shapefiles or feature classes that contain your training samples. |
Maximum Number of Trees | The maximum number of trees in the forest. Increasing the number of trees will lead to higher accuracy rates, although this improvement will level off eventually. The number of trees increases the processing time linearly. |
Maximum Tree Depth | The maximum depth of each tree in the forest. Depth is another way of saying the number of rules each tree is allowed to create to come to a decision. Trees will not grow any deeper than this setting. |
Maximum Number of Samples per Class | The maximum number of samples to use for defining each class. The default value of 1000 is recommended when the inputs are nonsegmented rasters. A value that is less than or equal to 0 means that the system will use all the samples from the training sites to train the classifier. |
Segmented Image | Optionally incorporate a segmented image to perform object-based classification. |
Segment Attributes | When you use a segmented image, you can choose which attributes to use from the segmented image. |
Output Classified Dataset | Choose the name and output location for your classified output. |
Output Classified Definition File | This is a JSON file that contains attribute information, statistics, hyperplane vectors and other information needed for the classifier. A file with an .ecd extension is created. |
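The voting step described above, in which the majority vote of the trained trees decides the output class, can be sketched as follows. Only the vote is shown: the three "trees" are hypothetical hand-written threshold rules on a (red, near-infrared) pair, not trees trained from data, and `forest_predict` is an illustrative name.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return the class chosen by the most trees for one sample."""
    votes = Counter(tree(sample) for tree in trees)
    # most_common(1) returns [(label, count)] for the top-voted label.
    return votes.most_common(1)[0][0]


# Three hypothetical "trees", each a simple threshold rule on (red, nir).
trees = [
    lambda s: "vegetation" if s[1] > 100 else "bare",
    lambda s: "vegetation" if s[1] - s[0] > 30 else "bare",
    lambda s: "vegetation" if s[0] < 50 else "bare",
]
```

For the sample `(60, 120)`, two of the three rules vote for vegetation, so the single dissenting rule does not change the result. This is the sense in which the set of trees is less sensitive to the quirks of any one tree.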
Support Vector Machine
Perform a support vector machine classification, which maps your input data vectors into a higher-dimensional feature space to optimally separate the data into the different classes. Support vector machines can process very large images, and this classification is less susceptible to noise, correlated bands, or an unbalanced number or size of training sites within each class.
Parameter name | Description |
---|---|
Training Samples | Select the training sample file or layer that delineates your training sites. These can be either shapefiles or feature classes that contain your training samples. |
Maximum Number of Samples per Class | The maximum number of samples to use for defining each class. The default value of 500 is recommended when the inputs are nonsegmented rasters. A value that is less than or equal to 0 means that the system will use all the samples from the training sites to train the classifier. |
Segmented Image | Optionally incorporate a segmented image to perform object-based classification. |
Segment Attributes | When you use a segmented image, you can choose which attributes to use from the segmented image. |
Output Classified Dataset | Choose the name and output location for your classified output. |
Output Classified Definition File | This is a JSON file that contains attribute information, statistics, hyperplane vectors and other information needed for the classifier. A file with an .ecd extension is created. |
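The behavior of Maximum Number of Samples per Class, including the rule that a value less than or equal to 0 uses all samples, can be illustrated with a short sketch. How the tool actually selects samples when the cap applies is not stated here, so the random subsampling below is an assumption, and `cap_samples_per_class` is a hypothetical helper.

```python
import random


def cap_samples_per_class(samples, max_per_class, seed=0):
    """Limit each class to at most `max_per_class` samples.

    A value less than or equal to 0 keeps every sample, matching the
    parameter description above. Random subsampling is an assumption.
    """
    rng = random.Random(seed)
    capped = {}
    for name, items in samples.items():
        if max_per_class <= 0 or len(items) <= max_per_class:
            capped[name] = list(items)
        else:
            capped[name] = rng.sample(items, max_per_class)
    return capped
```

A cap like this also evens out class sizes when one class has far more training pixels than the others.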
K-Nearest Neighbor
Perform a K-nearest neighbor classification, a nonparametric method that classifies a pixel or segment by a plurality vote of its neighbors. K is the defined number of neighbors used in voting.
Parameter name | Description |
---|---|
Training Samples | Select the training sample file or layer that delineates your training sites. These can be either shapefiles or feature classes that contain your training samples. |
Dimension Value Field | Contains dimension values in the input training sample feature class. This parameter is required to classify a time series of raster data using the change analysis raster output from the Analyze Changes Using CCDC tool. |
K Nearest Neighbors | The number of neighbors that will be used in searching for each input pixel or segment. Increasing the number of neighbors will decrease the influence of individual neighbors on the outcome of the classification. The default value is 1. |
Maximum Number of Samples per Class | The maximum number of training samples that will be used for each class. The default value of 1000 is recommended when the inputs are nonsegmented rasters. A value that is less than or equal to 0 means that the system will use all the samples from the training sites to train the classifier. |
Segmented Image | Optionally incorporate a segmented image to perform object-based classification. |
Segment Attributes | When you use a segmented image, you can choose which attributes to use from the segmented image. |
Output Classified Dataset | Choose the name and output location for your classified output. |
Output Classified Definition File | This is a JSON file that contains attribute information, statistics, hyperplane vectors and other information needed for the classifier. A file with an .ecd extension is created. |
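The plurality vote described above can be sketched in a few lines. This is a minimal illustration rather than the tool's implementation: it assumes Euclidean distance in band space, and `knn_classify` and the sample data are hypothetical.

```python
import math
from collections import Counter


def knn_classify(training, pixel, k=1):
    """Classify `pixel` by a plurality vote of its k nearest training samples.

    `training` is a list of (band_values, class_name) pairs.
    """
    # Sort training samples by Euclidean distance to the pixel, keep the k closest.
    nearest = sorted(training, key=lambda item: math.dist(item[0], pixel))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tie, the label encountered first (the nearer neighbor) wins.
    return votes.most_common(1)[0][0]


training = [
    ((10, 10), "water"),
    ((12, 11), "water"),
    ((50, 60), "forest"),
    ((52, 58), "forest"),
    ((14, 12), "forest"),  # an outlying forest sample near the water cluster
]
```

With `k=1`, the pixel `(13, 12)` takes the label of the single closest sample, the outlying forest sample. With `k=3`, the two nearby water samples outvote it, which shows how a larger K reduces the influence of any individual neighbor.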
Additional methods for classification and pattern recognition
Additional methods for classification and pattern recognition are included in the Image Analyst toolbox, and are described below.
- Classify Raster Using Spectra—Classifies a multiband raster dataset using spectral matching techniques, based on either the vector angle or spectral information divergence between the input image and the reference spectral profile. Outputs include the output classified raster, the output classifier definition file (.ecd), and the output score raster. The Output Score Raster is a multiband raster that stores the matching results for each end member.
- Train Random Trees Regression Model—Models the relationship between explanatory variables (independent variables) and a target dataset (dependent variable). The tool can be used to train with a variety of data types. The input rasters (explanatory variables) can be one raster or a list of rasters, a single band or a multiband in which each band is an explanatory variable, a multidimensional raster in which the variables in the raster are the explanatory variables, or a combination of data types.
  Outputs include the following:
  - A table containing information describing the importance of each explanatory variable used in the model
  - Scatter plots of training data, test data, and location test data
  - The regression definition (.ecd) that contains attribute information, statistics, or other information for the classifier
- Predict Using Regression Model—Predicts data values using the output from the Train Random Trees Regression Model tool. The output is a raster of the predicted values.
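The vector angle measure mentioned for Classify Raster Using Spectra can be illustrated with the standard spectral angle formula, the arccosine of the normalized dot product between a pixel spectrum and a reference spectrum. The sketch below is not the tool's code; the function names and the reference spectra in the usage note are made up for illustration.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference spectrum."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectra must be nonzero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cosine)


def best_match(pixel, references):
    """Return the name of the reference spectrum with the smallest angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))
```

Because the angle depends only on the direction of the spectrum, a pixel that is a uniformly brighter or darker copy of a reference, such as `(2, 4, 6)` against `(1, 2, 3)`, has an angle of essentially zero. Calling `best_match(pixel, {"soil": (0.3, 0.4, 0.5), "veg": (0.05, 0.1, 0.6)})` returns the name of the closer reference.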
Related topics
- Classify function
- Classify Raster
- Train ISO Cluster Classifier
- Train Maximum Likelihood Classifier
- Train Random Trees Classifier
- Train Support Vector Machine Classifier
- Generate Training Samples From Seed Points
- Inspect Training Samples
- Segment Mean Shift function
- ML Classify function
- Understanding segmentation and classification
- Train Random Trees Regression Model
- Predict Using Regression Model