Multidimensional Principal Components (Image Analyst)

Available with Image Analyst license.

Summary

Transforms a multidimensional raster into its principal components, loadings, and eigenvalues. The data is reduced to a smaller number of components that account for most of the variance, so that spatial and temporal patterns can be readily identified.

Usage

  • Use eigenvalues and cumulative percentages of variances in the Output Eigenvalues table to determine the number of principal components needed to define the data without losing essential information.

    Eigenvalue table

    In the example above, the first component accounts for 72.51 percent of the variance. To account for 95 percent of the variance, choose the first five components.

  • Apply charts on the Output Loadings parameter value to understand how each raster in the Input Multidimensional Raster parameter contributes to a principal component.

  • The Number of Principal Components parameter specifies the number of bands in the output. To avoid the output of an unnecessarily large raster, use an appropriate percentage or number of components. Typically, the first few components will cover the most variance in the data.

  • The Output Principal Components parameter value contains multiple bands, and each component is represented as a band. Use the Stretch renderer with the band option to visualize each principal component individually.
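The first usage note, on choosing the number of components from cumulative variance, can be sketched in a few lines of Python. This is a minimal illustration, not part of the tool: the function name components_needed and the list example_eigenvalues are hypothetical, and only the 72.51 value for the first component is taken from the example; the remaining values are invented so that the list sums to 100.

```python
def components_needed(eigenvalues, target_percent):
    """Return the smallest number of leading components whose cumulative
    share of the total variance reaches target_percent."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, value in enumerate(eigenvalues, start=1):
        # Add this component's percentage of the total variance.
        cumulative += 100.0 * value / total
        if cumulative >= target_percent:
            return count
    # The target was not reached before the list ended; use every component.
    return len(eigenvalues)


# Hypothetical eigenvalues, sorted from largest to smallest. Only the first
# value (72.51) comes from the example; the others are made up for illustration.
example_eigenvalues = [72.51, 10.20, 6.10, 3.80, 2.60, 1.90, 1.50, 0.89, 0.50]

print(components_needed(example_eigenvalues, 95))
```

With these illustrative values, the cumulative percentages are 72.51, 82.71, 88.81, 92.61, and 95.21, so the function returns 5, in line with the five components mentioned in the example.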

Parameters

Label | Explanation | Data Type
Input Multidimensional Raster

The input multidimensional raster.

The tool processes data along one dimension, such as a time series raster or a data cube defined by a nontime dimension [X, Y, Z]. If an input variable includes multiple dimensions, such as depth and time, the first dimension value will be used by default.

You can use the Make Multidimensional Raster Layer tool or Subset Multidimensional Raster tool to redefine the multidimensional data as needed, such as configuring multidimensional data into a dataset with one dimension.

Raster Dataset; Mosaic Dataset; Raster Layer; Mosaic Layer; Image Service; File
Mode

Specifies the method that will be used to perform principal component analysis.

  • Dimension Reduction: The input time series data will be treated as a set of images. Principal components that extract prevalent patterns over time will be computed. This is the default.
Note:

In ArcGIS Pro 2.9, Dimension Reduction is the only supported option for this parameter, so the parameter is not visible in the Geoprocessing pane. Additional options will be supported in a future release.

String
Dimension

The dimension name used to process the principal components.

String
Output Principal Components

The name of the output raster dataset. The output is a multiband raster with the components as bands. The first band is the first principal component with the largest eigenvalue, the second band has the principal component with the second largest eigenvalue, and so on.

The output is in CRF file format (.crf).

Raster Dataset; Table
Output Loadings

The output table containing weights for each input raster contributing to the principal components. It is the correlation of the input data and output principal components. Use the .csv file extension to output the loadings as a comma-separated values file.

Table; Raster Dataset
Output Eigenvalues
(Optional)

The output Eigenvalues table. Eigenvalues are values indicating the variance percentage of each component. Eigenvalues help you define the number of principal components that are needed to represent the dataset.

Table
Variable
(Optional)

The variable of the input multidimensional raster used in computation. If the input raster is multidimensional and no variable is specified, only the first variable will be analyzed, by default.

For example, to find the years in which temperature values were highest, specify temperature as the variable to be analyzed. If you do not specify any variables and you have both temperature and precipitation variables, both variables will be analyzed, and the output multidimensional raster will include both variables.

String
Number of Principal Components
(Optional)

The number of principal components to compute, usually fewer than the number of input rasters.

This parameter also accepts a percentage (%). For example, 90% means that the number of components needed to explain 90 percent of the variance in the data will be computed.

String

MultidimensionalPrincipalComponents(in_multidimensional_raster, mode, dimension, out_pc, out_loadings, {out_eigenvalues}, {variable}, {number_of_pc})
Name | Explanation | Data Type
in_multidimensional_raster

The input multidimensional raster.

The tool processes data along one dimension, such as a time series raster or a data cube defined by a nontime dimension [X, Y, Z]. If an input variable includes multiple dimensions, such as depth and time, the first dimension value will be used by default.

You can use the Make Multidimensional Raster Layer tool or Subset Multidimensional Raster tool to redefine the multidimensional data as needed, such as configuring multidimensional data into a dataset with one dimension.

Raster Dataset; Mosaic Dataset; Raster Layer; Mosaic Layer; Image Service; File
mode

Specifies the method that will be used to perform principal component analysis.

  • DIMENSION_REDUCTION: The input time series data will be treated as a set of images. Principal components that extract prevalent patterns over time will be computed. This is the default.
Note:

At ArcGIS Pro 2.9, the only supported option is DIMENSION_REDUCTION. Additional options will be supported in a future release.

String
dimension

The dimension name used to process the principal components.

String
out_pc

The name of the output raster dataset. The output is a multiband raster with the components as bands. The first band is the first principal component with the largest eigenvalue, the second band has the principal component with the second largest eigenvalue, and so on.

The output is in CRF file format (.crf).

Raster Dataset; Table
out_loadings

The output table containing weights for each input raster contributing to the principal components. It is the correlation of the input data and output principal components. Use the .csv file extension to output the loadings as a comma-separated values file.

Table; Raster Dataset
out_eigenvalues
(Optional)

The output Eigenvalues table. Eigenvalues are values indicating the variance percentage of each component. Eigenvalues help you define the number of principal components that are needed to represent the dataset.

Table
variable
(Optional)

The variable of the input multidimensional raster used in computation. If the input raster is multidimensional and no variable is specified, only the first variable will be analyzed, by default.

For example, to find the years in which temperature values were highest, specify temperature as the variable to be analyzed. If you do not specify any variables and you have both temperature and precipitation variables, both variables will be analyzed, and the output multidimensional raster will include both variables.

String
number_of_pc
(Optional)

The number of principal components to compute, usually fewer than the number of input rasters.

This parameter also accepts a percentage (%). For example, 90% means that the number of components needed to explain 90 percent of the variance in the data will be computed.

String

Code sample

MultidimensionalPrincipalComponents example 1 (Python window)

This example computes three principal components from an NDVI time series raster. The input and output data are all in a directory named c:\data.

# Import system modules 
import arcpy 
from arcpy.ia import *  

# Check out the ArcGIS Image Analyst extension license 
arcpy.CheckOutExtension("ImageAnalyst") 

arcpy.env.workspace = r"c:\data" 
arcpy.ia.MultidimensionalPrincipalComponents("ndviData.crf", "DIMENSION_REDUCTION", "StdTime", "ndviData_PC.crf", "ndviData_loadings.csv", "ndviData_eigenvalues.csv", None, 3)

MultidimensionalPrincipalComponents example 2 (stand-alone script)

This example computes four principal components from an NDVI time series raster. The input and output data are all in a directory named c:\data.

# Import system modules 
import arcpy 
from arcpy.ia import * 

# Check out the ArcGIS Image Analyst extension license 
arcpy.CheckOutExtension("ImageAnalyst") 

# Define input parameters 
inputFile = r"c:\data\ndviData.crf" 
mode = "DIMENSION_REDUCTION" 
dimension = "StdTime" 
# The principal components output is written in CRF format (.crf)
out_pc = r"c:\data\ndviData_pc.crf" 
out_loadings = r"c:\data\ndviData_loadings.csv" 
out_eigenvalues = r"c:\data\ndviData_eigenvalues.csv" 
variable = "ndvi" 
pc_number = 4 

# Execute 
arcpy.ia.MultidimensionalPrincipalComponents(inputFile, mode, dimension, out_pc, out_loadings, out_eigenvalues, variable, pc_number)
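
Both samples pass number_of_pc as a count. The parameter description also allows a percentage, so a percentage-based call might look like the sketch below. It is an illustration under assumptions: the percentage is assumed to be passed as the string "90%", the output file names and the names pca_args and run_pca are hypothetical, and arcpy is imported inside the function only so that the argument list can be inspected outside an ArcGIS Pro environment.

```python
# Positional arguments in the order given by the tool syntax. The file names
# are hypothetical and follow the naming used in the samples.
pca_args = [
    "ndviData.crf",                # in_multidimensional_raster
    "DIMENSION_REDUCTION",         # mode
    "StdTime",                     # dimension
    "ndviData_pc90.crf",           # out_pc
    "ndviData_loadings90.csv",     # out_loadings
    "ndviData_eigenvalues90.csv",  # out_eigenvalues
    "ndvi",                        # variable
    "90%",                         # number_of_pc as a percentage (assumed string form)
]


def run_pca(args, workspace=r"c:\data"):
    """Check out the Image Analyst license and run the tool with args."""
    # Imported here so this sketch can be loaded without ArcGIS Pro installed.
    import arcpy

    arcpy.CheckOutExtension("ImageAnalyst")
    arcpy.env.workspace = workspace
    return arcpy.ia.MultidimensionalPrincipalComponents(*args)
```

Calling run_pca(pca_args) from an ArcGIS Pro Python environment would request as many components as are needed to explain 90 percent of the variance, rather than a fixed count.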

Licensing information

  • Basic: Requires Image Analyst
  • Standard: Requires Image Analyst
  • Advanced: Requires Image Analyst

Related topics