Zonal Statistics (Spatial Analyst)

Available with Spatial Analyst license.

Available with Image Analyst license.

Summary

Calculates statistics on values of a raster within the zones of another dataset.

Learn more about how the zonal statistics tools work

Illustration

Zonal Statistics illustration
OutRas = ZonalStatistics(ZoneRas, "VALUE", ValRas, "MINIMUM", "DATA", "CURRENT_SLICE")

Usage

  • A zone is defined as all areas in the input that have the same value. The areas do not have to be contiguous. Both raster and feature data can be used as the zone input.

  • If the Input raster or feature zone data (in_zone_data in Python) is a raster, it must be an integer raster.

  • If the Input raster or feature zone data (in_zone_data in Python) is a feature, it will be converted to a raster internally using the cell size and cell alignment from the Input value raster (in_value_raster in Python).

  • When the cell sizes of the Input raster or feature zone data (in_zone_data in Python) and the Input value raster (in_value_raster in Python) differ, the output cell size will be the Maximum Of Inputs value, and the Input value raster will be used as the snap raster internally. If the cell size is the same but the cells are not aligned, the Input value raster will be used as the snap raster internally. Either of these cases will trigger an internal resampling before the zonal operation is performed.

    When the zone and value inputs are both rasters of the same cell size and the cells are aligned, they will be used directly in the tool and will not be resampled internally during tool execution.

  • If the Input raster or feature zone data (in_zone_data in Python) is a feature, any zone features that do not overlap a cell center of the value raster will not be converted to the internal zone raster. As a result, those zones will not be represented in the output. You can manage this by determining a cell size that preserves the desired level of detail of the feature zones and specifying it as the cell size in the analysis environment.

  • If the Input raster or feature zone data (in_zone_data in Python) is a point feature, it is possible to have more than one point contained within any particular cell of the value input raster. For such cells, the zone value is determined by the point with the lowest ObjectID field (for example, OID or FID).

  • If the Input raster or feature zone data (in_zone_data in Python) has overlapping polygons, the zonal analysis will not be performed for each individual polygon. Since the feature input is converted to a raster, each location can have only one value.

    An alternative method is to process the zonal operation iteratively for each of the polygon zones and collate the results.

  • When specifying the Input raster or feature zone data (in_zone_data in Python), the default zone field will be the first available integer or text field. If no other valid fields exist, the ObjectID field (for example, OID or FID) will be the default.

  • The Input value raster (in_value_raster in Python) can be either integer or floating point. However, when it is floating-point type, the options for calculating majority, minority, and variety will not be available.

  • For majority and minority calculations, when there is a tie, the output for the zone is based on the lowest of the tied values. See How the zonal statistics tools work for more information.

  • Supported multidimensional raster dataset types include multidimensional raster layer, mosaic, image service, and Esri's CRF.

  • The data type (integer or float) of the output is dependent on the zonal calculation being performed and the input value raster type.

  • By default, this tool will take advantage of multicore processors. The maximum number of cores that can be used is four.

    To use fewer cores, use the parallelProcessingFactor environment setting.

  • See Analysis environments and Spatial Analyst for additional details on the geoprocessing environments that apply to this tool.
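
The zone definition and the majority tie rule described in the usage notes above can be sketched in plain Python. This is only an illustration on small hypothetical grids, not the tool's implementation: the function name `zonal_majority` and both grids are assumptions made for this example. Zone 1 is deliberately split into two separate groups of cells to show that a zone does not have to be contiguous.

```python
from collections import Counter

# Hypothetical 3 x 4 grids (illustrative data only).
# Zone 1 occupies two separate groups of cells.
zone_grid = [
    [1, 1, 2, 2],
    [1, 2, 2, 1],
    [3, 3, 1, 1],
]
value_grid = [
    [5, 5, 7, 7],
    [9, 4, 4, 9],
    [2, 6, 8, 8],
]


def zonal_majority(zones, values):
    """Return a grid in which every cell holds the majority value of its zone.

    Ties are resolved by taking the lowest of the tied values, as the
    usage notes describe for majority calculations.
    """
    # Collect the value cells belonging to each zone, wherever they are.
    per_zone = {}
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            per_zone.setdefault(zone, []).append(value)

    # Pick the most frequent value per zone; lowest value wins a tie.
    majority = {}
    for zone, zone_values in per_zone.items():
        counts = Counter(zone_values)
        highest_count = max(counts.values())
        majority[zone] = min(v for v, c in counts.items() if c == highest_count)

    # Write the zone result back to every cell of that zone.
    return [[majority[zone] for zone in zone_row] for zone_row in zones]


result = zonal_majority(zone_grid, value_grid)
for row in result:
    print(row)
```

In this sketch, zone 1 contains the values 5, 9, and 8 twice each, so the tie resolves to 5, and both groups of zone 1 cells receive that same value.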

Parameters

Label | Explanation | Data Type
Input raster or feature zone data

The dataset that defines the zones.

The zones can be defined by an integer raster or a feature layer.

Raster Layer; Feature Layer
Zone field

The field that contains the values that define each zone.

It can be an integer or a string field of the zone dataset.

Field
Input value raster

The raster that contains the values on which to calculate a statistic.

Raster Layer
Statistics type
(Optional)

Specifies the statistic type to be calculated.

  • Mean —The average of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Majority —The value that occurs most often of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Maximum —The largest value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Median —The median value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Minimum —The smallest value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Minority —The value that occurs least often of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Percentile —The percentile of all cells in the value raster that belong to the same zone as the output cell will be calculated. The 90th percentile is calculated by default. You can specify other values (from 0 to 100) using the Percentile value parameter.
  • Range —The difference between the largest and smallest value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Standard deviation —The standard deviation of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Sum —The total value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • Variety —The number of unique values for all cells in the value raster that belong to the same zone as the output cell will be calculated.
String
Ignore NoData in calculations
(Optional)

Specifies whether NoData values in the value input will be ignored in the results of the zone that they fall within.

  • Checked—Within any particular zone, only cells that have a value in the input value raster will be used in determining the output value for that zone. NoData cells in the value raster will be ignored in the statistic calculation. This is the default.
  • Unchecked—Within any particular zone, if NoData cells exist in the value raster, they will not be ignored and their existence indicates that there is insufficient information to perform statistical calculations for all the cells in that zone. Consequently, the entire zone will receive the NoData value on the output raster.
Boolean
Process as multidimensional
(Optional)

Specifies how the input rasters will be processed if they are multidimensional.

  • Unchecked—Statistics will be calculated from the current slice of the input multidimensional dataset. This is the default.
  • Checked—Statistics will be calculated for all dimensions of the input multidimensional dataset.
Boolean
Percentile value
(Optional)

The percentile to calculate. The default is 90, indicating the 90th percentile.

The values can range from 0 to 100. The 0th percentile is essentially equivalent to the minimum statistic, and the 100th percentile is equivalent to maximum. A value of 50 will produce essentially the same result as the median statistic.

This option is only available if the Statistics type parameter is set to Percentile.

Double
Percentile interpolation type
(Optional)

Specifies the percentile interpolation method to use when the number of input raster values in the calculation is even.

  • Auto-detect — If the input value raster is of integer pixel type, the Nearest method is used. If the input value raster is of floating point pixel type, the Linear method is used. This is the default.
  • Nearest — The nearest available value to the desired percentile is used. In this case, the output pixel type is the same as that of the input value raster.
  • Linear —The weighted average of the two surrounding values from the desired percentile is used. In this case, the output pixel type is floating point.
String
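
The difference between the Nearest and Linear interpolation types can be sketched as follows. This is a hedged illustration, not the tool's algorithm: this page does not state how the tool ranks values, so the sketch assumes the common convention of placing the percentile at position `pct / 100 * (n - 1)` in the sorted values, and it assumes that an exact halfway position resolves to the lower value. The function name `percentile` is hypothetical.

```python
import math


def percentile(values, pct, method="LINEAR"):
    """Percentile of a list of values, using NEAREST or LINEAR interpolation.

    Assumption: the rank position is pct / 100 * (n - 1) over the sorted
    values. The tool's exact ranking rule may differ.
    """
    ordered = sorted(values)
    position = (pct / 100.0) * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)

    if method == "NEAREST":
        # Take the closest existing value, so the result keeps the input type.
        # Assumption: an exact halfway position resolves to the lower value.
        index = lower if (position - lower) <= (upper - position) else upper
        return ordered[index]

    # LINEAR: weighted average of the two surrounding values (always a float).
    fraction = position - lower
    return float(ordered[lower] + fraction * (ordered[upper] - ordered[lower]))


cells = [10, 20, 30, 40]  # hypothetical values from one zone (even count)
print(percentile(cells, 50, "NEAREST"))
print(percentile(cells, 50, "LINEAR"))
```

With an even number of values, the 50th percentile falls between two cells, which is where the two methods diverge: Nearest returns an existing cell value, while Linear returns a value between the two neighbors.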

Return Value

Label | Explanation | Data Type
Output raster

The output zonal statistics raster.

Raster
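
The effect of the Ignore NoData in calculations parameter on the output can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the tool's implementation: NoData is represented by `None`, the inputs are flat lists rather than rasters, and the function name `zonal_mean` is hypothetical. The sketch also assumes that a zone containing only NoData cells yields NoData in either mode.

```python
NODATA = None  # assumption: NoData cells are represented by None


def zonal_mean(zones, values, ignore_nodata=True):
    """Return {zone: mean} for flat lists of zone ids and cell values.

    ignore_nodata=True corresponds to the checked (DATA) option and
    ignore_nodata=False to the unchecked (NODATA) option.
    """
    per_zone = {}
    for zone, value in zip(zones, values):
        per_zone.setdefault(zone, []).append(value)

    result = {}
    for zone, zone_values in per_zone.items():
        valid = [v for v in zone_values if v is not NODATA]
        has_nodata = len(valid) < len(zone_values)
        if not valid or (has_nodata and not ignore_nodata):
            # Unchecked: a single NoData cell makes the whole zone NoData.
            result[zone] = NODATA
        else:
            # Checked: only cells that have a value contribute.
            result[zone] = sum(valid) / len(valid)
    return result


zones = [1, 1, 1, 2, 2]
values = [4, NODATA, 8, 3, 5]
print(zonal_mean(zones, values, ignore_nodata=True))
print(zonal_mean(zones, values, ignore_nodata=False))
```

Zone 2 has no NoData cells, so its result is the same in both modes. Only zone 1 changes, from the mean of its two valid cells to NoData.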

ZonalStatistics(in_zone_data, zone_field, in_value_raster, {statistics_type}, {ignore_nodata}, {process_as_multidimensional}, {percentile_value}, {percentile_interpolation_type})
Name | Explanation | Data Type
in_zone_data

The dataset that defines the zones.

The zones can be defined by an integer raster or a feature layer.

Raster Layer; Feature Layer
zone_field

The field that contains the values that define each zone.

It can be an integer or a string field of the zone dataset.

Field
in_value_raster

The raster that contains the values on which to calculate a statistic.

Raster Layer
statistics_type
(Optional)

Specifies the statistic type to be calculated.

  • MEAN: The average of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • MAJORITY: The value that occurs most often of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • MAXIMUM: The largest value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • MEDIAN: The median value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • MINIMUM: The smallest value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • MINORITY: The value that occurs least often of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • PERCENTILE: The percentile of all cells in the value raster that belong to the same zone as the output cell will be calculated. The 90th percentile is calculated by default. You can specify other values (from 0 to 100) using the percentile_value parameter.
  • RANGE: The difference between the largest and smallest value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • STD: The standard deviation of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • SUM: The total value of all cells in the value raster that belong to the same zone as the output cell will be calculated.
  • VARIETY: The number of unique values for all cells in the value raster that belong to the same zone as the output cell will be calculated.
String
ignore_nodata
(Optional)

Specifies whether NoData values in the value input will be ignored in the results of the zone that they fall within.

  • DATA: Within any particular zone, only cells that have a value in the input value raster will be used in determining the output value for that zone. NoData cells in the value raster will be ignored in the statistic calculation. This is the default.
  • NODATA: Within any particular zone, if NoData cells exist in the value raster, they will not be ignored and their existence indicates that there is insufficient information to perform statistical calculations for all the cells in that zone. Consequently, the entire zone will receive the NoData value on the output raster.
Boolean
process_as_multidimensional
(Optional)

Specifies how the input rasters will be processed if they are multidimensional.

  • CURRENT_SLICE: Statistics will be calculated from the current slice of the input multidimensional dataset. This is the default.
  • ALL_SLICES: Statistics will be calculated for all dimensions of the input multidimensional dataset.
Boolean
percentile_value
(Optional)

The percentile to calculate. The default is 90, indicating the 90th percentile.

The values can range from 0 to 100. The 0th percentile is essentially equivalent to the minimum statistic, and the 100th percentile is equivalent to maximum. A value of 50 will produce essentially the same result as the median statistic.

This option is only supported if the statistics_type parameter is set to PERCENTILE.

Double
percentile_interpolation_type
(Optional)

Specifies the percentile interpolation method to use when the number of input raster values in the calculation is even.

  • AUTO_DETECT: If the input value raster is of integer pixel type, the NEAREST method is used. If the input value raster is of floating point pixel type, the LINEAR method is used. This is the default.
  • NEAREST: The nearest available value to the desired percentile is used. In this case, the output pixel type is the same as that of the input value raster.
  • LINEAR: The weighted average of the two surrounding values from the desired percentile is used. In this case, the output pixel type is floating point.
String

Return Value

Name | Explanation | Data Type
out_raster

The output zonal statistics raster.

Raster
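
Because the keyword arguments are plain strings, a typo or an unsupported combination only surfaces when the tool runs. One possible way to catch such mistakes earlier is a small validation helper, sketched below. The helpers `build_zonal_args` and `run_zonal_statistics` are hypothetical names introduced for this example and are not part of arcpy. The keyword lists, the defaults, and the argument order are taken from the parameter descriptions and signature above. The `statistics_type` argument is made mandatory here because this page does not state its default.

```python
# Keywords as listed in the parameter descriptions above.
STATISTICS_TYPES = ("MEAN", "MAJORITY", "MAXIMUM", "MEDIAN", "MINIMUM",
                    "MINORITY", "PERCENTILE", "RANGE", "STD", "SUM", "VARIETY")
INTERPOLATION_TYPES = ("AUTO_DETECT", "NEAREST", "LINEAR")


def build_zonal_args(in_zone_data, zone_field, in_value_raster, statistics_type,
                     ignore_nodata="DATA",
                     process_as_multidimensional="CURRENT_SLICE",
                     percentile_value=None,
                     percentile_interpolation_type=None):
    """Validate the keywords and return positional arguments in documented order."""
    if statistics_type not in STATISTICS_TYPES:
        raise ValueError(f"Unknown statistics_type: {statistics_type!r}")
    if ignore_nodata not in ("DATA", "NODATA"):
        raise ValueError(f"Unknown ignore_nodata: {ignore_nodata!r}")
    if process_as_multidimensional not in ("CURRENT_SLICE", "ALL_SLICES"):
        raise ValueError("Unknown process_as_multidimensional: "
                         f"{process_as_multidimensional!r}")

    args = [in_zone_data, zone_field, in_value_raster, statistics_type,
            ignore_nodata, process_as_multidimensional]

    if statistics_type == "PERCENTILE":
        # Documented defaults: 90th percentile, AUTO_DETECT interpolation.
        value = 90 if percentile_value is None else percentile_value
        if not 0 <= value <= 100:
            raise ValueError("percentile_value must be between 0 and 100")
        interpolation = percentile_interpolation_type or "AUTO_DETECT"
        if interpolation not in INTERPOLATION_TYPES:
            raise ValueError(f"Unknown interpolation type: {interpolation!r}")
        args += [value, interpolation]
    elif percentile_value is not None or percentile_interpolation_type is not None:
        # The percentile options are only supported with PERCENTILE.
        raise ValueError("Percentile options require statistics_type='PERCENTILE'")

    return tuple(args)


def run_zonal_statistics(args):
    """Run the tool with validated arguments (requires ArcGIS and a license)."""
    import arcpy  # imported lazily so the validation works without ArcGIS
    from arcpy.sa import ZonalStatistics
    arcpy.CheckOutExtension("Spatial")
    return ZonalStatistics(*args)


args = build_zonal_args("zones.shp", "sampleID", "valueraster",
                        "PERCENTILE", percentile_value=75)
print(args)
```

The validation step runs anywhere, while `run_zonal_statistics` is only meant to be called in an environment where arcpy and the required extension are available.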

Code sample

ZonalStatistics example 1 (Python window)

This example determines for each zone the range of cell values in the Value input raster.

import arcpy
from arcpy import env
from arcpy.sa import *
env.workspace = "C:/sapyexamples/data"
outZonalStats = ZonalStatistics("zone", "value", "valueraster", "RANGE",
                                "NODATA")
outZonalStats.save("C:/sapyexamples/output/zonestatout")

ZonalStatistics example 2 (stand-alone script)

This example creates a multidimensional zonal output by calculating the maximum value from the input multidimensional Value raster for each zone.

# Name: ZonalStatistics_Ex_standalone.py
# Description: Summarizes values of a multidimensional raster within the zones 
#              of another dataset.
# Requirements: Spatial Analyst Extension

# Import system modules
import arcpy
from arcpy.sa import *

# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")

# Set the analysis environments
arcpy.env.workspace = "C:/sapyexamples/data"

# Set the local variables
inZoneData = "zones.shp"
zoneField = "sampleID"
inValueRaster = "multidimensional_valueraster.crf" 

# Execute ZonalStatistics
outZonalStatistics = ZonalStatistics(inZoneData, zoneField, inValueRaster,
                                     "MAXIMUM", "NODATA", "ALL_SLICES")

# Save the output 
outZonalStatistics.save("C:/sapyexamples/output/zonestatout2.crf")

Licensing information

  • Basic: Requires Spatial Analyst or Image Analyst
  • Standard: Requires Spatial Analyst or Image Analyst
  • Advanced: Requires Spatial Analyst or Image Analyst
