Summary Statistics

Summary

Calculates summary statistics for field(s) in a table.

Usage

  • The Output Table will consist of fields containing the result of the statistical operation.

  • The following statistical operations are available with this tool: sum, mean, minimum, maximum, range, standard deviation, count, first, last, median and variance.

  • A field will be created for each statistic type using the following naming convention: SUM_<field>, MEAN_<field>, MIN_<field>, MAX_<field>, RANGE_<field>, STD_<field>, COUNT_<field>, FIRST_<field>, LAST_<field>, MEDIAN_<field>, VARIANCE_<field> (where <field> is the name of the input field for which the statistic is computed). The field name is truncated to 10 characters when the output table is a dBASE table.

  • If a Case field is specified, statistics will be calculated separately for each unique attribute value. The Output Table will contain only one record if no Case field is specified. If one is specified, there will be one record for each Case field value.

  • Null values are excluded from all statistical calculations. For example, the MEAN of 10, 5, and NULL is 7.5 ((10+5)/2). The COUNT statistic returns the number of values included in the statistical calculation, which in this case is 2.

  • The Statistics Field(s) parameter Add Field button is used only in ModelBuilder. In ModelBuilder, where the preceding tool has not been run, or its derived data does not exist, the Statistics Field(s) parameter may not be populated with field names. The Add Field button allows you to add expected field(s) so you can complete the Summary Statistics dialog box and continue to build your model.

  • When using layers, only the currently selected features are used to calculate statistics.
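The null handling and Case field behavior described in the notes above can be mimicked in plain Python. The following is only a minimal sketch and not the tool's implementation: the `summarize` helper, the `NM` and `LENGTH` field names, and the sample rows are hypothetical, and arcpy is not required. As a simplification, groups whose values are all null are omitted from the result.

```python
from collections import defaultdict
from statistics import mean


def summarize(rows, value_field, case_field=None):
    """Group rows by an optional case field and compute SUM, MEAN, and COUNT.

    Null values (None) are skipped, mirroring the behavior described above.
    """
    groups = defaultdict(list)
    for row in rows:
        # Without a case field, every row falls into a single group (key None)
        key = row[case_field] if case_field else None
        value = row[value_field]
        if value is not None:
            groups[key].append(value)

    return {
        key: {"SUM": sum(values), "MEAN": mean(values), "COUNT": len(values)}
        for key, values in groups.items()
    }


# Hypothetical sample data: three records for "A", one of them null
rows = [
    {"NM": "A", "LENGTH": 10},
    {"NM": "A", "LENGTH": 5},
    {"NM": "A", "LENGTH": None},
    {"NM": "B", "LENGTH": 4},
]

overall = summarize(rows, "LENGTH")        # single record, no case field
by_case = summarize(rows, "LENGTH", "NM")  # one record per NM value

print(overall)
print(by_case["A"])  # MEAN is 7.5 and COUNT is 2 because the null is skipped
```

Without a case field the result holds a single entry; with `NM` as the case field it holds one entry per unique `NM` value.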

Syntax

Statistics_analysis (in_table, out_table, {statistics_fields}, {case_field})
Parameter | Explanation | Data Type
in_table

The input table containing the field(s) that will be used to calculate statistics. The input can be an INFO table, a dBASE table, an OLE DB table, a VPF table, or a feature class.

Table View; Raster Layer
out_table

The output dBASE or geodatabase table that will store the calculated statistics.

Table
statistics_fields
[[field, {statistic_type}],...]
(Optional)

The field or fields containing the attribute values used to calculate the specified statistic. Multiple statistic and field combinations may be specified. Null values are excluded from all statistical calculations.

Text attribute fields can be summarized using first and last statistics. Numeric attribute fields can be summarized using any statistic.

Available statistics types are as follows:

  • SUM—Adds the total value for the specified field.
  • MEAN—Calculates the average for the specified field.
  • MIN—Finds the smallest value for all records of the specified field.
  • MAX—Finds the largest value for all records of the specified field.
  • RANGE—Finds the range of values (maximum minus minimum) for the specified field.
  • STD—Finds the standard deviation on values in the specified field.
  • COUNT—Finds the number of values included in statistical calculations. This counts each value except null values. To determine the number of null values in a field, create a count on the field in question, create a count on a different field that does not contain null values (for example, the OID if present), and subtract the two values.
  • FIRST—Finds the first record in the input and uses its specified field value.
  • LAST—Finds the last record in the input and uses its specified field value.
  • MEDIAN—Calculates the median for all records of the specified field.
  • VARIANCE—Calculates the variance for all records of the specified field.
Value Table
case_field
[case_field,...]
(Optional)

The field or fields in the input table used to calculate statistics separately for each unique attribute value (or combination of attribute values when multiple fields are specified).

Field
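The null-counting procedure described under COUNT (count the field in question, count a field with no nulls such as the OID, and subtract) can be illustrated without arcpy. This is a rough sketch under assumed data: the `count_non_null` helper, the `OID` and `POP` field names, and the rows are hypothetical stand-ins for a table.

```python
def count_non_null(rows, field):
    """Count the values of a field that are not null, as COUNT does."""
    return sum(1 for row in rows if row[field] is not None)


# Hypothetical table: OID never contains nulls, POP sometimes does
rows = [
    {"OID": 1, "POP": 100},
    {"OID": 2, "POP": None},
    {"OID": 3, "POP": 250},
    {"OID": 4, "POP": None},
]

count_pop = count_non_null(rows, "POP")  # values included in the statistics
count_oid = count_non_null(rows, "OID")  # total number of records
null_count = count_oid - count_pop       # records where POP is null

print(null_count)
```

With the tool itself, the two counts would come from a single run that requests COUNT on both fields.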

Code sample

Statistics example (Python window)

The following Python window script demonstrates how to use the Statistics tool in immediate mode.

import arcpy
arcpy.env.workspace = "C:/data/Habitat_Analysis.gdb"
arcpy.Statistics_analysis("futrds", "C:/output/output.gdb/stats", [["Shape_Length", "SUM"]], "NM")
Statistics example 2 (stand-alone script)

The following stand-alone script summarizes the vegetation by area within 150 feet of major roads.

# Description: Summarize the vegetation by area within 150 feet of major roads
 
# Import system modules
import arcpy
 
# Set environment settings
arcpy.env.workspace = "C:/data"
 
# Set local variables
inRoads = "majorrds.shp"
outBuffer = "C:/output/output.gdb/buffer_out"
bufferDistance = "150 feet"
inVegetation = "Habitat_Analysis.gdb/vegtype"
outClip = "C:/output/output.gdb/clip_out"
joinField = "HOLLAND95"
joinTable = "c:/data/vegtable.dbf"
joinedField = "HABITAT"
outStatsTable = "C:/output/output.gdb/stats_out"
statsFields = [["Shape_Area", "SUM"]]
 
# Execute Buffer to get a buffer of major roads
arcpy.Buffer_analysis(inRoads, outBuffer, bufferDistance, dissolve_option="ALL")
 
# Execute Clip using the buffer output to get a clipped feature class
#  of vegetation
arcpy.Clip_analysis(inVegetation, outBuffer, outClip)
 
# Execute JoinField to add the vegetation type
arcpy.JoinField_management(outClip, joinField, joinTable, joinField, joinedField)
 
# Execute Statistics to get the area of each vegetation type within
#  the clipped buffer.
arcpy.Statistics_analysis(outClip, outStatsTable, statsFields, joinedField)
Statistics example 3 (stand-alone script)

The following stand-alone script loops through the attribute fields of a dataset and constructs the statistics_fields parameter so that the SUM statistic is calculated for every numeric field.

# Description: Script that runs the Summary Statistics tool to calculate the
#   Sum statistic for every numeric field based on a unique case field

# Import system modules
import arcpy

# Set environment settings
arcpy.env.workspace = "C:/data/f.gdb"

# Set local variables
intable = "intable"
outtable = "sumstats"
casefield = "Name"
stats = []

# Loop through all fields in the Input Table
for field in arcpy.ListFields(intable):

    # Just find the fields that have a numeric type
    if field.type in ("Double", "Integer", "Single", "SmallInteger"):
        # Add the field name and Sum statistic type
        #    to the list of fields to summarize
        stats.append([field.name, "Sum"])
# Correct formatting of stats [["Field1", "Sum"], ["Field2", "Sum"], ...]

# Run the Summary Statistics tool with the stats list
arcpy.Statistics_analysis(intable, outtable, stats, casefield)
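Because text fields can only be summarized with FIRST and LAST, a variation of the loop above might choose the statistic by field type. The sketch below is one possible approach, not part of the tool: `build_stats` is a hypothetical helper, and it takes plain (name, type) pairs so that it runs without arcpy. With arcpy available, such pairs could be produced with `[(f.name, f.type) for f in arcpy.ListFields(intable)]`.

```python
# Field type names as reported by arcpy Field.type
NUMERIC_TYPES = ("Double", "Integer", "Single", "SmallInteger")


def build_stats(fields, numeric_stat="SUM", text_stat="FIRST"):
    """Build a statistics_fields list from (name, type) pairs.

    Numeric fields get numeric_stat, text fields get text_stat, and all
    other field types (OID, Geometry, and so on) are skipped.
    """
    stats = []
    for name, field_type in fields:
        if field_type in NUMERIC_TYPES:
            stats.append([name, numeric_stat])
        elif field_type == "String":
            stats.append([name, text_stat])
    return stats


# Hypothetical field listing for an input table
fields = [
    ("OBJECTID", "OID"),
    ("Name", "String"),
    ("POP1990", "Integer"),
    ("AREA", "Double"),
    ("Shape", "Geometry"),
]

stats = build_stats(fields)
print(stats)
```

The resulting list has the same `[["Field1", "Sum"], ...]` shape as the `stats` list in the example above, so it could be passed as the statistics_fields argument in the same way.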
Statistics example 4 (stand-alone script)

The following script uses a pandas DataFrame to access and display the tabular results of the Statistics tool.

import arcpy
import pandas

arcpy.env.overwriteOutput = True

in_table = r"d:\data\states.shp"
out_table = r"in_memory\stats_table"
stat_fields = [['POP1990', 'SUM'], ['POP1997', 'SUM']]

stats = arcpy.Statistics_analysis(in_table, out_table, stat_fields,
                                  case_field='SUB_REGION')

# Get a list of field names to display
field_names = [i.name for i in arcpy.ListFields(out_table) if i.type != 'OID']

# Open a cursor to extract results from stats table; the with statement
# releases the cursor when the block exits
with arcpy.da.SearchCursor(out_table, field_names) as cursor:
    # Create a pandas dataframe to display results
    df = pandas.DataFrame(data=[row for row in cursor],
                          columns=field_names)

print(df)

Licensing information

  • ArcGIS Desktop Basic: Yes
  • ArcGIS Desktop Standard: Yes
  • ArcGIS Desktop Advanced: Yes
