Label | Explanation | Data Type |
Input Table | The input table containing the fields that will be used to calculate statistics. | Table View; Raster Layer |
Output Table | The output table that will store the calculated statistics. | Table |
Statistics Fields | Specifies the field or fields containing the attribute values that will be used to calculate the specified statistic. Multiple statistic and field combinations can be specified. Null values are excluded from all calculations. Text attribute fields can be summarized using first and last statistics. Numeric attribute fields can be summarized using any statistic. Date, Date only, and Timestamp offset attribute fields can be summarized only with the mean, minimum, maximum, count, first, last, unique, and concatenate statistics. The available statistic types are sum, mean, minimum, maximum, range, standard deviation, count, first, last, median, variance, unique, and concatenate. | Value Table |
Case Field (Optional) | The field or fields in the input that will be used to calculate statistics separately for each unique attribute value (or combination of attribute values when multiple fields are specified). | Field |
Concatenation Separator (Optional) | A character or characters that will be used to concatenate values when the Concatenation option is used for the Statistics Fields parameter. By default, the tool will concatenate values without a separator. | String |
Summary
Calculates summary statistics for fields in a table.
Usage
The Output Table parameter value will consist of fields containing the result of the statistical operation.
The following statistical operations are available with this tool: sum, mean, minimum, maximum, range, standard deviation, count, first, last, median, variance, unique, and concatenate.
A field will be created for each statistic type using the following naming convention: SUM_<field>, MEAN_<field>, MIN_<field>, MAX_<field>, RANGE_<field>, STD_<field>, COUNT_<field>, FIRST_<field>, LAST_<field>, MEDIAN_<field>, VARIANCE_<field>, UNIQUE_<field>, and CONCATENATE_<field> in which <field> is the name of the input field for which the statistic is computed. The field name is truncated to 10 characters when the output table is a dBASE table.
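The naming convention above can be sketched in plain Python to predict an output field name before running the tool. The helper name `expected_output_field` and its `dbase` flag are hypothetical, and the dBASE case is modeled as a simple cut to 10 characters; how the tool itself resolves name collisions after truncation is not covered here.

```python
# Prefixes listed in the naming convention above.
STAT_PREFIXES = (
    "SUM", "MEAN", "MIN", "MAX", "RANGE", "STD", "COUNT",
    "FIRST", "LAST", "MEDIAN", "VARIANCE", "UNIQUE", "CONCATENATE",
)


def expected_output_field(statistic, field, dbase=False):
    """Return the expected output field name for a statistic and input field.

    Hypothetical helper for illustration; it is not part of arcpy.
    """
    prefix = statistic.upper()
    if prefix not in STAT_PREFIXES:
        raise ValueError(f"Unknown statistic prefix: {statistic}")
    name = f"{prefix}_{field}"
    # Assumption: dBASE truncation is modeled as keeping the first 10 characters.
    return name[:10] if dbase else name


print(expected_output_field("SUM", "Shape_Area"))              # SUM_Shape_Area
print(expected_output_field("SUM", "Shape_Area", dbase=True))  # SUM_Shape_
```

A prediction like this is mainly useful when a later step in a script needs to reference the statistic field by name.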
If the Case Field parameter value is specified, statistics will be calculated separately for each unique attribute value, and there will be one record for each unique Case Field value. The Output Table parameter value will contain only one record if no Case Field parameter value is specified.
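As a rough illustration of the case field behavior, the following plain-Python sketch groups sample rows by a case value and sums a numeric field. It does not call the tool; the sample rows and the helper `sum_by_case` are made up for this example.

```python
from collections import defaultdict

# Made-up sample rows standing in for records of an input table.
rows = [
    {"HABITAT": "Forest", "Shape_Area": 120.0},
    {"HABITAT": "Wetland", "Shape_Area": 30.0},
    {"HABITAT": "Forest", "Shape_Area": 80.0},
]


def sum_by_case(rows, value_field, case_field=None):
    """Sum value_field per unique case_field value (one group if no case field)."""
    totals = defaultdict(float)
    for row in rows:
        value = row[value_field]
        if value is None:
            continue  # nulls do not contribute to the statistic
        key = row[case_field] if case_field else None
        totals[key] += value
    return dict(totals)


# One result per unique case value.
print(sum_by_case(rows, "Shape_Area", "HABITAT"))  # {'Forest': 200.0, 'Wetland': 30.0}
# A single result when no case field is given.
print(sum_by_case(rows, "Shape_Area"))             # {None: 230.0}
```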
Null values are excluded from all statistical calculations. For example, the average of 10, 5, and a null value is 7.5 ((10+5)/2).
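The arithmetic in this example can be checked with a few lines of plain Python, where `None` stands in for a null value. The function name `mean_ignoring_nulls` is an illustrative assumption, not part of arcpy.

```python
def mean_ignoring_nulls(values):
    """Average the non-null values; return None if every value is null."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)


# The null is dropped from both the sum and the count: (10 + 5) / 2.
print(mean_ignoring_nulls([10, 5, None]))  # 7.5
```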
When using layers, only the currently selected features will be used to calculate statistics.
Parameters
arcpy.analysis.Statistics(in_table, out_table, statistics_fields, {case_field}, {concatenation_separator})
Name | Explanation | Data Type |
in_table | The input table containing the fields that will be used to calculate statistics. | Table View; Raster Layer |
out_table | The output table that will store the calculated statistics. | Table |
statistics_fields [[field, {statistic_type}],...] | Specifies the field or fields containing the attribute values that will be used to calculate the specified statistic. Multiple statistic and field combinations can be specified. Null values are excluded from all calculations. Text attribute fields can be summarized using first and last statistics. Numeric attribute fields can be summarized using any statistic. Date, Date only, and Timestamp offset attribute fields can be summarized only with the mean, minimum, maximum, count, first, last, unique, and concatenate statistics. The available statistic types are sum, mean, minimum, maximum, range, standard deviation, count, first, last, median, variance, unique, and concatenate. | Value Table |
case_field [case_field,...] (Optional) | The field or fields in the input that will be used to calculate statistics separately for each unique attribute value (or combination of attribute values when multiple fields are specified). | Field |
concatenation_separator (Optional) | A character or characters that will be used to concatenate values when the CONCATENATION option is used for the statistics_fields parameter. By default, the tool will concatenate values without a separator. | String |
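The effect of the separator can be pictured with ordinary string joining. The sketch below is plain Python with made-up values and a hypothetical helper `concatenate`; it only mimics the default (no separator) and a custom separator, and it assumes nulls are skipped, in line with the null handling described for all statistics.

```python
def concatenate(values, separator=""):
    """Join non-null values into one string using the given separator."""
    return separator.join(str(v) for v in values if v is not None)


values = ["Oak", None, "Pine", "Fir"]  # made-up field values; None is a null

print(concatenate(values))        # OakPineFir
print(concatenate(values, "; "))  # Oak; Pine; Fir
```

Without a separator the individual values cannot be told apart in the result, so a separator such as a semicolon is often the more readable choice.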
Code sample
The following Python window script demonstrates how to use the Statistics function in immediate mode.
import arcpy
arcpy.env.workspace = "C:/data/Habitat_Analysis.gdb"
arcpy.analysis.Statistics("futrds", "C:/output/output.gdb/stats", [["Shape_Length", "SUM"]], "NM")
The following stand-alone script summarizes the vegetation by area within 150 feet of major roads.
# Description: Summarize the vegetation by area within 150 feet of major roads
# Import system modules
import arcpy
# Set environment settings
arcpy.env.workspace = "C:/data"
# Set local variables
inRoads = "majorrds.shp"
outBuffer = "C:/output/output.gdb/buffer_out"
bufferDistance = "150 feet"
inVegetation = "Habitat_Analysis.gdb/vegtype"
outClip = "C:/output/output.gdb/clip_out"
joinField = "HOLLAND95"
joinTable = "c:/data/vegtable.dbf"
joinedField = "HABITAT"
outStatsTable = "C:/output/output.gdb/stats_out"
statsFields = [["Shape_Area", "SUM"]]
# Run Buffer to get a buffer of major roads
arcpy.analysis.Buffer(inRoads, outBuffer, bufferDistance, dissolve_option="ALL")
# Run Clip using the buffer output to get a clipped feature class of
# vegetation
arcpy.analysis.Clip(inVegetation, outBuffer, outClip)
# Run JoinField to add the vegetation type
arcpy.management.JoinField(outClip, joinField, joinTable, joinField, joinedField)
# Run Statistics to get the area of each vegetation type within the
# clipped buffer.
arcpy.analysis.Statistics(outClip, outStatsTable, statsFields, joinedField)
The following stand-alone script loops through the attribute fields of a dataset and constructs the statistics_fields parameter so that the SUM statistic is calculated for every numeric field.
# Description: Script that runs the Summary Statistics tool to calculate the
# Sum statistic for every numeric field based on a unique case
# field.
# Import system modules
import arcpy
# Set environment settings
arcpy.env.workspace = "C:/data/f.gdb"
# Set local variables
intable = "intable"
outtable = "sumstats"
casefield = "Name"
stats = []
# Loop through all fields in the Input Table
for field in arcpy.ListFields(intable):
    # Find the fields that have a numeric type
    if field.type in ("Double", "Integer", "Single", "SmallInteger"):
        # Add the field name and Sum statistic type to the list of fields to
        # summarize
        stats.append([field.name, "Sum"])
# Correct formatting of stats [["Field1", "Sum"], ["Field2", "Sum"], ...]
# Run Statistics with the stats list
arcpy.analysis.Statistics(intable, outtable, stats, casefield)
The following script uses a pandas DataFrame to access and display the tabular results of the Statistics function.
import arcpy
import pandas
arcpy.env.overwriteOutput = True
in_table = r"d:\data\states.shp"
out_table = r"in_memory\stats_table"
stat_fields = [['POP1990', 'SUM'], ['POP1997', 'SUM']]
stats = arcpy.analysis.Statistics(in_table, out_table, stat_fields,
                                  case_field='SUB_REGION')
# Get a list of field names to display
field_names = [i.name for i in arcpy.ListFields(out_table) if i.type != 'OID']
# Open a cursor to extract results from stats table; the with statement
# releases the cursor when the block ends
with arcpy.da.SearchCursor(out_table, field_names) as cursor:
    # Create a pandas DataFrame to display results
    df = pandas.DataFrame(data=[row for row in cursor],
                          columns=field_names)
print(df)
Environments
Special cases
- Time Zone
The mean statistic type on a Timestamp offset field will use the time zone offset from this environment.
Licensing information
- Basic: Yes
- Standard: Yes
- Advanced: Yes