Summarize Center And Dispersion (GeoAnalytics Desktop)—ArcGIS Pro

Summary

Finds central features and directional distributions and calculates mean and median locations from the input.

Usage

This tool measures the centrality and dispersion of features. The following are examples of situations in which the tool is beneficial:
- A local government is planning to open a new library for an underserved community. Centroids from block groups with the appropriate zoning and available lots have been collected. Calculating a central feature with a weight on population can be used to identify the central block group that will best serve the community.
- A GIS analyst is analyzing the locations of 911 calls and the locations of emergency response stations (police, fire, and ambulance). Calculating a mean center can be used to compare the mean center of the emergency calls and the mean center of the response stations to optimize response time.
- A crime analyst wants to determine if the median center for burglaries shifts when evaluating daytime versus nighttime incidents. Calculating a median center with the incidents grouped by hour of day can be used to determine where crimes are occurring during the day and at night.
- A GIS analyst for a nongovernmental organization is analyzing the spread of an infectious disease. An ellipse can be used to model the dispersion of the outbreak.
For input line and polygon features, feature centroids are used in distance computations.
The Weight Field parameter is used to weight locations according to their relative importance. For example, stores in a retail chain can be weighted by total sales, or polygon features can be weighted by their area. See Using weights to learn more about how weights are applied in analysis.
The Group By Field parameter is used to group features for separate computations of central features or dispersion. For example, wildlife observations taken throughout the year can be grouped by season or month. The field can be of integer, date, or string type. Records with null values will be grouped together.
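The grouping behavior described above can be pictured with a short, tool-independent sketch. The record layout and the `group_records` helper are hypothetical and are not part of ArcPy; the only behavior taken from the text is that records with a null (`None`) group value end up together in one group.

```python
from collections import defaultdict


def group_records(records, group_field):
    """Group dict records by the value of group_field.

    Records whose group value is None (null) share a single group,
    mirroring the behavior described for the Group By Field parameter.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record.get(group_field)].append(record)
    return dict(groups)


# Hypothetical wildlife observations; two records have no season recorded.
observations = [
    {"species": "elk", "season": "winter"},
    {"species": "elk", "season": "summer"},
    {"species": "fox", "season": "winter"},
    {"species": "fox", "season": None},
    {"species": "owl", "season": None},
]

groups = group_records(observations, "season")
for season, members in groups.items():
    print(season, len(members))
```

Each group would then be summarized separately, so this sample would yield three sets of results: winter, summer, and the null group.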
The central feature is the feature associated with the smallest accumulated distance to all other features in the dataset. This feature is identified and included in the Central Feature Layer output. It is possible to have more than one feature sharing the smallest accumulated distance to all other features. If this happens, all of the most centrally located features are included in the Central Feature Layer output. When a Group By Field parameter value is specified, the input features are first grouped according to the field values; then a central feature is identified for each group. The geometry type of the output central feature will be the same as the input features.
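As a rough illustration of the definition above, the following sketch finds the point or points with the smallest accumulated distance to all other points, returning every tied candidate. It is a brute-force sketch in plain Python, not the tool's implementation. The `central_features` name is hypothetical, and the weighting scheme (each distance multiplied by the weight of the feature being measured to) is an assumption made for illustration.

```python
import math


def central_features(points, weights=None, tol=1e-9):
    """Return the indices of the point(s) with the smallest accumulated
    (optionally weighted) Euclidean distance to all other points."""
    if weights is None:
        weights = [1.0] * len(points)

    totals = []
    for xi, yi in points:
        # Accumulate the distance from this candidate to every point.
        # The candidate's distance to itself is 0, so it adds nothing.
        total = sum(
            w * math.hypot(xi - xj, yi - yj)
            for (xj, yj), w in zip(points, weights)
        )
        totals.append(total)

    best = min(totals)
    # Keep every candidate that ties for the smallest total.
    return [i for i, total in enumerate(totals) if total - best <= tol]


points = [(0, 0), (1, 0), (2, 0), (10, 0)]
print(central_features(points))
print(central_features(points, weights=[5, 1, 1, 1]))
```

In the unweighted call, the second and third points tie with an accumulated distance of 11, so both indices (1 and 2) are returned. Giving the first point a weight of 5 makes it the single central feature.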
The mean center is a point constructed from the average x- and y-coordinates. The mean center features are included in the Mean Center Layer output. When a Group By Field value is specified, the input features are first grouped according to the field values; then the mean center is calculated for each group.
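The averaging described above can be sketched in a few lines of plain Python. The `mean_center` helper and the sample coordinates are hypothetical. The weighted form shown here is the usual weighted average and is assumed, not confirmed, to match how the tool applies the Weight Field parameter.

```python
def mean_center(points, weights=None):
    """Return the (x, y) mean center of points, optionally weighted."""
    if weights is None:
        weights = [1.0] * len(points)
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")

    x = sum(w * px for (px, _), w in zip(points, weights)) / total_weight
    y = sum(w * py for (_, py), w in zip(points, weights)) / total_weight
    return x, y


# Hypothetical store locations at the corners of a square.
stores = [(0, 0), (4, 0), (4, 4), (0, 4)]

unweighted = mean_center(stores)
# Weight the last store five times as heavily, for example by total sales.
weighted = mean_center(stores, weights=[1, 1, 1, 5])
print(unweighted, weighted)
```

The unweighted mean center is the middle of the square, (2.0, 2.0). The weighted result, (1.0, 3.0), is pulled toward the heavily weighted store at (0, 4).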
The median center is found with an iterative algorithm that locates the point minimizing the total Euclidean distance to all features in the dataset. The median center features are included in the Median Center Layer output. When a Group By Field value is specified, the input features are first grouped according to the field values; then the median center is calculated for each group. Compared with the mean center, the median center is less influenced by outlier features.
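One common iterative method for this kind of problem is Weiszfeld's algorithm, sketched below in plain Python. The documentation does not state which algorithm the tool uses, so this is an illustration of the idea and not a description of the tool's internals. The `median_center` name, the iteration limit, and the tolerance are all assumptions.

```python
import math


def median_center(points, max_iter=1000, tol=1e-9):
    """Approximate the point minimizing total Euclidean distance to points
    using Weiszfeld's iteratively reweighted averaging."""
    # Start from the mean center.
    x = sum(px for px, _ in points) / len(points)
    y = sum(py for _, py in points) / len(points)

    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < 1e-12:
                # Simplification: skip a point that coincides with the
                # current estimate to avoid dividing by zero.
                continue
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        if denom == 0.0:
            break
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break
    return x, y


# Four clustered points plus one distant outlier.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)]
mean_x = sum(px for px, _ in points) / len(points)
center = median_center(points)
print(mean_x, center)
```

For this sample, the mean x-coordinate is dragged out to about 20.4 by the outlier, while the median center stays inside the cluster at roughly (0.79, 0.79).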
Standard deviational ellipses are created to summarize the spatial characteristics of geographic features: central tendency, dispersion, and directional trends. The ellipses can be sized as 1, 2, or 3 standard deviations. The ellipse features are included in the Ellipse Layer output. When a Group By Field value is specified, the input features are first grouped according to the field values; then an ellipse is calculated for each group.
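To make the ellipse summary more concrete, the sketch below derives a center, a rotation, and two axis lengths from the covariance of the coordinates. This is a generic covariance-based construction and is only assumed to resemble the tool's calculation; the tool's exact scaling may differ. The function name is hypothetical, the dictionary keys borrow the output field names for readability, and the sketch reports rotation in degrees clockwise from north as its own convention.

```python
import math


def standard_deviational_ellipse(points, size=1):
    """Summarize points as an ellipse: center, rotation, and axis lengths.

    size is the number of standard deviations (1, 2, or 3).
    """
    if size not in (1, 2, 3):
        raise ValueError("size must be 1, 2, or 3")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    # Population variances and covariance of the coordinates.
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n

    # Eigenvalues of the 2x2 covariance matrix are the variances
    # along the major and minor axes.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major = math.sqrt(half_trace + spread)
    minor = math.sqrt(max(half_trace - spread, 0.0))

    # Major-axis angle, counterclockwise from the x-axis (east).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Convert to degrees clockwise from north, folded into [0, 180).
    rotation = (90.0 - math.degrees(theta)) % 180.0

    return {
        "CenterX": cx,
        "CenterY": cy,
        "Rotation": rotation,
        "MajStdDist": size * major,
        "MinStdDist": size * minor,
    }


sample_points = [(-2, -2), (2, 2), (-1, 1), (1, -1)]
ellipse = standard_deviational_ellipse(sample_points)
print(ellipse)
```

The sample points are stretched along the northeast-southwest diagonal, so the sketch reports a rotation of about 45 degrees, a major standard distance of 2, and a minor standard distance of 1. Passing `size=2` doubles both axis lengths.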
You can specify one or more summary types to output. Each summary type will be output to a unique feature layer.
If the input layer includes features with null values for time or geometry, those features will not be used in analysis.

In addition to the fields from the input layer, the Output Central Feature parameter result will include the following fields:


Field name	Description
CoordX	The x-coordinate of the central feature. If the feature is a line or polygon, the value will represent the centroid of the feature.
CoordY	The y-coordinate of the central feature. If the feature is a line or polygon, the value will represent the centroid of the feature.
date	If the input layer is time enabled with time type instant, the output result will include an instant date field representing the time of the output feature.
start_date	If the input layer is time enabled with time type interval, the output result will include a start date field representing the start time of the output feature.
end_date	If the input layer is time enabled with time type interval, the output result will include an end date field representing the end time of the output feature.

In addition to the optional Group By Field parameter value used in analysis, the Output Mean Center and Output Median Center parameter results will include the following fields:


Field name	Description
CoordX	The x-coordinate of the mean or median feature.
CoordY	The y-coordinate of the mean or median feature.
date	If the input layer is time enabled, the output result will include an instant date field representing the mean or median time of the input features. This applies to input layers of both interval and instant time types.

In addition to the optional Group By Field parameter value used in analysis, the Output Ellipse parameter result will include the following fields:


Field name	Description
CenterX	The x-coordinate for the mean center of the ellipse.
CenterY	The y-coordinate for the mean center of the ellipse.
CenterT	The time value for the mean center of the ellipse.
Rotation	The rotation of the long axis measured clockwise from noon. The rotation is measured in the units of the input's spatial reference. For example, a projected dataset could be measured in meters, and a geographic dataset could be measured in degrees.
MajStdDist	The standard distance for the major axis. The distance is measured in the units of the input's spatial reference. For example, a dataset with a projected spatial reference could be measured in meters, and a dataset with a geographic spatial reference could be measured in degrees.
MinStdDist	The standard distance for the minor axis. The distance is measured in the units of the input's spatial reference. For example, a dataset with a projected spatial reference could be measured in meters, and a dataset with a geographic spatial reference could be measured in degrees.
TmStdDist	The temporal standard distance. This value is a duration measured in milliseconds.
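
Because TmStdDist is reported in milliseconds, it can be easier to read after conversion to a duration. The following sketch uses only the Python standard library. The helper name and the sample value are hypothetical.

```python
from datetime import timedelta


def tm_std_dist_to_timedelta(milliseconds):
    """Convert a TmStdDist value (milliseconds) to a timedelta."""
    return timedelta(milliseconds=milliseconds)


# Hypothetical TmStdDist value read from an output ellipse feature.
spread = tm_std_dist_to_timedelta(129_600_000)
print(spread)                         # 1 day, 12:00:00
print(spread.total_seconds() / 3600)  # 36.0
```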

All output coordinate values will be calculated using the spatial reference of the analysis. By default, the spatial reference of the analysis will be the same as the input layer. Optionally, you can specify the spatial reference used in the analysis using the Output Coordinate System environment variable.
You can improve the performance of the Summarize Center And Dispersion tool by doing one or more of the following:
- Set the extent environment so that you only analyze data of interest.
- Use data that is local to where the analysis is being run.
- Group your data using the Group By Field parameter.
- For larger datasets, avoid requesting the median center (the Output Median Center parameter), as it may be the least performant summary type due to its iterative calculations.
Similar analysis can also be completed using the following Spatial Statistics tools:
- Find the central feature using the Central Feature tool.
- Calculate the mean center using the Mean Center tool.
- Calculate the median center using the Median Center tool.
- Calculate an ellipse using the Directional Distribution (Standard Deviational Ellipse) tool.
This geoprocessing tool is powered by Spark. Analysis is completed on your desktop machine using multiple cores in parallel. See Considerations for GeoAnalytics Desktop tools to learn more about running analysis.
When running GeoAnalytics Desktop tools, the analysis is completed on your desktop machine. For optimal performance, data should be available on your desktop. If you are using a hosted feature layer, it is recommended that you use ArcGIS GeoAnalytics Server. If your data isn't local, it will take longer to run a tool. To use your ArcGIS GeoAnalytics Server to perform analysis, see GeoAnalytics Tools.

Parameters

Label	Explanation	Data Type
Input Layer	The point, line, or polygon layer to be summarized.	Feature Layer
Output Central Feature	The output feature class that will contain the most centrally located feature in the input layer.	Feature Class
Output Mean Center (Optional)	The output point feature class that will contain features representing the mean centers of the input layer.	Feature Class
Output Median Center (Optional)	The output point feature class that will contain features representing the median centers of the input layer.	Feature Class
Output Ellipse (Optional)	The output polygon feature class that will contain the directional ellipse representation of the input layer.	Feature Class
Ellipse Size (Optional)	Specifies the size of output ellipses in standard deviations. One standard deviation—Output ellipses will cover one standard deviation of the input features. This is the default. Two standard deviations—Output ellipses will cover two standard deviations of the input features. Three standard deviations—Output ellipses will cover three standard deviations of the input features.	String
Weight Field (Optional)	A numeric field used to weight locations according to their relative importance. This applies to all summary types.	Field
Group By Field (Optional)	The field used to group similar features. This applies to all summary types. For example, if you choose a field named PlantType that contains values of tree, bush, and grass, all of the features with the value tree will be analyzed for their own center or dispersion. This example will result in three features, one for each group of tree, bush, and grass.	Field

arcpy.gapro.SummarizeCenterAndDispersion(input_layer, out_central_feature, {out_mean_center}, {out_median_center}, {out_ellipse}, {ellipse_size}, {weight_field}, {group_by_field})

Name	Explanation	Data Type
input_layer	The point, line, or polygon layer to be summarized.	Feature Layer
out_central_feature	The output feature class that will contain the most centrally located feature in the input layer.	Feature Class
out_mean_center (Optional)	The output point feature class that will contain features representing the mean centers of the input layer.	Feature Class
out_median_center (Optional)	The output point feature class that will contain features representing the median centers of the input layer.	Feature Class
out_ellipse (Optional)	The output polygon feature class that will contain the directional ellipse representation of the input layer.	Feature Class
ellipse_size (Optional)	Specifies the size of output ellipses in standard deviations. 1_STANDARD_DEVIATION—Output ellipses will cover one standard deviation of the input features. This is the default. 2_STANDARD_DEVIATIONS—Output ellipses will cover two standard deviations of the input features. 3_STANDARD_DEVIATIONS—Output ellipses will cover three standard deviations of the input features.	String
weight_field (Optional)	A numeric field used to weight locations according to their relative importance. This applies to all summary types.	Field
group_by_field (Optional)	The field used to group similar features. This applies to all summary types. For example, if you choose a field named PlantType that contains values of tree, bush, and grass, all of the features with the value tree will be analyzed for their own center or dispersion. This example will result in three features, one for each group of tree, bush, and grass.	Field

Code sample

SummarizeCenterAndDispersion (stand-alone script)

The following stand-alone script demonstrates how to use the SummarizeCenterAndDispersion function.

# Name: SummarizeCenterAndDispersion.py
# Description: Calculate the directionality and movement of fire occurrences 
#              over time. This sample calculates a mean center and a standard 
#              deviational ellipse.
# Requirements: ArcGIS Pro Advanced license 

# Import system modules
import arcpy

# Set local variables
inFeatures = r"c:\data\MyBigDataConnection.bdc\fire_incidents"
outMeanCenter = r"c:\data\FireIncidents.gdb\fires_meancenter"
outEllipse = r"c:\data\FireIncidents.gdb\fires_ellipse"


# Run SummarizeCenterAndDispersion
arcpy.gapro.SummarizeCenterAndDispersion(inFeatures, "", outMeanCenter, "", 
                                         outEllipse, "2_STANDARD_DEVIATIONS")
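
SummarizeCenterAndDispersion with weight and group fields (illustrative sketch)

The following sketch suggests how the optional weight_field and group_by_field parameters might be combined with the Output Coordinate System environment. The keyword names come from the syntax above. The output paths, the field names (acres_burned and cause), and the coordinate system are hypothetical placeholders. The arcpy import sits inside `run_summary` so that the parameter helper can be inspected on a machine without ArcGIS Pro.

```python
# Name: SummarizeCenterAndDispersionGrouped.py
# Description: Sketch of a weighted, grouped run that writes a central
#              feature, a median center, and a one standard deviation
#              ellipse for each group.
# Requirements: ArcGIS Pro Advanced license
# Note: All paths and field names below are hypothetical placeholders.


def build_parameters():
    """Return keyword arguments named after the tool's documented parameters."""
    gdb = r"c:\data\FireIncidents.gdb"
    return {
        "input_layer": r"c:\data\MyBigDataConnection.bdc\fire_incidents",
        "out_central_feature": gdb + r"\fires_central",
        "out_median_center": gdb + r"\fires_mediancenter",
        "out_ellipse": gdb + r"\fires_ellipse_by_cause",
        "ellipse_size": "1_STANDARD_DEVIATION",
        "weight_field": "acres_burned",  # hypothetical numeric field
        "group_by_field": "cause",       # hypothetical grouping field
    }


def run_summary():
    """Run the tool. Requires an ArcGIS Pro Python environment."""
    # Imported here so build_parameters() works without ArcGIS Pro.
    import arcpy

    # Optional: analyze in a specific coordinate system.
    # WKID 3857 is only an example choice.
    arcpy.env.outputCoordinateSystem = arcpy.SpatialReference(3857)

    return arcpy.gapro.SummarizeCenterAndDispersion(**build_parameters())
```

Call `run_summary()` from a Python environment in which ArcGIS Pro is installed and licensed.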

Environments

Output Coordinate System, Extent, Current Workspace

Licensing information

Basic: No
Standard: No
Advanced: Yes