# Summarize Center And Dispersion (GeoAnalytics)

## Summary

Finds central features and directional distributions and calculates mean and median locations from the input.

## Usage

• This tool can be used to measure the centrality and dispersion of features. The following are examples of situations in which this tool is useful:

• A local government is planning to open a new library for an underserved community. Centroids from block groups with the appropriate zoning and available lots have been collected. Calculating a central feature with a weight on population can be used to identify the central block group that will best serve the community.
• A GIS analyst is analyzing the locations of 911 calls and the locations of emergency response stations (police, fire, and ambulance). A mean center result can be used to compare the mean center of the emergency calls and the mean center of the response stations to optimize response time.
• A crime analyst wants to determine if the median center for burglaries shifts when evaluating daytime versus nighttime incidents. Calculating a median center with a group by time of day can be used to determine where crimes are occurring during the day and at night.
• A GIS analyst for a nongovernmental organization is analyzing the spread of an infectious disease. An ellipse can be used to model the dispersion of the outbreak.

• The Weight Field parameter is used to weight locations according to their relative importance. For example, stores in a retail chain can be weighted by total sales, or polygon features can be weighted by their area. See Using weights to learn more about how weights are applied in analysis.

• The Group By Field parameter is used to group features for separate computations of central features or dispersion. For example, wildlife observations taken throughout the year can be grouped by season or month. The field can be of integer, date, or string type. Records with null values will be grouped together.

• The central feature is the feature associated with the smallest accumulated distance to all other features in the dataset. This feature is identified and included in the Central Feature Layer output. It is possible to have more than one feature sharing the smallest accumulated distance to all other features. If this happens, all of the most centrally located features are included in the Central Feature Layer output. When a Group By Field parameter value is specified, the input features are first grouped according to the field values; then a central feature is identified for each group.
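As a rough illustration of the accumulated-distance rule described above, the following sketch finds the central feature or features among a few planar points. It is not the tool's implementation: the `central_features` function name and the sample coordinates are hypothetical, distances are plain Euclidean, and weights and grouping are left out.

```python
import math


def central_features(points, rel_tol=1e-9):
    """Return the indices of the point(s) with the smallest summed distance to all points."""
    totals = [
        sum(math.hypot(xi - xj, yi - yj) for xj, yj in points)
        for xi, yi in points
    ]
    best = min(totals)
    # Keep every point that ties for the minimum, since more than one
    # feature can share the smallest accumulated distance.
    return [i for i, total in enumerate(totals)
            if math.isclose(total, best, rel_tol=rel_tol)]


# Hypothetical sample: four points on a line; the two middle points tie
# with an accumulated distance of 12 each.
sample = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (10.0, 0.0)]
print(central_features(sample))  # prints [1, 2]
```

Because every pair of points is compared, this brute-force approach scales quadratically with the number of features, so it is only suitable for small illustrative inputs.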

• The mean center is a point constructed from the average x and y coordinates. The mean center features are included in the Mean Center Layer output. When a Group By Field value is specified, the input features are first grouped according to the field values; then the mean center is calculated for each group.
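The averaging described above, combined with an optional weight and group-by field, can be sketched in plain Python as follows. This is a minimal illustration, not the tool's code: the `mean_centers` function, the record layout with `"x"` and `"y"` keys, and the sample field names `count` and `season` are all assumptions, and weights are assumed to enter as a simple weighted average.

```python
from collections import defaultdict


def mean_centers(records, weight_field=None, group_by_field=None):
    """Return {group value: (mean x, mean y)} for dict records with "x" and "y" keys."""
    # Per group: [weighted x sum, weighted y sum, total weight]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for rec in records:
        weight = rec[weight_field] if weight_field else 1.0
        # Records with a null (None) group value end up in one shared group.
        group = rec.get(group_by_field) if group_by_field else None
        acc = sums[group]
        acc[0] += weight * rec["x"]
        acc[1] += weight * rec["y"]
        acc[2] += weight
    return {group: (sx / sw, sy / sw) for group, (sx, sy, sw) in sums.items()}


# Hypothetical wildlife observations weighted by an animal count.
observations = [
    {"x": 0.0, "y": 0.0, "count": 1, "season": "winter"},
    {"x": 4.0, "y": 0.0, "count": 3, "season": "winter"},
    {"x": 10.0, "y": 10.0, "count": 2, "season": None},
]
print(mean_centers(observations, weight_field="count", group_by_field="season"))
# prints {'winter': (3.0, 0.0), None: (10.0, 10.0)}
```

Without a weight field the winter group's mean center would be (2.0, 0.0); the weight of 3 on the second record pulls it toward x = 4.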

• Median center uses an iterative algorithm to find the geometric median point that minimizes Euclidean distance to all features in the dataset. The median center features are included in the Median Center Layer output. When a Group By Field value is specified, the input features are first grouped according to the field values; then the median center is calculated for each group. Unlike the results of the mean center operation, the median center results are less influenced by outlier features.
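One well-known iterative scheme for the geometric median is Weiszfeld's algorithm. The sketch below uses it to show how such an iteration behaves; whether the tool uses this exact scheme is not stated here, so treat it as an assumption. The `median_center` function, its tolerance values, and the sample points are hypothetical, and skipping points that coincide with the current estimate is a simplification.

```python
import math


def median_center(points, tol=1e-9, max_iter=1000):
    """Approximate the point minimizing summed Euclidean distance (Weiszfeld iteration)."""
    # Start from the mean center.
    x = sum(px for px, _ in points) / len(points)
    y = sum(py for _, py in points) / len(points)
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            dist = math.hypot(px - x, py - y)
            if dist < 1e-12:
                continue  # simplification: skip a point sitting on the estimate
            num_x += px / dist
            num_y += py / dist
            denom += 1.0 / dist
        if denom == 0.0:
            break  # every point coincides with the estimate
        new_x, new_y = num_x / denom, num_y / denom
        shift = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if shift < tol:
            break
    return x, y


# Hypothetical sample: four clustered points and one distant outlier.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (100.0, 100.0)]
print(median_center(sample))
```

For this sample the mean center is (20.4, 20.4), while the median center stays inside the cluster of four points, which illustrates why the median is less influenced by outliers.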

• Standard deviational ellipses are created to summarize the spatial characteristics of geographic features: central tendency, dispersion, and directional trends. The ellipses can be sized as 1, 2, or 3 standard deviations. The ellipse features are included in the Ellipse Layer output. When a Group By Field value is specified, the input features are first grouped according to the field values; then an ellipse is calculated for each group.
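To make the ellipse parameters more concrete, the following sketch derives a center, axis lengths, and a rotation from the covariance of the point coordinates. It is an illustration under stated assumptions rather than the tool's method: the `standard_ellipse` function and its return keys are hypothetical, population (divide by n) variances are used, and the tool may apply additional scaling or corrections that are omitted here.

```python
import math


def standard_ellipse(points, num_std=1):
    """Return center, semi-axis lengths, and rotation of a covariance-based ellipse."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n

    # Eigenvalues of the 2x2 covariance matrix give the variance along each axis.
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major = num_std * math.sqrt(half_trace + spread)
    minor = num_std * math.sqrt(max(half_trace - spread, 0.0))

    # Angle of the long axis, counterclockwise from the +x (east) direction.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    # Convert to degrees clockwise from north ("noon"), folded into [0, 180).
    rotation = (90.0 - math.degrees(theta)) % 180.0
    return {"center": (cx, cy), "major": major, "minor": minor, "rotation": rotation}


# Hypothetical sample: points stretched east-west, sized at 2 standard deviations.
sample = [(-2.0, 0.0), (2.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
print(standard_ellipse(sample, num_std=2))
```

The `num_std` argument plays the role of the ellipse size: passing 1, 2, or 3 scales both semi-axes by that factor while leaving the center and rotation unchanged.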

• You can specify one or more summary types to output. Each summary type will be output to a unique feature layer.

• If the input layer has features with null values for time or geometry, the features will not be used in analysis.

• If the input layer is time enabled, the results will not represent the temporal center. Only spatial considerations are made when calculating central tendency and dispersion.

• In addition to the fields from the input layer, the output Central Feature summary type result will include the following fields:

| Field name | Description |
| --- | --- |
| CoordX | The x-coordinate of the central feature. If the feature is a line or polygon, the value will represent the centroid of the feature. |
| CoordY | The y-coordinate of the central feature. If the feature is a line or polygon, the value will represent the centroid of the feature. |
| instant_datetime | If the input layer is time enabled with time type instant, the output result will include an instant date field representing the time of the output feature. |

• In addition to the optional Group By Field parameter value used in analysis, the output Mean Center and Median Center summary type results will include the following fields:

| Field name | Description |
| --- | --- |
| CoordX | The x-coordinate of the mean or median feature. |
| CoordY | The y-coordinate of the mean or median feature. |
| instant_datetime | If the input layer is time enabled with time type instant, the output result will include an instant date field representing the time of the output feature. |

• In addition to the optional Group By Field parameter value used in analysis, the output Ellipse summary type will include the following fields:

| Field name | Description |
| --- | --- |
| CenterX | The x-coordinate for the mean center of the ellipse. |
| CenterY | The y-coordinate for the mean center of the ellipse. |
| CenterT | The time value for the mean center of the ellipse. |
| Rotation | The rotation of the long axis measured clockwise from noon. This value is measured in degrees. |
| MajStdDist | The standard distance for the semi-major axis. This value is measured in degrees. |
| MinStdDist | The standard distance for the semi-minor axis. This value is measured in degrees. |
| TmStdDist | The temporal standard distance. This value is a duration measured in milliseconds. |

• Coordinate value attributes, for example CoordX and CoordY, will be calculated using the spatial reference of the analysis. By default, the spatial reference of the analysis will be the same as the input layer. Optionally, you can specify the spatial reference used in the analysis using the Output Coordinate System environment variable.

If you are writing results to the spatiotemporal data store, the result features will be represented by the WGS 1984 (WKID 4326) coordinate system. This means the geometry values of your result features may be stored in different coordinate systems than the output attribute values. For example, if you output a mean center layer to the spatiotemporal data store and specify the Output Coordinate System environment value of NAD 1983 UTM Zone 1N (WKID 26901), the calculated values for the CoordX and CoordY fields will be in NAD 1983 UTM Zone 1N (WKID 26901), but the features on the map will be in the WGS 1984 (WKID 4326) coordinate system.

• You can improve the performance of the Summarize Center And Dispersion tool by doing one or more of the following:

• Set the extent environment so that you only analyze data of interest.
• Use data that is local to where the analysis is being run.
• Group your data using the Group By Field parameter.
• For larger datasets, Median Center may be the least performant summary type due to its iterative calculations.

• This geoprocessing tool is powered by ArcGIS GeoAnalytics Server. Analysis is completed on your GeoAnalytics Server, and results are stored in your content in ArcGIS Enterprise.

• When running GeoAnalytics Server tools, the analysis is completed on the GeoAnalytics Server. For optimal performance, make data available to the GeoAnalytics Server through feature layers hosted on your ArcGIS Enterprise portal or through big data file shares. Data that is not local to your GeoAnalytics Server will be moved to your GeoAnalytics Server before analysis begins. This means that it will take longer to run a tool, and in some cases, moving the data from ArcGIS Pro to your GeoAnalytics Server may fail. The threshold for failure depends on your network speeds, as well as the size and complexity of the data. Therefore, it is recommended that you always share your data or create a big data file share.

## Syntax

`arcpy.geoanalytics.SummarizeCenterAndDispersion(input_layer, output_name, generate_types, {ellipse_size}, {weight_field}, {group_by_field}, {data_store})`
| Parameter | Explanation | Data Type |
| --- | --- | --- |
| input_layer | The point layer to be summarized. | Feature Set |
| output_name | The name of the output feature service. | String |
| generate_types[generate_types,...] | Specifies the summary types to be generated. You can use one or more summary types. A unique layer will be created for each summary type selected. `CENTRAL_FEATURE`: A layer will be created that contains a copy of the most central feature from the input layer. `MEAN_CENTER`: A point layer will be created that represents the mean center of the input layer. `MEDIAN_CENTER`: A point layer will be created that represents the median center of the input layer. `ELLIPSE`: A polygon layer will be created that represents the directional ellipse of the input layer. | String |
| ellipse_size (Optional) | Specifies the size of output ellipses in standard deviations. `1_STANDARD_DEVIATION`: Output ellipses will cover one standard deviation of the input features. This is the default. `2_STANDARD_DEVIATIONS`: Output ellipses will cover two standard deviations of the input features. `3_STANDARD_DEVIATIONS`: Output ellipses will cover three standard deviations of the input features. | String |
| weight_field (Optional) | A numeric field used to weight locations according to their relative importance. This applies to all summary types. | Field |
| group_by_field (Optional) | The field used to group similar features. This applies to all summary types. For example, if you choose a field named PlantType that contains values of tree, bush, and grass, all of the features with the value tree will be analyzed for their own center or dispersion. This example will result in three features, one for each group of tree, bush, and grass. | Field |
| data_store (Optional) | Specifies the ArcGIS Data Store where the output will be saved. The default is SPATIOTEMPORAL_DATA_STORE. All results stored in a spatiotemporal big data store will be stored in WGS84. Results stored in a relational data store will maintain their coordinate system. `SPATIOTEMPORAL_DATA_STORE`: Output will be stored in a spatiotemporal big data store. This is the default. `RELATIONAL_DATA_STORE`: Output will be stored in a relational data store. | String |

#### Derived Output

| Name | Explanation | Data Type |
| --- | --- | --- |
| out_central_feature_layer | The layer containing the central feature from the input layer. | Feature Class |
| out_mean_center_layer | The point layer containing the mean center representations of the input layer. | Feature Class |
| out_median_center_layer | The point layer containing the median center representations of the input layer. | Feature Class |
| out_ellipse_layer | The polygon layer containing the ellipse representations of the input layer. | Feature Class |

## Code sample

SummarizeCenterAndDispersion (stand-alone script)

The following stand-alone script demonstrates how to use the SummarizeCenterAndDispersion tool.

```python
# Name: SummarizeCenterAndDispersion.py
# Description: Calculate a standard deviational ellipse of contagious disease
#              data to understand the spread of the disease over time.
#
# Requirements: ArcGIS GeoAnalytics Server

# Import system modules
import arcpy

# Set local variables
# This example calculates a standard deviational ellipse for 3 standard
# deviations of the data
inFeatures = "https://sampleserver6.com/arcgis/rest/services/DataStoreCatalogs/bigDataFileShares_myBDFS/BigDataCatalogServer/diseaseRecords"
outFS = "disease_movement_ellipse"
summaryType = "ELLIPSE"
ellipseSize = "3_STANDARD_DEVIATIONS"
dataStore = "RELATIONAL_DATA_STORE"

# Execute SummarizeCenterAndDispersion
# Arguments follow the order given in the Syntax section. The weight and
# group-by fields are passed as empty strings because they are not used here.
arcpy.geoanalytics.SummarizeCenterAndDispersion(inFeatures, outFS, summaryType,
                                                ellipseSize, "", "", dataStore)
```

## Environments

Output Coordinate System

The coordinate system that will be used for analysis. Analysis will be completed in the input coordinate system unless otherwise specified by this parameter. For GeoAnalytics Tools, final results will be stored in the spatiotemporal data store in WGS84.

## Licensing information

• Basic: Requires ArcGIS GeoAnalytics Server
• Standard: Requires ArcGIS GeoAnalytics Server
• Advanced: Requires ArcGIS GeoAnalytics Server