Emerging Hot Spot Analysis (Space Time Pattern Mining)—ArcGIS Pro

Summary

Identifies trends in the clustering of point densities (counts) or values in a space-time cube created using the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, or Create Space Time Cube from Multidimensional Raster Layer tool. Categories include new, consecutive, intensifying, persistent, diminishing, sporadic, oscillating, and historical hot and cold spots.

Learn more about how the Emerging Hot Spot Analysis tool works

Usage

This tool can only accept netCDF files created by the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, or Create Space Time Cube from Multidimensional Raster Layer tool.
Each bin in the space-time cube has a LOCATION_ID, time_step_ID, COUNT value, and any Summary Fields or Variables that were aggregated when the cube was created. Bins associated with the same physical location will share the same location ID and together will represent a time series. Bins associated with the same time-step interval will share the same time-step ID and together will comprise a time slice. The count value for each bin reflects the number of incidents or records that occurred at the associated location within the associated time-step interval.
This tool analyzes a variable in the netCDF Input Space Time Cube using a space-time implementation of the Getis-Ord Gi* statistic.
The Output Features will be added to the Contents pane with rendering that summarizes results of the space-time analysis for all locations analyzed. If you specify a Polygon Analysis Mask, the locations analyzed will be those that fall within the analysis mask; otherwise, the locations analyzed will be those with at least one point for at least one time-step interval.
In addition to the Output Features, an analysis summary is written as messages at the bottom of the Geoprocessing pane during tool execution. You can access the messages by hovering over the progress bar, clicking the pop-out button, or expanding the details section of the messages in the Geoprocessing pane. You can also access the messages for a previously run tool via the Geoprocessing History.
The Emerging Hot Spot Analysis tool can detect eight specific hot or cold spot trends: new, consecutive, intensifying, persistent, diminishing, sporadic, oscillating, and historical. See Learn more about how the Emerging Hot Spot Analysis tool works for output category definitions and additional information about the algorithms this tool employs.
To get a measure of the intensity of feature clustering, this tool uses a space-time implementation of the Getis-Ord Gi* statistic, which considers the value for each bin within the context of the values for neighboring bins.
To determine which bins will be included in each analysis neighborhood, the tool first finds neighboring bins that fall within the specified Conceptualization of Spatial Relationships. Next, for each of those bins, it includes bins at those same locations from N previous time steps, where N is the Neighborhood Time Step value you specify.
Your choice for the Conceptualization of Spatial Relationships parameter should reflect inherent relationships among the features you are analyzing. The more realistically you can model how features interact with each other in space, the more accurate your results will be. Recommendations are outlined in Selecting a Conceptualization of Spatial Relationships: Best Practices.
The default Conceptualization of Spatial Relationships is Fixed distance. A bin is considered a neighbor if its centroid falls within the Neighborhood Distance and its time interval is within the Neighborhood Time Step you specify. When you do not provide a Neighborhood Distance value, one is calculated for you based on the spatial distribution of your point data. When you do not provide a Neighborhood Time Step value, the tool uses a default value of 1 time-step interval.
The Number of Neighbors parameter may override the Neighborhood Distance for the Fixed distance option or extend the neighbor search for the Contiguity edges only and Contiguity edges corners options. In these cases, Number of Neighbors is used as a minimum number. For instance, if you specify Fixed distance with a Neighborhood Distance of 10 miles and 3 for the Number of Neighbors parameter, all bins will receive a minimum of 3 spatial neighbors even if the Neighborhood Distance has to be increased to find them. The distance is only increased for those bins where the minimum Number of Neighbors is not met. Likewise, with the contiguity options, for bins with fewer than this number of contiguous neighbors, additional neighbors will be chosen based on centroid proximity.
The Neighborhood Time Step value is the number of time-step intervals to include in the analysis neighborhood. If the time-step interval for your cube is three months, for example, and you specify 2 for the Neighborhood Time Step, all bin counts within the Neighborhood Distance, and all of their associated bins for the previous two time-step intervals (covering a nine-month time period), will be included in the analysis neighborhood.
The Polygon Analysis Mask feature layer can include one or more polygons defining the analysis study area. These polygons indicate where point features could possibly occur and should exclude areas where points would be impossible. If you were analyzing residential burglary trends, for example, you might use the Polygon Analysis Mask to exclude a large lake, regional parks, or other areas where there aren't any homes.
The Polygon Analysis Mask is intersected with the extent of the Input Space Time Cube and will not extend the dimensions of the cube.
If the Polygon Analysis Mask that you are using to set your study area covers an area beyond the extent of the input features that were used when initially creating the cube, you may want to re-create your cube using that Polygon Analysis Mask as the Output extent environment setting. Doing so ensures that the extent of the cube matches the extent of the Polygon Analysis Mask, so all of the area covered by the mask is included when running the Emerging Hot Spot Analysis tool.
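The neighborhood rules described above (Fixed distance, the Number of Neighbors minimum, and the Neighborhood Time Step) can be illustrated with a small pure-Python sketch. It is not the tool's implementation: the helper names `spatial_neighbors` and `space_time_neighborhood`, the dictionary of centroids, and the choice to include the target bin in its own neighborhood are assumptions made for illustration.

```python
import math


def spatial_neighbors(centroids, target, distance, min_neighbors=0):
    """Return IDs of locations whose centroid lies within `distance` of the target.

    If fewer than `min_neighbors` are found, the search is extended to the
    nearest locations for this target only (hypothetical helper, not arcpy).
    """
    tx, ty = centroids[target]
    # Rank every other location by centroid distance, nearest first.
    ranked = sorted(
        (math.hypot(x - tx, y - ty), loc)
        for loc, (x, y) in centroids.items()
        if loc != target
    )
    inside = [loc for d, loc in ranked if d <= distance]
    if len(inside) < min_neighbors:
        # Extend the search until the minimum count is met.
        inside = [loc for _, loc in ranked[:min_neighbors]]
    return inside


def space_time_neighborhood(centroids, target, time_step, distance,
                            n_time_steps=1, min_neighbors=0):
    """Return (location_id, time_step_id) pairs in the analysis neighborhood.

    Assumption: the target location is part of its own neighborhood, and the
    window covers the current step plus up to `n_time_steps` previous steps.
    """
    locations = [target] + spatial_neighbors(
        centroids, target, distance, min_neighbors
    )
    first_step = max(0, time_step - n_time_steps)
    return [
        (loc, t)
        for t in range(first_step, time_step + 1)
        for loc in locations
    ]


# Four locations along a line; location 3 is far from the others.
centroids = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (5.0, 0.0)}
print(spatial_neighbors(centroids, 0, 1.5))                   # within 1.5 units
print(spatial_neighbors(centroids, 0, 1.5, min_neighbors=2))  # search extended
print(space_time_neighborhood(centroids, 0, 3, 1.5, n_time_steps=2))
```

With a three-month time-step interval and `n_time_steps=2`, the last call gathers bins from the current step and the two before it, which corresponds to the nine-month window mentioned above.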

Running Emerging Hot Spot Analysis adds some analysis results back to the netCDF Input Space Time Cube. Three analyses are performed:

Each bin is analyzed within the context of neighboring bins to measure how intense clustering is for both high and low values. The result from this analysis is a z-score, p-value, and binning category for every bin in the space-time cube.
The time series of these z-scores at the locations analyzed is then assessed using the Mann-Kendall statistic. The result from this analysis is a clustering trend z-score, p-value, and binning category for each location.
Finally, the time series of the values at the locations analyzed is assessed using the Mann-Kendall statistic. The result from this analysis is a trend z-score, p-value, and binning category for each location.
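The trend assessment in the second and third analyses can be sketched with the textbook Mann-Kendall test. The version below uses the standard tie-corrected variance and a normal approximation for the p-value; how the tool itself handles ties and edge cases is not stated here, so treat those details as assumptions. The `trend_bin` helper is hypothetical and simply maps the 90, 95, and 99 percent confidence levels onto the -3 to 3 codes.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (z_score, p_value) of the Mann-Kendall trend test for a time series."""
    n = len(series)
    # S counts later-minus-earlier increases minus decreases over all pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S with the standard correction for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    if s == 0 or var_s == 0:
        z = 0.0
    elif s > 0:
        z = (s - 1) / math.sqrt(var_s)
    else:
        z = (s + 1) / math.sqrt(var_s)
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def trend_bin(z, p):
    """Map a z-score and p-value to a -3..3 trend bin (assumed thresholds)."""
    if p < 0.01:
        level = 3
    elif p < 0.05:
        level = 2
    elif p < 0.10:
        level = 1
    else:
        level = 0
    return level if z > 0 else -level


# A steadily increasing series of bin values at one location.
z, p = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(z, p, trend_bin(z, p))
```

Applied to the time series of hot spot z-scores at a location, the same test would give the clustering trend; applied to the raw bin values, it would give the value trend.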

A summary of the variables added to the Input Space Time Cube is given below:


Variable Name	Description	Dimension
EMERGING_{ANALYSIS_VARIABLE}_HS_PVALUE	Getis-Ord Gi* statistic p-value measuring the statistical significance of high value (hot spot) and low value (cold spot) clustering.	Three dimensions: one p-value for every bin in the space-time cube.
EMERGING_{ANALYSIS_VARIABLE}_HS_ZSCORE	Getis-Ord Gi* statistic z-score measuring the intensity of high value (hot spot) and low value (cold spot) clustering.	Three dimensions: one z-score for every bin in the space-time cube.
EMERGING_{ANALYSIS_VARIABLE}_HS_BIN	The result category used to classify each bin as a statistically significant hot or cold spot value. The bin is based on an FDR correction. -3: cold spot, 99 percent confidence; -2: cold spot, 95 percent confidence; -1: cold spot, 90 percent confidence; 0: not a statistically significant hot or cold spot; 1: hot spot, 90 percent confidence; 2: hot spot, 95 percent confidence; 3: hot spot, 99 percent confidence.	Three dimensions: one binning category for every bin in the space-time cube.
{ANALYSIS_VARIABLE}_TREND_PVALUE	The Mann-Kendall p-value measuring statistical significance of the trend in values at a location.	Two dimensions: one p-value for each location analyzed.
{ANALYSIS_VARIABLE}_TREND_ZSCORE	The z-score measuring the Mann-Kendall trend, up or down, associated with the values at a location. A positive z-score indicates an upward trend; a negative z-score indicates a downward trend.	Two dimensions: one z-score for each location analyzed.
{ANALYSIS_VARIABLE}_TREND_BIN	The result category used to classify each location as having a statistically significant upward or downward trend for the values. -3: down trend, 99 percent confidence; -2: down trend, 95 percent confidence; -1: down trend, 90 percent confidence; 0: no significant trend; 1: up trend, 90 percent confidence; 2: up trend, 95 percent confidence; 3: up trend, 99 percent confidence.	Two dimensions: one binning category for each location analyzed.
EMERGING_{ANALYSIS_VARIABLE}_TREND_PVALUE	The Mann-Kendall p-value measuring statistical significance of the trend in hot/cold spot z-scores at a location.	Two dimensions: one p-value for each location analyzed.
EMERGING_{ANALYSIS_VARIABLE}_TREND_ZSCORE	The z-score measuring the Mann-Kendall trend, up or down, associated with the trend in hot/cold spot z-scores at a location. A positive z-score indicates an upward trend; a negative z-score indicates a downward trend.	Two dimensions: one z-score for each location analyzed.
EMERGING_{ANALYSIS_VARIABLE}_TREND_BIN	The result category used to classify each location as having a statistically significant upward or downward trend for hot/cold spot z-scores. -3: down trend, 99 percent confidence; -2: down trend, 95 percent confidence; -1: down trend, 90 percent confidence; 0: no significant trend; 1: up trend, 90 percent confidence; 2: up trend, 95 percent confidence; 3: up trend, 99 percent confidence.	Two dimensions: one binning category for each location analyzed.
EMERGING_{ANALYSIS_VARIABLE}_CATEGORY	One of 17 categories: 1 to 8, 0, and -1 to -8. 1: new hot spot; 2: consecutive hot spot; 3: intensifying hot spot; 4: persistent hot spot; 5: diminishing hot spot; 6: sporadic hot spot; 7: oscillating hot spot; 8: historical hot spot; 0: no pattern detected; -1: new cold spot; -2: consecutive cold spot; -3: intensifying cold spot; -4: persistent cold spot; -5: diminishing cold spot; -6: sporadic cold spot; -7: oscillating cold spot; -8: historical cold spot.	Two dimensions: one category for each location analyzed.
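When working with these variables in a script, it can help to generate the expected names and decode the category codes. The following sketch builds both directly from the table above. The helper names `cube_variable_names` and `describe_category` are hypothetical, and the sketch assumes the analysis variable is substituted into the name exactly as given (no change of case).

```python
# Name templates copied from the table above; {v} stands for the analysis variable.
VARIABLE_TEMPLATES = [
    "EMERGING_{v}_HS_PVALUE",
    "EMERGING_{v}_HS_ZSCORE",
    "EMERGING_{v}_HS_BIN",
    "{v}_TREND_PVALUE",
    "{v}_TREND_ZSCORE",
    "{v}_TREND_BIN",
    "EMERGING_{v}_TREND_PVALUE",
    "EMERGING_{v}_TREND_ZSCORE",
    "EMERGING_{v}_TREND_BIN",
    "EMERGING_{v}_CATEGORY",
]

# Pattern names for the absolute value of a category code.
PATTERNS = {
    1: "new",
    2: "consecutive",
    3: "intensifying",
    4: "persistent",
    5: "diminishing",
    6: "sporadic",
    7: "oscillating",
    8: "historical",
}


def cube_variable_names(analysis_variable):
    """Return the variable names expected in the cube after the tool runs."""
    return [t.format(v=analysis_variable) for t in VARIABLE_TEMPLATES]


def describe_category(code):
    """Translate a category code (-8..8) into its label from the table."""
    if code == 0:
        return "no pattern detected"
    kind = "hot" if code > 0 else "cold"
    return f"{PATTERNS[abs(code)]} {kind} spot"


print(cube_variable_names("COUNT"))
print(describe_category(3))
print(describe_category(-8))
```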

Parameters

Label	Explanation	Data Type
Input Space Time Cube	The netCDF cube to be analyzed. This file must have an .nc extension and must have been created using the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, or Create Space Time Cube from Multidimensional Raster Layer tool.	File
Analysis Variable	The numeric variable in the netCDF file you want to analyze.	String
Output Features	The output feature class results. This feature class will be a two-dimensional map representation of the hot and cold spot trends in your data. It will show, for example, any new or intensifying hot spots.	Feature Class
Neighborhood Distance (Optional)	The spatial extent of the analysis neighborhood. This value determines which features are analyzed together in order to assess local space-time clustering.	Linear Unit
Neighborhood Time Step (Optional)	The number of time-step intervals to include in the analysis neighborhood. This value determines which features are analyzed together in order to assess local space-time clustering.	Long
Polygon Analysis Mask (Optional)	A polygon feature layer with one or more polygons defining the analysis study area. You would use a polygon analysis mask to exclude a large lake from the analysis, for example. Bins defined in the Input Space Time Cube that fall outside of the mask will not be included in the analysis. This parameter is only available for grid cubes.	Feature Layer
Conceptualization of Spatial Relationships (Optional)	Specifies how spatial relationships among features are defined. Fixed distance—Each bin is analyzed within the context of neighboring bins. Neighboring bins inside the specified critical distance (Neighborhood Distance) receive a weight of one and exert influence on computations for the target bin. Neighboring bins outside the critical distance receive a weight of zero and have no influence on a target bin's computations. K nearest neighbors—The closest k bins are included in the analysis for the target bin; k is a specified numeric parameter. Contiguity edges only—Only neighboring bins that share an edge will influence computations for the target polygon bin. Contiguity edges corners—Bins that share an edge or share a node will influence computations for the target polygon bin.	String
Number of Spatial Neighbors (Optional)	An integer specifying either the minimum or the exact number of neighbors to include in calculations for the target bin. For K nearest neighbors, each bin will have exactly this specified number of neighbors. For Fixed distance, each bin will have at least this many neighbors (the threshold distance will be temporarily extended to ensure this many neighbors if necessary). When one of the contiguity conceptualizations is selected, each bin will be assigned this minimum number of neighbors. For bins with fewer than this number of contiguous neighbors, additional neighbors will be based on feature centroid proximity.	Long
Define Global Window (Optional)	The statistic works by comparing a local statistic calculated from the neighbors for each bin to a global value. This parameter can be used to control which bins are used to calculate the global value. Entire cube—Each neighborhood is analyzed in comparison to the entire cube. This is the default. Neighborhood Time Step—Each neighborhood is analyzed in comparison to the bins contained within the Neighborhood Time Step specified. Individual time step—Each neighborhood is analyzed in comparison to the bins in the same time step.	String

arcpy.stpm.EmergingHotSpotAnalysis(in_cube, analysis_variable, output_features, {neighborhood_distance}, {neighborhood_time_step}, {polygon_mask}, {conceptualization_of_spatial_relationships}, {number_of_neighbors}, {define_global_window})

Name	Explanation	Data Type
in_cube	The netCDF cube to be analyzed. This file must have an .nc extension and must have been created using the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, or Create Space Time Cube from Multidimensional Raster Layer tool.	File
analysis_variable	The numeric variable in the netCDF file you want to analyze.	String
output_features	The output feature class results. This feature class will be a two-dimensional map representation of the hot and cold spot trends in your data. It will show, for example, any new or intensifying hot spots.	Feature Class
neighborhood_distance (Optional)	The spatial extent of the analysis neighborhood. This value determines which features are analyzed together in order to assess local space-time clustering.	Linear Unit
neighborhood_time_step (Optional)	The number of time-step intervals to include in the analysis neighborhood. This value determines which features are analyzed together in order to assess local space-time clustering.	Long
polygon_mask (Optional)	A polygon feature layer with one or more polygons defining the analysis study area. You would use a polygon analysis mask to exclude a large lake from the analysis, for example. Bins defined in the in_cube that fall outside of the mask will not be included in the analysis. This parameter is only available for grid cubes.	Feature Layer
conceptualization_of_spatial_relationships (Optional)	Specifies how spatial relationships among bins are defined. FIXED_DISTANCE—Each bin is analyzed within the context of neighboring bins. Neighboring bins inside the specified critical distance (neighborhood_distance) receive a weight of one and exert influence on computations for the target bin. Neighboring bins outside the critical distance receive a weight of zero and have no influence on a target bin's computations. K_NEAREST_NEIGHBORS—The closest k bins are included in the analysis for the target bin; k is a specified numeric parameter. CONTIGUITY_EDGES_ONLY—Only neighboring bins that share an edge will influence computations for the target polygon bin. CONTIGUITY_EDGES_CORNERS—Bins that share an edge or share a node will influence computations for the target polygon bin.	String
number_of_neighbors (Optional)	An integer specifying either the minimum or the exact number of neighbors to include in calculations for the target bin. For K_NEAREST_NEIGHBORS, each bin will have exactly this specified number of neighbors. For FIXED_DISTANCE, each bin will have at least this many neighbors (the threshold distance will be temporarily extended to ensure this many neighbors if necessary). When one of the contiguity conceptualizations is selected, each bin will be assigned this minimum number of neighbors. For bins with fewer than this number of contiguous neighbors, additional neighbors will be based on feature centroid proximity.	Long
define_global_window (Optional)	The statistic works by comparing a local statistic calculated from the neighbors for each bin to a global value. This parameter can be used to control which bins are used to calculate the global value. ENTIRE_CUBE—Each neighborhood is analyzed in comparison to the entire cube. This is the default. NEIGHBORHOOD_TIME_STEP—Each neighborhood is analyzed in comparison to the bins contained within the Neighborhood Time Step specified. INDIVIDUAL_TIME_STEP—Each neighborhood is analyzed in comparison to the bins in the same time step.	String
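To make the role of define_global_window more concrete, the sketch below computes the textbook Getis-Ord Gi* z-score with binary weights, taking the global mean and standard deviation from whichever bins the chosen option selects. It is only an approximation of the idea under stated assumptions: the cube is modeled as a list of time slices (one list of values per time step), `global_window` and `gi_star` are hypothetical helpers rather than arcpy functions, and the tool's actual space-time weighting may differ.

```python
import math


def global_window(cube, time_step, n_time_steps, option="ENTIRE_CUBE"):
    """Return the bin values used for the global mean and standard deviation.

    `cube` is a list of time slices; each slice is a list of bin values.
    """
    if option == "ENTIRE_CUBE":
        slices = cube
    elif option == "NEIGHBORHOOD_TIME_STEP":
        slices = cube[max(0, time_step - n_time_steps):time_step + 1]
    elif option == "INDIVIDUAL_TIME_STEP":
        slices = cube[time_step:time_step + 1]
    else:
        raise ValueError(f"Unknown option: {option}")
    return [value for time_slice in slices for value in time_slice]


def gi_star(neighborhood_values, global_values):
    """Gi* z-score for one bin, with a weight of one for every neighborhood bin.

    `neighborhood_values` includes the target bin's own value.
    """
    n = len(global_values)
    k = len(neighborhood_values)
    mean = sum(global_values) / n
    # Population variance of the global values, clamped against rounding error.
    variance = max(0.0, sum(v * v for v in global_values) / n - mean * mean)
    denominator = math.sqrt(variance) * math.sqrt((n * k - k * k) / (n - 1))
    if denominator == 0:
        # No variation in the window, or the neighborhood spans all of it.
        return 0.0
    return (sum(neighborhood_values) - mean * k) / denominator


# Three time steps, four locations; high values appear at two locations last.
cube = [[1, 1, 1, 1], [1, 1, 1, 1], [5, 5, 1, 1]]
neighborhood = [5, 5]  # target bin and one neighbor in the final time step

for option in ("ENTIRE_CUBE", "NEIGHBORHOOD_TIME_STEP", "INDIVIDUAL_TIME_STEP"):
    window = global_window(cube, time_step=2, n_time_steps=1, option=option)
    print(option, len(window), gi_star(neighborhood, window))
```

The same neighborhood produces a different z-score under each option because the baseline it is compared against changes, which is the effect the parameter is meant to control.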

Code sample

EmergingHotSpotAnalysis example 1 (Python window)

The following Python window script demonstrates how to use the EmergingHotSpotAnalysis tool.

import arcpy
arcpy.env.workspace = r"C:\STPM"
arcpy.EmergingHotSpotAnalysis_stpm("Homicides.nc", "COUNT", "EHS_Homicides.shp", "5 Miles", 2, "#", "FIXED_DISTANCE", "3")

EmergingHotSpotAnalysis example 2 (stand-alone script)

The following stand-alone Python script demonstrates how to use the EmergingHotSpotAnalysis tool.

# Create Space Time Cube of homicide incidents in a metropolitan area

# Import system modules
import arcpy

# Set property to overwrite existing output, by default
arcpy.env.overwriteOutput = True

# Local variables...
workspace = r"C:\STPM"

try:
    # Set the current workspace (to avoid having to specify the full path to the feature 
    # classes each time)
    arcpy.env.workspace = workspace

    # Create Space Time Cube of homicide incident data with 3 months and 3 miles settings
    # Process: Create Space Time Cube 
    cube = arcpy.CreateSpaceTimeCube_stpm("Homicides.shp", "Homicides.nc", "MyDate", "#",
                                          "3 Months", "End time", "#", "3 Miles",
                                          "Property MEDIAN SPACETIME; Age STD ZEROS",
                                          "HEXAGON_GRID")

    # Create a polygon that defines where incidents are possible  
    # Process: Minimum Bounding Geometry of homicide incident data
    arcpy.MinimumBoundingGeometry_management("Homicides.shp", "bounding.shp", "CONVEX_HULL",
                                             "ALL", "#", "NO_MBG_FIELDS")

    # Emerging Hot Spot Analysis of homicide incident cube using 5 Miles neighborhood 
    # distance and 2 neighborhood time step to detect hot spots
    # Process: Emerging Hot Spot Analysis 
    cube = arcpy.EmergingHotSpotAnalysis_stpm("Homicides.nc", "COUNT", "EHS_Homicides.shp", 
                                              "5 Miles", 2, "bounding.shp", "FIXED_DISTANCE", "3")

except arcpy.ExecuteError:
    # If any error occurred when running the tool, print the messages
    print(arcpy.GetMessages())

Environments

Current Workspace, Scratch Workspace, Output Coordinate System, Geographic Transformations

Licensing information

Basic: Yes
Standard: Yes
Advanced: Yes