Generate Spatial Weights Matrix (Spatial Statistics)

Summary

Generates a spatial weights matrix file (.swm) to represent the spatial relationships among features in a dataset.

Learn more about how Generate Spatial Weights Matrix works

Illustration

Generate Spatial Weights Matrix tool illustration
These spatial relationships are based on polygon contiguity, Queen's case: shared edges or nodes.

Usage

  • The output from this tool is a spatial weights matrix file (.swm). Tools, such as the Hot Spot Analysis tool, that require you to specify a Conceptualization of Spatial Relationships parameter value will accept a spatial weights matrix file. Choose the Get spatial weights from file option for the Conceptualization of Spatial Relationships parameter, and provide the full path to the spatial weights file created using this tool for the Weights Matrix File parameter.

  • This tool also reports characteristics of the resultant spatial weights matrix file: number of features, connectivity, and minimum, maximum, and average number of neighbors. This summary is written as messages at the bottom of the Geoprocessing pane during tool processing. To access the messages, hover over the progress bar, click the pop-out button or expand the messages section in the Geoprocessing pane. You can also access messages for a previously run tool through the geoprocessing history. This summary will indicate if all features have at least one neighbor. In general, especially with large datasets, a minimum of eight neighbors and a low value for feature connectivity is best.

  • For space and time analyses, choose the Space time window option for the Conceptualization of Spatial Relationships parameter. You define space by specifying a Threshold Distance value; you define time by specifying a Date/Time Field value, a Date/Time Interval Type value (such as hours or days), and a Date/Time Interval Value value. The Date/Time Interval Value parameter value is an integer. For example, if you enter a Threshold Distance value of 1,000 feet, choose the Hours option, and provide a Date/Time Interval Value of 3, features within 1,000 feet and occurring within three hours of each other will be considered neighbors.

  • The spatial weights matrix file (.swm) allows you to generate, store, reuse, and share your conceptualization of the relationships among a set of features. To improve performance, the file is created in a binary file format. Feature relationships are stored as a sparse matrix, so only nonzero relationships are written to the .swm file. In general, tools will perform well even when the .swm file contains more than 15 million nonzero relationships. However, if a memory error occurs when using the .swm file, reconsider how you are defining the feature relationships. As a best practice, aim for a spatial weights matrix in which every feature has at least 1 neighbor, most have 8 neighbors, and no feature has more than 1,000 neighbors.

  • Coincident points are not used in the calculation of the default Threshold Distance parameter value.

  • When using data with coordinates that include a z-value, the Threshold Distance parameter value will be a 3D distance.

  • When using data with coordinates that include a z-value, the only options supported by the Conceptualization of Spatial Relationships parameter are Inverse Distance, Fixed Distance, K nearest neighbors, and Space time window.

  • If the Input Feature Class parameter value is z-enabled, the linear units of the vertical coordinate system (VCS) must match the linear units of the horizontal coordinate system. If the Input Feature Class parameter value does not have a VCS, it is assumed the vertical linear unit is the same as the horizontal linear unit.

  • When the Input Feature Class parameter value is not projected (that is, when coordinates are given in degrees, minutes, and seconds) or when the output coordinate system is set to a geographic coordinate system, distances are computed using chordal measurements. Chordal distance measurements are used because they can be computed quickly and provide good estimates of true geodesic distances, at least for points within about 30 degrees of each other. Chordal distances are based on an oblate spheroid. Given any two points on the earth's surface, the chordal distance between them is the length of a line, passing through the three-dimensional earth, to connect those two points. Chordal distances are reported in meters.

    Caution:

    Project the data if the study area extends beyond 30 degrees. Chordal distances are not a good estimate of geodesic distances beyond 30 degrees.

  • When chordal distances are used in the analysis, the Threshold Distance parameter value, if specified, should be in meters.

  • For line and polygon features, feature centroids are used in distance computations. For multipoints, polylines, or polygons with multiple parts, the centroid is computed using the weighted mean center of all feature parts. The weighting for point features is 1, for line features is length, and for polygon features is area.

  • The Unique ID Field parameter value is linked to feature relationships derived from running this tool. Consequently, the Unique ID Field values must be unique for every feature and typically should be in a permanent field that remains with the feature class. If you don't have a unique ID field, you can create one by adding a new integer field (Add Field) to the feature class table and calculating the field values to be equal to the FID or OBJECTID field (Calculate Field). Because the FID and OBJECTID field values may change when you copy or edit a feature class, you cannot use these fields directly for the Unique ID Field parameter.

  • The Number of Neighbors parameter may override the Threshold Distance parameter for inverse or fixed distance conceptualizations of spatial relationships. For example, if you specify a threshold distance of 10 miles and a value of 3 for the Number of Neighbors parameter, all features will receive a minimum of 3 neighbors, even if the distance threshold must be increased to find them. The threshold distance is only increased in cases in which the minimum number of neighbors is not met.

  • The Convert table option for the Conceptualization of Spatial Relationships parameter can be used to convert an ASCII spatial weights matrix file to a SWM formatted spatial weights matrix file. First, put the ASCII weights into a formatted table (using Excel, for example).

    Caution:

    If the table includes weights for self-potential, they will be omitted from the .swm output file, and the default self-potential value will be used in analyses. The default self-potential value for the Hot Spot Analysis tool is one, but this value can be overridden by specifying a Self-Potential Field value. For all other tools, the default self-potential value is zero.

  • For polygon features, check the Row Standardization parameter. Row standardization mitigates bias when the number of neighbors for each feature is a function of the aggregation scheme or sampling process, rather than reflecting the actual spatial distribution of the variable you are analyzing.

  • The Modeling Spatial Relationships help topic provides additional information about this tool's parameters.

  • The tools that can use a spatial weights matrix file project feature geometry to the output coordinate system prior to analysis and all mathematical computations are based on the output coordinate system. Consequently, if the output coordinate system setting does not match the input feature class spatial reference, either make sure, for all analyses using the spatial weights matrix file, that the output coordinate system matches the settings used when the spatial weights matrix file was created or project the input feature class so it matches the spatial reference associated with the spatial weights matrix file.

  • Caution:

    When using shapefiles, keep in mind that they cannot store null values. Tools or other procedures that create shapefiles from nonshapefile inputs may store or interpret null values as zero. In some cases, nulls are stored as very large negative values in shapefiles. This can lead to unexpected results. See Geoprocessing considerations for shapefile output for more information.
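
The summary that this tool writes to the geoprocessing messages (number of features, connectivity, and minimum, maximum, and average number of neighbors) can be approximated for any sparse set of relationships. The following is a minimal sketch in plain Python, not the tool's implementation: the summarize_neighbors function and the dictionary layout are hypothetical, and connectivity is assumed here to mean the percentage of nonzero relationships among all possible feature pairs.

```python
def summarize_neighbors(weights):
    """Summarize a sparse weights mapping of the form {id: {neighbor_id: weight}}.

    Only nonzero relationships are expected in the mapping, mirroring the
    sparse storage described for .swm files. The layout is hypothetical.
    """
    n = len(weights)
    counts = [len(row) for row in weights.values()]
    nonzero = sum(counts)
    return {
        "features": n,
        "nonzero_links": nonzero,
        # Assumption: connectivity = nonzero links / (n * n), as a percentage.
        "percent_connectivity": 100.0 * nonzero / (n * n) if n else 0.0,
        "min_neighbors": min(counts) if counts else 0,
        "max_neighbors": max(counts) if counts else 0,
        "avg_neighbors": nonzero / n if n else 0.0,
        "all_have_neighbors": all(c > 0 for c in counts),
    }


# Four features; feature 4 has no neighbors at all.
swm = {1: {2: 0.5, 3: 0.5}, 2: {1: 1.0}, 3: {1: 1.0}, 4: {}}
summary = summarize_neighbors(swm)
print(summary)
```

A result in which all_have_neighbors is False, as for feature 4 here, is the situation the usage notes advise against.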

Parameters

Label | Explanation | Data Type
Input Feature Class

The feature class for which spatial relationships of features will be assessed.

Feature Class
Unique ID Field

An integer field containing a different value for every feature in the input feature class. If you don't have a Unique ID field, you can create one by adding an integer field to your feature class table and calculating the field values to equal the FID or OBJECTID field.

Field
Output Spatial Weights Matrix File

The full path for the output spatial weights matrix file (.swm).

File
Conceptualization of Spatial Relationships

Specifies how spatial relationships among features will be conceptualized.

  • Inverse distance: The impact of one feature on another feature will decrease with distance.
  • Fixed distance: Everything within a specified critical distance of each feature will be included in the analysis. Everything outside the critical distance will be excluded.
  • K nearest neighbors: The closest k features will be included in the analysis; k is a specified numeric parameter.
  • Contiguity edges only: Polygon features that share a boundary will be neighbors.
  • Contiguity edges corners: Polygon features that share a boundary or share a node will be neighbors.
  • Delaunay triangulation: A mesh of nonoverlapping triangles will be created from feature centroids, and features associated with triangle nodes that share edges will be neighbors.
  • Space time window: Features within a specified critical distance and specified time interval of each other will be neighbors.
  • Convert table: Spatial relationships will be defined in a table.
String
Distance Method
(Optional)

Specifies how distances will be calculated from each feature to neighboring features.

  • Euclidean: The straight-line distance between two points (as the crow flies) will be calculated. This is the default.
  • Manhattan: The distance between two points measured along axes at right angles (city block) will be calculated by summing the (absolute) difference between the x- and y-coordinates.
String
Exponent
(Optional)

The value for inverse distance calculation. A typical value is 1 or 2.

Double
Threshold Distance
(Optional)

The cutoff distance for the Conceptualization of Spatial Relationships parameter's Inverse distance and Fixed distance options. Enter this value using the units specified in the environment output coordinate system. This defines the size of the space window for the Space time window option.

When this parameter is left blank, a default threshold value is computed based on the output feature class extent and the number of features. For the inverse distance conceptualization of spatial relationships, a value of zero indicates that no threshold distance will be applied and all features will be neighbors of every other feature.

Double
Number of Neighbors
(Optional)

An integer reflecting either the minimum or the exact number of neighbors. When the Conceptualization of Spatial Relationships parameter is set to K nearest neighbors, each feature will have exactly this specified number of neighbors. For the Inverse distance or Fixed distance option, each feature will have at least this many neighbors (the threshold distance will be temporarily extended to ensure this many neighbors, if necessary). When the Contiguity edges only or Contiguity edges corners option is chosen, each polygon will be assigned this minimum number of neighbors. For polygons with fewer than this number of contiguous neighbors, additional neighbors will be based on feature centroid proximity.

Long
Row Standardization
(Optional)

Specifies whether spatial weights will be standardized by row. Row standardization is recommended whenever feature distribution is potentially biased due to sampling design or to an imposed aggregation scheme.

  • Checked—Spatial weights will be standardized by row. Each weight is divided by its row sum. This is the default.
  • Unchecked—No standardization of spatial weights will be applied.
Boolean
Input Table
(Optional)

A table containing numeric weights relating every feature to every other feature in the input feature class. Required fields for the table are the Unique ID Field parameter value, NID (neighbor ID), and WEIGHT.

Table
Date/Time Field
(Optional)

A date field with a time stamp for each feature.

Field
Date/Time Interval Type
(Optional)

Specifies the units that will be used for measuring time.

  • Seconds: The unit will be seconds.
  • Minutes: The unit will be minutes.
  • Hours: The unit will be hours.
  • Days: The unit will be days.
  • Weeks: The unit will be weeks.
  • Months: The unit will be 30 days.
  • Years: The unit will be years.
String
Date/Time Interval Value
(Optional)

An integer reflecting the number of time units comprising the time window.

For example, if you choose Hours for the Date/Time Interval Type parameter and specify 3 for the Date/Time Interval Value parameter, the time window will be 3 hours. Features within the specified space window and within the specified time window will be neighbors.

Long
Use Z values
(Optional)

Specifies whether z-coordinates will be used in the construction of the spatial weights matrix if the input features are z-enabled.

  • Checked—Z-values will be used in the construction of the spatial weights matrix.
  • Unchecked—Z-values will not be used. They will be ignored and only x- and y-coordinates will be considered in the construction of the spatial weights matrix. This is the default.

Boolean
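
The way the Threshold Distance, Date/Time Field, Date/Time Interval Type, and Date/Time Interval Value parameters combine for the Space time window option can be illustrated with a small plain-Python sketch. It is not the tool's implementation: the space_time_neighbors function and the tuple layout are hypothetical, planar coordinates in feet are assumed, and a feature exactly at the distance or time limit is assumed to count as a neighbor.

```python
from datetime import datetime, timedelta
from math import hypot


def space_time_neighbors(features, threshold_distance, time_window):
    """Return {id: [neighbor ids]} for features close in both space and time.

    features: list of (feature_id, x, y, timestamp) tuples (hypothetical layout).
    """
    neighbors = {fid: [] for fid, _, _, _ in features}
    for i, (id_a, xa, ya, ta) in enumerate(features):
        for id_b, xb, yb, tb in features[i + 1:]:
            close_in_space = hypot(xa - xb, ya - yb) <= threshold_distance
            close_in_time = abs(ta - tb) <= time_window
            # Both conditions must hold for the pair to be neighbors.
            if close_in_space and close_in_time:
                neighbors[id_a].append(id_b)
                neighbors[id_b].append(id_a)
    return neighbors


calls = [
    (1, 0.0, 0.0, datetime(2024, 1, 1, 12, 0)),
    (2, 300.0, 400.0, datetime(2024, 1, 1, 14, 30)),  # 500 ft from 1, 2.5 h later
    (3, 100.0, 0.0, datetime(2024, 1, 1, 18, 0)),     # near 1 and 2, but hours later
    (4, 5000.0, 0.0, datetime(2024, 1, 1, 12, 30)),   # close in time, far away
]

# A 1,000 foot space window and a 3 hour time window.
result = space_time_neighbors(calls, 1000.0, timedelta(hours=3))
print(result)
```

Only features 1 and 2 satisfy both windows; feature 3 fails the time test and feature 4 fails the distance test.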

arcpy.stats.GenerateSpatialWeightsMatrix(Input_Feature_Class, Unique_ID_Field, Output_Spatial_Weights_Matrix_File, Conceptualization_of_Spatial_Relationships, {Distance_Method}, {Exponent}, {Threshold_Distance}, {Number_of_Neighbors}, {Row_Standardization}, {Input_Table}, {Date_Time_Field}, {Date_Time_Interval_Type}, {Date_Time_Interval_Value}, {Use_Z_values})
Name | Explanation | Data Type
Input_Feature_Class

The feature class for which spatial relationships of features will be assessed.

Feature Class
Unique_ID_Field

An integer field containing a different value for every feature in the input feature class. If you don't have a Unique ID field, you can create one by adding an integer field to your feature class table and calculating the field values to equal the FID or OBJECTID field.

Field
Output_Spatial_Weights_Matrix_File

The full path for the output spatial weights matrix file (.swm).

File
Conceptualization_of_Spatial_Relationships

Specifies how spatial relationships among features will be conceptualized.

  • INVERSE_DISTANCE: The impact of one feature on another feature will decrease with distance.
  • FIXED_DISTANCE: Everything within a specified critical distance of each feature will be included in the analysis. Everything outside the critical distance will be excluded.
  • K_NEAREST_NEIGHBORS: The closest k features will be included in the analysis; k is a specified numeric parameter.
  • CONTIGUITY_EDGES_ONLY: Polygon features that share a boundary will be neighbors.
  • CONTIGUITY_EDGES_CORNERS: Polygon features that share a boundary or share a node will be neighbors.
  • DELAUNAY_TRIANGULATION: A mesh of nonoverlapping triangles will be created from feature centroids, and features associated with triangle nodes that share edges will be neighbors.
  • SPACE_TIME_WINDOW: Features within a specified critical distance and specified time interval of each other will be neighbors.
  • CONVERT_TABLE: Spatial relationships will be defined in a table.
String
Distance_Method
(Optional)

Specifies how distances will be calculated from each feature to neighboring features.

  • EUCLIDEAN: The straight-line distance between two points (as the crow flies) will be calculated. This is the default.
  • MANHATTAN: The distance between two points measured along axes at right angles (city block) will be calculated by summing the (absolute) difference between the x- and y-coordinates.
String
Exponent
(Optional)

The value for inverse distance calculation. A typical value is 1 or 2.

Double
Threshold_Distance
(Optional)

The cutoff distance for the Conceptualization_of_Spatial_Relationships parameter's INVERSE_DISTANCE and FIXED_DISTANCE options. Enter this value using the units specified in the environment output coordinate system. This defines the size of the space window for the SPACE_TIME_WINDOW option.

When this parameter is left blank, a default threshold value is computed based on the output feature class extent and the number of features. For the inverse distance conceptualization of spatial relationships, a value of zero indicates that no threshold distance will be applied and all features will be neighbors of every other feature.

Double
Number_of_Neighbors
(Optional)

An integer reflecting either the minimum or the exact number of neighbors. When the Conceptualization_of_Spatial_Relationships parameter is set to K_NEAREST_NEIGHBORS, each feature will have exactly this specified number of neighbors. For the INVERSE_DISTANCE or FIXED_DISTANCE option, each feature will have at least this many neighbors (the threshold distance will be temporarily extended to ensure this many neighbors, if necessary). When the CONTIGUITY_EDGES_ONLY or CONTIGUITY_EDGES_CORNERS option is chosen, each polygon will be assigned this minimum number of neighbors. For polygons with fewer than this number of contiguous neighbors, additional neighbors will be based on feature centroid proximity.

Long
Row_Standardization
(Optional)

Specifies whether spatial weights will be standardized by row. Row standardization is recommended whenever feature distribution is potentially biased due to sampling design or to an imposed aggregation scheme.

  • ROW_STANDARDIZATION: Spatial weights will be standardized by row. Each weight is divided by its row sum. This is the default.
  • NO_STANDARDIZATION: No standardization of spatial weights will be applied.
Boolean
Input_Table
(Optional)

A table containing numeric weights relating every feature to every other feature in the input feature class. Required fields for the table are the Unique ID Field parameter value, NID (neighbor ID), and WEIGHT.

Table
Date_Time_Field
(Optional)

A date field with a time stamp for each feature.

Field
Date_Time_Interval_Type
(Optional)

Specifies the units that will be used for measuring time.

  • SECONDS: The unit will be seconds.
  • MINUTES: The unit will be minutes.
  • HOURS: The unit will be hours.
  • DAYS: The unit will be days.
  • WEEKS: The unit will be weeks.
  • MONTHS: The unit will be 30 days.
  • YEARS: The unit will be years.
String
Date_Time_Interval_Value
(Optional)

An integer reflecting the number of time units comprising the time window.

For example, if you choose HOURS for the Date_Time_Interval_Type parameter and specify 3 for the Date_Time_Interval_Value parameter, the time window will be 3 hours. Features within the specified space window and within the specified time window will be neighbors.

Long
Use_Z_values
(Optional)

Specifies whether z-coordinates will be used in the construction of the spatial weights matrix if the input features are z-enabled.

  • USE_Z_VALUES: Z-values will be used in the construction of the spatial weights matrix.
  • DO_NOT_USE_Z_VALUES: Z-values will not be used. They will be ignored and only x- and y-coordinates will be considered in the construction of the spatial weights matrix. This is the default.
Boolean
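
The interaction between Threshold_Distance and Number_of_Neighbors for the FIXED_DISTANCE option can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not the tool's algorithm: the fixed_distance_neighbors function and the dictionary layout are hypothetical, planar Euclidean distances are assumed, and a feature exactly at the threshold is assumed to be included.

```python
from math import hypot


def fixed_distance_neighbors(points, threshold, min_neighbors=0):
    """Return {id: [neighbor ids]} using a distance cutoff with a neighbor floor.

    points: {feature_id: (x, y)} (hypothetical layout).
    """
    neighbors = {}
    for a, (xa, ya) in points.items():
        # Rank every other feature by its distance from feature a.
        ranked = sorted(
            (hypot(xa - xb, ya - yb), b)
            for b, (xb, yb) in points.items()
            if b != a
        )
        within = [b for d, b in ranked if d <= threshold]
        # If the cutoff yields too few neighbors, reach further for this
        # feature only, taking its nearest min_neighbors features instead.
        if len(within) < min_neighbors:
            within = [b for _, b in ranked[:min_neighbors]]
        neighbors[a] = within
    return neighbors


points = {1: (0, 0), 2: (1, 0), 3: (2, 0), 4: (10, 0)}
result = fixed_distance_neighbors(points, threshold=1.5, min_neighbors=2)
print(result)
```

Feature 4 has no features within the 1.5 unit cutoff, so it receives its two nearest features as neighbors, while feature 2 already meets the minimum and is unaffected. Relationships produced this way are not necessarily symmetric in this sketch.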

Code sample

GenerateSpatialWeightsMatrix example 1 (Python window)

The following Python window script demonstrates how to use the GenerateSpatialWeightsMatrix function.

import arcpy
arcpy.env.workspace = "C:/data"
arcpy.GenerateSpatialWeightsMatrix_stats("911Count.shp", "MYID", 
                                         "euclidean6Neighs.swm", 
                                         "K_NEAREST_NEIGHBORS", "#", "#", "#", 
                                         6, "NO_STANDARDIZATION")

GenerateSpatialWeightsMatrix example 2 (stand-alone script)

The following stand-alone Python script demonstrates how to use the GenerateSpatialWeightsMatrix function.


# Analyze the spatial distribution of 911 calls in a metropolitan area
# using the Hot-Spot Analysis Tool (Local Gi*)

# Import system modules
import arcpy

# Set property to overwrite existing output, by default
arcpy.env.overwriteOutput = True

# Local variables...
workspace = "C:/Data"

try:
    # Set the current workspace (to avoid having to specify the full path to the feature classes each time)
    arcpy.env.workspace = workspace

    # Copy the input feature class and integrate the points to snap
    # together at 500 feet
    # Process: Copy Features and Integrate
    cf = arcpy.CopyFeatures_management("911Calls.shp", "911Copied.shp",
                         "#", 0, 0, 0)

    integrate = arcpy.Integrate_management("911Copied.shp #", "500 Feet")

    # Use Collect Events to count the number of calls at each location
    # Process: Collect Events
    ce = arcpy.CollectEvents_stats("911Copied.shp", "911Count.shp", "Count", "#")

    # Add a unique ID field to the count feature class
    # Process: Add Field and Calculate Field
    af = arcpy.AddField_management("911Count.shp", "MyID", "LONG", "#", "#", "#", "#",
                     "NON_NULLABLE", "NON_REQUIRED", "#",
                     "911Count.shp")
    
    # Copy the FID values into MyID using a Python expression
    calc = arcpy.CalculateField_management("911Count.shp", "MyID", "!FID!", "PYTHON3")

    # Create Spatial Weights Matrix for Calculations
    # Process: Generate Spatial Weights Matrix... 
    swm = arcpy.GenerateSpatialWeightsMatrix_stats("911Count.shp", "MyID",
                        "euclidean6Neighs.swm",
                        "K_NEAREST_NEIGHBORS",
                        "#", "#", "#", 6,
                        "NO_STANDARDIZATION") 

    # Hot Spot Analysis of 911 Calls
    # Process: Hot Spot Analysis (Getis-Ord Gi*)
    hs = arcpy.HotSpots_stats("911Count.shp", "ICOUNT", "911HotSpots.shp", 
                     "GET_SPATIAL_WEIGHTS_FROM_FILE",
                     "EUCLIDEAN_DISTANCE", "NONE",
                     "#", "#", "euclidean6Neighs.swm")

except arcpy.ExecuteError:
    # If an error occurred when running the tool, print the error message
    print(arcpy.GetMessages())
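
Both samples pass NO_STANDARDIZATION. For comparison, the following plain-Python sketch illustrates what row standardization does to a set of inverse distance weights: each weight is divided by the sum of its row. It does not use arcpy and is not the tool's implementation; the function names and dictionary layout are hypothetical, weights are assumed to be 1 / distance^exponent, and coincident points are simply skipped.

```python
from math import hypot


def inverse_distance_weights(points, exponent=1.0):
    """points: {id: (x, y)}. Returns sparse weights {id: {neighbor_id: weight}}."""
    weights = {}
    for a, (xa, ya) in points.items():
        row = {}
        for b, (xb, yb) in points.items():
            if a == b:
                continue
            d = hypot(xa - xb, ya - yb)
            if d > 0:  # Assumption: coincident points get no weight here.
                row[b] = 1.0 / d ** exponent
        weights[a] = row
    return weights


def row_standardize(weights):
    """Divide each weight by its row sum so that every nonempty row sums to 1."""
    standardized = {}
    for a, row in weights.items():
        total = sum(row.values())
        standardized[a] = {b: w / total for b, w in row.items()} if total else {}
    return standardized


points = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (4.0, 0.0)}
raw = inverse_distance_weights(points, exponent=1.0)
std = row_standardize(raw)
print(raw[1])  # unstandardized weights for feature 1
print(std[1])  # the same row after standardization
```

After standardization, a feature with many neighbors and a feature with few neighbors both have rows that sum to 1, which is how the bias from an aggregation scheme or sampling design is reduced.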

Environments

Special cases

Output Coordinate System

Feature geometry is projected to the output coordinate system prior to analysis, so values entered for the Threshold Distance parameter should be in the units of the output coordinate system. All mathematical computations are based on the spatial reference of the output coordinate system. When the output coordinate system is a geographic coordinate system (coordinates in degrees, minutes, and seconds), geodesic distances are estimated using chordal distances in meters.
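
A chordal distance is the straight-line distance through the earth between two points on an oblate spheroid. The following plain-Python sketch shows one way to compute it, as an illustration rather than the tool's implementation: the WGS 84 constants are an assumption (the spheroid the tool uses is not stated here), points are assumed to lie on the spheroid surface with no elevation, and the function names are hypothetical.

```python
from math import cos, radians, sin, sqrt

# WGS 84 spheroid constants (an assumption for this sketch).
SEMI_MAJOR_AXIS_M = 6378137.0
FLATTENING = 1.0 / 298.257223563
ECC_SQ = FLATTENING * (2.0 - FLATTENING)  # first eccentricity squared


def to_cartesian(lat_deg, lon_deg):
    """Convert latitude and longitude in degrees to earth-centered x, y, z in meters."""
    lat, lon = radians(lat_deg), radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude.
    n = SEMI_MAJOR_AXIS_M / sqrt(1.0 - ECC_SQ * sin(lat) ** 2)
    return (
        n * cos(lat) * cos(lon),
        n * cos(lat) * sin(lon),
        n * (1.0 - ECC_SQ) * sin(lat),
    )


def chordal_distance(lat1, lon1, lat2, lon2):
    """Length in meters of the straight line through the earth joining two points."""
    x1, y1, z1 = to_cartesian(lat1, lon1)
    x2, y2, z2 = to_cartesian(lat2, lon2)
    return sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)


one_degree = chordal_distance(0.0, 0.0, 0.0, 1.0)
half_way_round = chordal_distance(0.0, 0.0, 0.0, 180.0)
print(one_degree, half_way_round)
```

For nearby points the chord and the geodesic are nearly equal. The gap grows with separation, which is why projecting the data is advised when the study area extends beyond about 30 degrees.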

Licensing information

  • Basic: Yes
  • Standard: Yes
  • Advanced: Yes
