How Kernel Density works

Available with Spatial Analyst license.

The Kernel Density tool calculates the density of features in a neighborhood around those features. It can be calculated for both point and line features.

Possible uses include analyzing density of houses or crimes for community planning, or exploring how roads or utility lines influence a wildlife habitat. The population field could be used to weight some features more heavily than others, or to allow one point to represent several observations. For example, one address might represent a condominium with six units, or some crimes might be weighted more heavily than others in determining overall crime levels. For line features, a divided highway may have more impact than a narrow dirt road.

How Kernel Density is calculated

For point features

Kernel Density calculates the density of point features around each output raster cell.

Conceptually, a smoothly curved surface is fitted over each point. The surface value is highest at the location of the point and diminishes with increasing distance from the point, reaching zero at the Search radius distance from the point. Only a circular neighborhood is possible. The volume under the surface equals the Population field value for the point, or 1 if NONE is specified. The density at each output raster cell is calculated by adding the values of all the kernel surfaces where they overlay the raster cell center. The kernel function is based on the quartic kernel function described in Silverman (1986, p. 76, equation 4.5).

If a population field setting other than NONE is used, each item's value determines the number of times to count the point. For example, a value of 3 would cause the point to be counted as three points. The values can be integer or floating point.

By default, a unit is selected based on the linear unit of the projection definition of the input point feature data or as otherwise specified in the Output Coordinate System environment setting. If an area unit is selected, the calculated density for the cell is multiplied by the appropriate factor before it is written to the output raster.

For example, if the input units are meters, the output area units will default to square kilometers. Because one square kilometer contains 1,000,000 square meters (1,000 meters x 1,000 meters), density values expressed per square kilometer are 1,000,000 times larger than the same densities expressed per square meter.
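As a minimal sketch of that scaling, assuming densities are first computed per square meter, the conversion is a single multiplication. The function name and dictionary keys below are hypothetical labels for illustration, not the tool's API.

```python
# Number of square meters in one output area unit (illustrative labels).
SQUARE_METERS_PER_AREA_UNIT = {
    "SQUARE_METERS": 1.0,
    "SQUARE_KILOMETERS": 1_000_000.0,  # 1,000 m x 1,000 m
}


def scale_point_density(density_per_sq_meter, area_unit):
    """Convert a point density from points per square meter to another area unit."""
    return density_per_sq_meter * SQUARE_METERS_PER_AREA_UNIT[area_unit]


# 0.00002 points per square meter is about 20 points per square kilometer.
print(scale_point_density(0.00002, "SQUARE_KILOMETERS"))
```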

For line features

Kernel Density can also calculate the density of linear features in the neighborhood of each output raster cell.

Conceptually, a smoothly curved surface is fitted over each line. Its value is greatest on the line and diminishes as you move away from the line, reaching zero at the specified Search radius distance from the line. The surface is defined so the volume under the surface equals the product of line length and the Population field value. The density at each output raster cell is calculated by adding the values of all the kernel surfaces where they overlay the raster cell center. The use of the kernel function for lines is adapted from the quartic kernel function for point densities as described in Silverman (1986).

[Figure: A line segment and the kernel surface fitted over it.]

The illustration above shows a line segment and the kernel surface fitted over it. The contribution of the line segment to density is equal to the value of the kernel surface at the raster cell center.

By default, a unit is selected based on the linear unit of the projection definition of the input polyline feature data or as otherwise specified in the Output Coordinate System environment setting.

When an output Area units factor is specified, it converts the units of both length and area. For example, if the linear unit is meters, the output area units will default to Square kilometers and the resulting line density units will convert to kilometers per square kilometer. Converting length from meters to kilometers divides the values by 1,000, while converting area from square meters to square kilometers multiplies them by 1,000,000. The net effect is that density values in kilometers per square kilometer are 1,000 times larger than the same densities in meters per square meter.

You can control the density units by manually selecting the appropriate factor. To set the density to be in meters per square meter (instead of the default of kilometers per square kilometer), set the area units to Square meters. Similarly, to have the density units of your output in miles per square mile, set the area units to Square miles.
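The line density conversion can be sketched as follows, assuming the density was first computed in meters per square meter. The function name is hypothetical and only illustrates the arithmetic: the length factor and the area factor are applied together.

```python
METERS_PER_KILOMETER = 1_000.0


def line_density_m_to_km(density_m_per_sq_m):
    """Convert a line density from m per square meter to km per square kilometer."""
    length_factor = 1.0 / METERS_PER_KILOMETER  # meters -> kilometers
    area_factor = METERS_PER_KILOMETER ** 2     # per square meter -> per square kilometer
    return density_m_per_sq_m * length_factor * area_factor


# The two factors combine into a net multiplier of 1,000.
print(line_density_m_to_km(0.004))
```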

Refer to the following topic for more details on specific distance units.

If a population field other than NONE is used, the length of the line is considered to be its actual length multiplied by the value of the population field for that line.

Formulas for calculating Kernel Density

The following formulas define how the kernel density for points is calculated and how the default search radius used in that calculation is determined.

Predicting the density for points

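The predicted density at a new (x, y) location is determined by the following formula:

$$\text{Density} = \frac{1}{(\text{radius})^2} \sum_{i=1}^{n} \left[ \frac{3}{\pi} \cdot pop_i \left( 1 - \left( \frac{dist_i}{\text{radius}} \right)^2 \right)^2 \right] \quad \text{for } dist_i < \text{radius}$$

where:

• i = 1,…,n are the input points. Only include points in the sum if they are within the radius distance of the (x, y) location.
• pop_i is the population field value of point i, which is an optional parameter. If no population field is used, pop_i is 1.
• dist_i is the distance between point i and the (x, y) location.
• radius is the search radius.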

Relative to a standard kernel density estimate, whose spatial integral is always 1, the calculated density is multiplied by the number of points, or by the sum of the population field if one was provided. This correction makes the spatial integral equal to the number of points (or the sum of the population field) rather than always being equal to 1. This implementation uses a quartic kernel (Silverman, 1986). The formula needs to be calculated for every location where you want to estimate the density. Since a raster is being created, the calculations are applied to the center of every cell in the output raster.
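The point calculation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's implementation: it assumes the quartic kernel takes the form (3/π)(1 − (dist/radius)²)² scaled by 1/radius², so that each point contributes a volume equal to its population value. The function names and grid parameters are hypothetical.

```python
import math


def kernel_density_at(x, y, points, radius, populations=None):
    """Quartic kernel density at (x, y) from a list of (px, py) points.

    populations holds the optional per-point weights; each defaults to 1.
    """
    if populations is None:
        populations = [1.0] * len(points)
    total = 0.0
    for (px, py), pop in zip(points, populations):
        dist = math.hypot(px - x, py - y)
        if dist < radius:  # points at or beyond the radius contribute nothing
            total += (3.0 / math.pi) * pop * (1.0 - (dist / radius) ** 2) ** 2
    return total / radius ** 2


def density_grid(points, radius, x_min, y_min, cell_size, n_cols, n_rows,
                 populations=None):
    """Evaluate the density at the center of every cell (row 0 is the bottom row)."""
    return [
        [
            kernel_density_at(
                x_min + (col + 0.5) * cell_size,
                y_min + (row + 0.5) * cell_size,
                points, radius, populations,
            )
            for col in range(n_cols)
        ]
        for row in range(n_rows)
    ]
```

Summing the grid values times the cell area approximates the spatial integral, which should be close to the number of points or the sum of the population values when the grid covers every kernel.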

Default search radius (bandwidth)

The algorithm used to determine the default search radius, also known as the bandwidth, is as follows:

1. Calculate the mean center of the input points. If a Population field was provided, this, and all the following calculations, will be weighted by the values in that field.
2. Calculate the distance from the (weighted) mean center for all points.
3. Calculate the (weighted) median of these distances, Dm.
4. Calculate the (weighted) Standard Distance, SD.

See the Standard Distance Spatial Statistics tool for more details on this.

5. Apply the following formula to calculate the bandwidth:

$$\text{SearchRadius} = 0.9 \cdot \min\left( SD, \sqrt{\frac{1}{\ln(2)}} \cdot D_m \right) \cdot n^{-0.2}$$

where:

• Dm is the (weighted) median distance from (weighted) mean center.
• n is the number of points if no population field is used, or if a population field is supplied, n is the sum of the population field values.
• SD is the standard distance.

Note that the min part of the equation means that whichever of the two options, either SD or $\sqrt{1/\ln(2)} \cdot D_m$, results in the smaller value will be used.
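The five steps can be sketched for the unweighted, two-dimensional case. This sketch assumes the bandwidth formula is 0.9 × min(SD, sqrt(1/ln 2) × Dm) × n^(−0.2), and it uses Python's `statistics.median`, which averages the two middle values for an even count; the tool's exact median rule is not stated here, so treat that as an assumption. The function name is hypothetical.

```python
import math
import statistics


def default_search_radius(points):
    """Default bandwidth for unweighted 2D points given as (x, y) tuples."""
    n = len(points)
    # Step 1: mean center of the input points.
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Step 2: distance of every point from the mean center.
    distances = [math.hypot(x - mean_x, y - mean_y) for x, y in points]
    # Step 3: median of those distances, Dm.
    d_m = statistics.median(distances)
    # Step 4: standard distance, SD (root of the mean squared distance).
    sd = math.sqrt(sum(d * d for d in distances) / n)
    # Step 5: take the smaller of the two spread measures, then scale by n.
    return 0.9 * min(sd, math.sqrt(1.0 / math.log(2.0)) * d_m) * n ** -0.2


print(default_search_radius([(1, 0), (-1, 0), (0, 1), (0, -1)]))
```

Because the median distance is barely affected by a few distant points, the min term keeps an outlier from inflating the radius the way it inflates SD.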

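There are two methods for calculating the standard distance: unweighted and weighted.

Unweighted distance

$$SD = \sqrt{ \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n} + \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n} + \frac{\sum_{i=1}^{n} (z_i - \bar{z})^2}{n} }$$

where:

• x_i, y_i, and z_i are the coordinates for feature i
• {x̄, ȳ, z̄} represents the Mean Center for the features
• n is equal to the total number of features.

Weighted distance

$$SD_w = \sqrt{ \frac{\sum_{i=1}^{n} w_i (x_i - \bar{x}_w)^2}{\sum_{i=1}^{n} w_i} + \frac{\sum_{i=1}^{n} w_i (y_i - \bar{y}_w)^2}{\sum_{i=1}^{n} w_i} + \frac{\sum_{i=1}^{n} w_i (z_i - \bar{z}_w)^2}{\sum_{i=1}^{n} w_i} }$$

where:

• w_i is the weight at feature i
• {x̄_w, ȳ_w, z̄_w} represents the weighted Mean Center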
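Both variants of the standard distance can be sketched with one function, assuming the weighted form divides each weighted sum of squared deviations by the sum of the weights. With all weights equal to 1 this reduces to the unweighted form. The function name is hypothetical, and coordinates may be 2D or 3D tuples.

```python
import math


def standard_distance(coords, weights=None):
    """Standard distance of coordinate tuples, optionally weighted."""
    if weights is None:
        weights = [1.0] * len(coords)
    total_weight = sum(weights)
    dims = len(coords[0])
    # (Weighted) mean center, one value per dimension.
    center = [
        sum(w * c[d] for c, w in zip(coords, weights)) / total_weight
        for d in range(dims)
    ]
    # Sum over dimensions of the (weighted) mean squared deviation.
    variance = sum(
        sum(w * (c[d] - center[d]) ** 2 for c, w in zip(coords, weights))
        / total_weight
        for d in range(dims)
    )
    return math.sqrt(variance)
```

A weight of 3 on a feature gives the same result as repeating that feature three times, which matches how the population field is described for points.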

Methodology

This methodology for choosing the search radius is based on Silverman's rule-of-thumb bandwidth estimation formula, adapted for two dimensions. This approach to calculating a default radius generally avoids the "ring around the points" phenomenon that often occurs with sparse datasets, and it is resistant to spatial outliers (a few points that are far away from the rest of the points).

References

Silverman, B. W. Density Estimation for Statistics and Data Analysis. New York: Chapman and Hall, 1986.