Understanding overlay analysis—ArcGIS Pro

Available with Spatial Analyst license.

Overlay analysis is a group of methodologies applied in optimal site selection or suitability modeling. It is a technique for applying a common scale of values to diverse and dissimilar inputs to create an integrated analysis.

Suitability models identify the best or most preferred locations for a specific phenomenon. Types of problems addressed by suitability analysis include the following:

Where to site a new housing development
Which sites are better for deer habitat
Where economic growth is most likely to occur
Which locations are most susceptible to mudslides

Overlay analysis often requires the analysis of many different factors. For instance, choosing the site for a new housing development means assessing such things as land cost, proximity to existing services, slope, and flood frequency. This information exists in different rasters with different value scales: dollars, distances, degrees, and so on. You cannot add a raster of land cost (dollars) to a raster of distance to utilities (meters) and obtain a meaningful result.

Additionally, the factors in your analysis may not be equally important. It may be that the cost of land is more important in choosing a site than the distance to utility lines. How much more important is for you to decide.

Even within a single raster, you must prioritize values. Some values in a particular raster may be ideal for your purposes (for example, slopes of 0 to 5 degrees), while others may be good, others bad, and still others unacceptable.
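As a minimal sketch of prioritizing values within a single raster, the slope example can be expressed as a lookup from slope ranges to preference scores. The raster values, the breakpoints, and the scores are all assumptions made for illustration, and the plain Python lists stand in for a real raster; this is not how the Spatial Analyst tools are implemented.

```python
# Hypothetical slope raster in degrees, stored as rows of cells.
slope_deg = [
    [2.0, 4.5, 8.0],
    [12.0, 18.0, 27.0],
    [3.0, 35.0, 9.5],
]

# Assumed classes: (upper bound in degrees, preference score).
# Higher scores are more preferred.
SLOPE_CLASSES = [(5.0, 9), (10.0, 6), (20.0, 3)]


def score_slope(value):
    """Return the score of the first class whose upper bound covers the value."""
    for upper, score in SLOPE_CLASSES:
        if value <= upper:
            return score
    return None  # steeper than every class: treated as unacceptable


slope_score = [[score_slope(v) for v in row] for row in slope_deg]

for row in slope_score:
    print(row)
```

Cells marked None are excluded from consideration, which mirrors the idea that some values are unacceptable rather than merely poor.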

The following lists the general steps to perform overlay analysis:

1. Define the problem.
2. Break the problem into submodels.
3. Determine significant layers.
4. Reclassify or transform the data within a layer.
5. Weight the input layers.
6. Add or combine the layers.
7. Select the best locations.
8. Analyze.

Steps 1 through 3 are common steps for nearly all spatial problem solving and are particularly important in overlay analysis.

1. Define the problem

Defining the problem is one of the most difficult aspects of the modeling process. The overall objective must be identified. All aspects of the remaining steps of the overlay modeling process must contribute to this overall objective.

The components relating to the objective must be defined. Some of the components may be complementary and others competitive. However, a clear definition of each component and how they interact must be established.

It is important not only to identify what the problem is but also to develop a clear understanding of when the problem is solved, or when the phenomenon is satisfied. In the problem definition, specific measures should be established to identify the success of the outcome from the model.

For example, when identifying the best location for a ski resort, the overall goal may be to make money. All factors that are identified in the model should help the ski area be profitable.

2. Break the problem into submodels

Most overlay problems are complex, so it is recommended that you break them into submodels for clarity, to organize your thoughts, and to solve the overlay problem more effectively.

For example, a suitability model for identifying the best location for a ski resort can be broken into a series of submodels that should help the ski area be profitable. The first submodel can be a terrain submodel identifying locations that have a wide variety of favorable terrain for skiers and snowboarders.

Making sure people can reach the ski area can be captured in an accessibility submodel. Included in the submodel can be access from major cities as well as local road access.

A cost submodel can identify the locations that would be optimal to build on. This submodel may identify flatter slopes as well as those close to power and water as being favorable.

Certain attributes or layers can be in multiple submodels. For example, steep slopes might be favorable in the terrain submodel but detrimental in the cost submodel.

3. Determine significant layers

The attributes or layers that affect each submodel need to be identified. Each factor captures and describes a component of the phenomenon the submodel is defining. Each factor contributes to the goals of the submodel, and each submodel contributes to the overall goal of the overlay model. All factors that contribute to defining the phenomenon, and only those factors, should be included in the overlay model.

For certain factors, the layers may need to be created. For example, it may be more desirable to be closer to a major road. To determine how far each cell is from a road, the Euclidean Distance tool can be run to create a distance raster.
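To illustrate what a distance raster contains, the following brute-force sketch computes the straight-line distance from every cell to the nearest road cell. The grid, the cell size, and the function name are assumptions for illustration; it conveys the concept only and does not reproduce the Euclidean Distance tool or its algorithm.

```python
import math

# Hypothetical 4 x 4 grid: 1 marks a road cell, 0 marks any other cell.
roads = [
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
CELL_SIZE = 30.0  # assumed cell size in meters


def euclidean_distance(source, cell_size):
    """Return a raster of distances from each cell to the nearest source cell."""
    sources = [
        (r, c)
        for r, row in enumerate(source)
        for c, value in enumerate(row)
        if value == 1
    ]
    if not sources:
        raise ValueError("the source raster contains no source cells")
    return [
        [
            min(math.hypot(r - sr, c - sc) for sr, sc in sources) * cell_size
            for c in range(len(row))
        ]
        for r, row in enumerate(source)
    ]


distance = euclidean_distance(roads, CELL_SIZE)

for row in distance:
    print([round(d, 1) for d in row])
```

Comparing every cell with every source cell is slow on large rasters. It is used here only because it makes the definition of the output easy to read.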

4. Reclassification/transformation

Values measured in different numbering systems cannot be meaningfully combined directly. For example, adding slope to land use would produce meaningless results. The four main numbering systems are the following:

Ratio—The ratio scale has a reference point, usually zero, and the numbers within the scale are comparable. For example, elevation values are ratio numbers, and an elevation of 50 meters is half as high as 100 meters.
Interval—The values in an interval scale are relative to one another; however, there is not a common reference point. For example, a pH scale is of type interval, where the higher the value is above the neutral value of 7, the more alkaline it is, and the lower the value is below 7, the more acidic it is. However, the values are not fully comparable. For example, a pH of 2 is not twice as acidic as a pH of 4.
Ordinal—An ordinal scale establishes order, such as who came in first, second, and third in a race. Order is established, but the assigned order values cannot be directly compared. For example, the person who came in first was not necessarily twice as fast as the person who came in second.
Nominal—There is no relationship between the assigned values in the nominal scale. For example, land-use values, which are nominal values, cannot be compared to one another. A land use of 8 is probably not twice as much as a land use of 4.

Because input layers may have different ranges of values and different types of numbering systems, each layer must be reclassified or transformed to a common ratio scale before the multiple factors can be combined for analysis.

Common scales can be predetermined, such as a 1 to 9 or a 1 to 10 scale, with the higher value being more favorable, or the scale can be on a 0 to 1 scale, defining the possibility of belonging to a specific set.
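One simple way to move values onto a common scale is a linear transformation, sketched below. The linear form, the function name rescale, its invert parameter, and the 0 to 2,000 meter input range are assumptions chosen for illustration; other transformation functions may suit a given factor better.

```python
def rescale(value, in_min, in_max, out_min, out_max, invert=False):
    """Linearly map value from [in_min, in_max] to [out_min, out_max].

    Values outside the input range are clamped to its ends. With
    invert=True, low input values receive high output values.
    """
    fraction = (value - in_min) / (in_max - in_min)
    fraction = min(1.0, max(0.0, fraction))
    if invert:
        fraction = 1.0 - fraction
    return out_min + fraction * (out_max - out_min)


# Distance to utilities in meters: closer is more favorable, so invert.
on_1_to_9 = rescale(500.0, 0.0, 2000.0, 1.0, 9.0, invert=True)
on_0_to_1 = rescale(500.0, 0.0, 2000.0, 0.0, 1.0, invert=True)

print(on_1_to_9)  # 7.0
print(on_0_to_1)  # 0.75
```

The same distance of 500 meters becomes 7.0 on the 1 to 9 scale and 0.75 on the 0 to 1 scale, so it can now be combined with other factors expressed on the same scale.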

5. Weight

Certain factors may be more important to the overall goal than others. If this is the case, the factors can be weighted based on their importance before they are combined. For example, in the cost submodel for siting the ski resort, the slope criterion may be twice as important to the cost of construction as the distance from a road. Therefore, before the two layers are combined, the slope layer should be given a weight twice as large as the weight given to distance to roads.
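The two-to-one relationship in this example can be turned into weights that sum to 1, as in the sketch below. Normalizing the weights is an assumption made here so that the weighted result stays on the same scale as the inputs; the factor names and the cell scores are hypothetical.

```python
# Relative importance from the example: slope counts twice as much
# as distance to roads.
relative_importance = {"slope": 2.0, "road_distance": 1.0}

# Normalize so the weights sum to 1.
total = sum(relative_importance.values())
weights = {name: value / total for name, value in relative_importance.items()}

# Hypothetical scores for one cell, both already on a 1 to 9 scale.
cell_scores = {"slope": 8.0, "road_distance": 5.0}
weighted_score = sum(weights[name] * cell_scores[name] for name in weights)

print(weights)
print(round(weighted_score, 3))
```

With these numbers the slope weight is two thirds and the road weight is one third, so the cell's weighted score lies closer to its slope score than to its road score.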

6. Add/Combine

In overlay analysis, it is desirable to establish the relationship of all the input factors together to identify the desirable locations that meet the goals of the model. For example, the input layers, once weighted appropriately, can be added together in an additive weighted overlay model. In this combination approach, it is assumed that the more favorable the factors, the more desirable the location will be. Thus, the higher the value on the resulting output raster, the more desirable the location will be.
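An additive weighted overlay can be sketched as a cell-by-cell weighted sum of the input layers. The layer names, scores, and weights below are hypothetical, all layers are assumed to share the same extent and a common 1 to 9 scale, and plain Python lists stand in for rasters.

```python
def weighted_sum(layers, weights):
    """Return the cell-by-cell sum of each layer multiplied by its weight."""
    names = list(layers)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    return [
        [
            sum(weights[name] * layers[name][r][c] for name in names)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical layers, each already reclassified to a 1 to 9 scale.
layers = {
    "slope": [[9, 3], [6, 1]],
    "road_distance": [[5, 7], [9, 1]],
    "land_cost": [[1, 9], [5, 1]],
}
weights = {"slope": 0.5, "road_distance": 0.25, "land_cost": 0.25}

suitability = weighted_sum(layers, weights)

for row in suitability:
    print(row)
```

In this example the lower-left cell scores highest (6.5) because it is moderately to highly favorable on every factor, while the lower-right cell scores lowest (1.0).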

Other combining approaches can be applied. For example, in a fuzzy logic overlay analysis, the combination approaches explore the possibility of membership of a location to multiple sets.
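As a brief sketch of how membership values can be combined, the code below applies the two standard fuzzy set operators: AND takes the minimum membership across the sets, and OR takes the maximum. The membership rasters and the function name are hypothetical, and the choice of these two operators is an assumption for illustration; other fuzzy combination operators also exist.

```python
def fuzzy_overlay(layers, combine):
    """Combine membership rasters cell by cell using the given function."""
    return [
        [combine(cells) for cells in zip(*rows)]
        for rows in zip(*layers)
    ]


# Hypothetical membership values on a 0 to 1 scale.
gentle_slope = [[0.9, 0.2], [0.6, 1.0]]
near_road = [[0.4, 0.8], [0.7, 0.0]]

# AND: a cell belongs to both sets only as strongly as its weaker membership.
fuzzy_and = fuzzy_overlay([gentle_slope, near_road], min)
# OR: a cell belongs to either set as strongly as its stronger membership.
fuzzy_or = fuzzy_overlay([gentle_slope, near_road], max)

print(fuzzy_and)
print(fuzzy_or)
```

Unlike the additive approach, the AND operator does not let a strong value in one layer compensate for a weak value in another.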

7. Select the best locations

In most overlay analysis and suitability models, identifying the best locations for the phenomenon you are modeling is the ultimate goal. This phenomenon will have specific size and spatial requirements to function effectively. These requirements include the total area necessary to function, the number of regions this area should be distributed among, the shape characteristics for the regions, and the minimum and maximum distance between the regions.

The Locate Regions tool allows you to identify the best combinations of desired regions that meet the defined spatial constraints.
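The idea of selecting regions under spatial constraints can be illustrated with a much simpler stand-in: group adjacent cells that meet a suitability threshold, discard groups that are too small, and rank the rest by average suitability. This sketch is not the Locate Regions algorithm. The threshold, the minimum size, the four-neighbor adjacency rule, and the raster values are all assumptions for illustration.

```python
def find_regions(suitability, threshold):
    """Group edge-adjacent cells whose value is at least the threshold."""
    rows, cols = len(suitability), len(suitability[0])
    seen = set()
    regions = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or suitability[r][c] < threshold:
                continue
            # Flood fill outward from this unvisited suitable cell.
            stack, cells = [(r, c)], []
            seen.add((r, c))
            while stack:
                cr, cc = stack.pop()
                cells.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen
                            and suitability[nr][nc] >= threshold):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            regions.append(cells)
    return regions


def mean_score(suitability, cells):
    """Return the average suitability of the given cells."""
    return sum(suitability[r][c] for r, c in cells) / len(cells)


# Hypothetical suitability raster on a 1 to 9 scale.
suitability = [
    [8, 8, 2, 9],
    [7, 8, 2, 2],
    [2, 2, 2, 7],
    [2, 9, 9, 7],
]
THRESHOLD = 7  # assumed minimum acceptable suitability
MIN_CELLS = 3  # assumed minimum region size in cells

candidates = [
    cells for cells in find_regions(suitability, THRESHOLD)
    if len(cells) >= MIN_CELLS
]
# Rank the remaining regions from most to least suitable on average.
candidates.sort(key=lambda cells: mean_score(suitability, cells), reverse=True)

for cells in candidates:
    print(sorted(cells), mean_score(suitability, cells))
```

The single cell with a value of 9 in the upper-right corner is dropped because it is too small to function as a region, even though it has the highest individual score.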

8. Analyze

The final step in the modeling process is for you to analyze the results. Do the potential ideal locations sensibly meet the criteria? It may be beneficial not only to explore the best locations identified by the model but to also investigate the second and third most favorable sites.

The identified locations should be visited. You need to validate that what you believe to be there is actually there. Things could have changed since the data for the model was created. For example, views may be one of the input criteria to the model; the better the view, the more preferred the location will be. From the input elevation data, the model identified the locations with the best views; however, when one of the favorable sites is visited, it is discovered that a building has been constructed in front of the location, obstructing the view.

A location is then selected based on the input from all of the steps above.