Create Spatial Sampling Locations (Data Management)—ArcGIS Pro

Summary

Creates sample locations within a continuous study area using simple random, stratified, systematic (gridded), or cluster sampling designs.

Sampling is the process of selecting individuals from a population in order to study them and make inferences about the entire population. Continuous spatial sampling treats the population as a continuous area from which any location or area can be sampled. For example, you can use this tool to create sample locations for trees within a dense forest or to collect soil moisture measurements in a crop field. This tool is not appropriate for sampling discrete populations such as households, animals, or cities.

Usage

The input study area must be a polygon feature class or an integer (categorical) raster. You can also draw the study area on a map using interactive feature input. For rasters, cells with null values will not be considered part of the study area.
Sample locations can be created for the following primary sampling designs:
- Simple random sampling—Creates sample points randomly within the study area. Each location in the study area is equally likely to be selected as a sample location. The study area will be treated as a single area, and all boundaries between polygons or raster categories will be ignored (for example, a polygon feature class of all counties within a state will define the same study area as a single polygon of the entire state). Simple random sampling is useful when you want to investigate the entire study area, but no location is more important for sampling than any other location. To perform simple random sampling, choose the Simple random option of the Sampling Method parameter.
  - Example application: If the study area is a dense forest where every location can be assumed to have a tree, simple random sampling can be used to randomly sample trees within the forest.
- Stratified random sampling—Creates sample points by dividing the study area into distinct strata (such as soil class or land use type) and performing simple random sampling separately within each stratum. Stratified random sampling is useful when you want to ensure that all strata are represented in the sample. To perform stratified random sampling, choose one of the three stratification options of the Sampling Method parameter (see the next usage tip for information about each type of stratification).
  - Example application: If a national park is divided into elevation classes, stratified random sampling can be used to collect soil samples separately for each elevation class. This ensures that there will be sufficient soil sampling across all elevations in the park.
- Systematic sampling—Creates sample locations in a gridded, nonrandom pattern within the study area. The grid is created by a tessellation of regularly shaped polygons (such as hexagons, squares, or triangles). The sample locations can be returned as the tessellated polygons or as points (the centroids of the tessellated polygons). Systematic sampling is useful for ensuring that no sections of the study area are sampled more than others, which is often desirable when the goal is to create a map of the samples rather than to make inferences about the entire study area. To perform systematic sampling, choose the Systematic option of the Sampling Method parameter.
  - Example application: To study the ocean floor in a marine area, you can create a hexagonal grid of sample locations to sample marine plant species.
- Cluster sampling—Creates sample polygons by creating a systematic sample and randomly selecting some of the polygons from the tessellation. The resulting polygons are called clusters, and typically these clusters are exhaustively studied, sampling as much as possible within each cluster. Cluster sampling is useful when you are most interested in how samples interact with each other at short distances, and it is acceptable for large sections of the study area to have no samples. To perform cluster sampling, choose the Cluster option of the Sampling Method parameter.
  - Example application: When sampling insect colonies, cluster sampling can be used to create small areas of a plot, and all insect colonies within the clusters will be sampled.
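The relationship between systematic and cluster sampling can be sketched in plain Python. This is an illustration only, not the tool's implementation: it assumes a rectangular study area and square bins, it tiles the rectangle from its lower left corner for simplicity, and the helper names (`square_grid`, `centroids`, `cluster_sample`) are hypothetical.

```python
import math
import random

def square_grid(xmin, ymin, xmax, ymax, cell):
    """Return (x, y, size) squares that tile the rectangle, row by row."""
    nx = math.ceil((xmax - xmin) / cell)
    ny = math.ceil((ymax - ymin) / cell)
    return [(xmin + i * cell, ymin + j * cell, cell)
            for j in range(ny) for i in range(nx)]

def centroids(cells):
    """Systematic sampling as points: one centroid per tessellated polygon."""
    return [(x + s / 2, y + s / 2) for x, y, s in cells]

def cluster_sample(cells, k, seed=None):
    """Cluster sampling: randomly keep k polygons of the tessellation."""
    rng = random.Random(seed)
    return rng.sample(cells, k)

grid = square_grid(0, 0, 100, 50, 10)          # systematic sample as polygons
systematic_points = centroids(grid)            # systematic sample as points
clusters = cluster_sample(grid, 10, seed=42)   # cluster sample (polygons)
```

The systematic sample keeps every bin, so coverage is even. The cluster sample keeps only a random subset, which leaves parts of the study area unsampled.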
For stratified sampling, you can define the strata in three ways. Each is available as an option of the Sampling Method parameter:
- Stratify by individual polygon—Each record in a polygon feature class is a different stratum. For example, if the study area is a field with subplots stored as separate polygons, sample points will be created separately for each subplot. The input study area must be polygons.
- Stratify by contiguous raster region—Each region of an integer (categorical) raster will be a stratum. A raster region is a contiguous block of cells with the same value (from the Value field) that are connected by shared cell edges. If two regions have the same value but are disconnected from each other, they will be different strata. The input study area must be a raster.
- Stratify by strata ID field—All polygons or raster cells with the same strata ID value will be a stratum. The polygons or raster cells do not have to be contiguous to be in the same stratum. Provide the field containing the strata ID values in the Strata ID Field parameter. The field must be integer or text.
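The difference between stratifying by contiguous raster region and stratifying by value can be illustrated with a small flood fill over a nested list. This is a sketch under stated assumptions, not the tool's algorithm: the raster is a list of rows, `None` stands in for null cells, and `label_regions` is a hypothetical helper. Only shared cell edges connect cells, as described above.

```python
from collections import deque

def label_regions(raster):
    """Label edge-connected blocks of equal value; None cells stay unlabeled."""
    rows, cols = len(raster), len(raster[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if raster[r][c] is None or labels[r][c] is not None:
                continue
            value = raster[r][c]
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                # Only the four edge neighbors count; diagonals do not connect.
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] is None
                            and raster[nr][nc] == value):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels, next_label

raster = [
    [1, 1, 2],
    [2, 1, 2],
    [2, None, 1],
]
labels, n_regions = label_regions(raster)
n_values = len({v for row in raster for v in row if v is not None})
```

In this example the raster has two distinct values but four contiguous regions, so stratifying by region yields four strata while stratifying by the value as a strata ID yields two.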
You can specify the number of samples that will be created in each stratum using one of the following options of the Strata Sample Count Allocation Method parameter:
- Equal count in each stratum—An equal number of samples will be created in each stratum. Provide the value in the Number of Samples Per Stratum parameter.
- Count proportional to stratum area—The number of samples in the strata will be proportional to the size of the strata. Provide the overall number of samples in the Number of Samples parameter, and the total count will be distributed to each stratum proportionally to its area.
- Count equal to population field—The number of samples in each stratum will be equal to the values of a population field. Provide the field in the Population Field parameter. The field cannot contain negative values and must be an integer type.
- Count proportional to population field—The number of samples in each stratum will be proportional to the values of a population field. Provide the field in the Population Field parameter and the overall number of samples in the Number of Samples parameter.
The tool can also be used to create more advanced sampling designs that are not available as explicit options of the Sampling Method parameter.
- Two-stage cluster sampling—Creates clusters of points throughout the study area by first creating a cluster sample and then creating points (simple random, stratified, or systematic) within each cluster. This sampling design is useful when a cluster sample is needed, but it is not feasible to exhaustively study each cluster polygon. It is also useful when you are primarily interested in how samples interact at short distances. To perform two-stage cluster sampling, first use the tool to create a cluster sample, then use the cluster polygons as the input study area in a simple random, stratified, or systematic sampling design.
- Mixed (composite) sampling—Separately creates sampling locations from different sampling designs and then merges them into a single dataset. For example, combining a simple random sample and a two-stage cluster sample will produce sampling locations across the entire study area (simple random) but also include small patches with more points (two-stage cluster). This is useful because simple random sampling on its own can miss how samples interact at short distances, but two-stage cluster sampling leaves large areas of the study area with no sample locations. By combining the two, you can ensure that the entire study area is represented and still investigate the interaction between samples at short distances.
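The two-stage cluster workflow can be sketched as two runs of the tool, with the output of the first run used as the study area of the second. The parameter names and keyword values below come from the Python syntax of this tool. The dataset names, the helper functions, and the choice of stratifying by individual polygon in the second stage are assumptions made for illustration. The geoprocessing call itself requires an ArcGIS Pro Python environment, so `arcpy` is imported only inside the function that runs it.

```python
def two_stage_cluster_params(study_area, clusters_fc, points_fc,
                             n_clusters=10, points_per_cluster=5):
    """Build keyword arguments for both runs; no geoprocessing happens here."""
    stage_one = {
        "in_study_area": study_area,
        "out_features": clusters_fc,
        "sampling_method": "CLUSTER",
        "bin_shape": "HEXAGON",
        "num_samples": n_clusters,
    }
    stage_two = {
        # The cluster polygons from stage one become the study area.
        "in_study_area": clusters_fc,
        "out_features": points_fc,
        "sampling_method": "STRAT_POLY",
        "strata_count_method": "EQUAL",
        "num_samples_per_strata": points_per_cluster,
    }
    return stage_one, stage_two

def run_two_stage(stage_one, stage_two):
    """Run both stages (assumes arcpy is available)."""
    import arcpy
    arcpy.management.CreateSpatialSamplingLocations(**stage_one)
    return arcpy.management.CreateSpatialSamplingLocations(**stage_two)

# Hypothetical dataset names.
stage_one, stage_two = two_stage_cluster_params(
    "forest_boundary", "forest_clusters", "forest_cluster_points")
```

Treating each cluster polygon as its own stratum gives every cluster the same number of points. A simple random or systematic second stage would instead treat the clusters as a single study area.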
A warning will be returned if the specified number of sample locations cannot be created. This can occur in the following situations:
- The value of the Minimum Distance Between Sample Points parameter is sufficiently large that the specified number of sample locations cannot be created within the study area (or stratum) without some points being closer to each other than the minimum distance. In this case, fewer locations will be created than were specified.
- If the Bin Size parameter value is provided as a count, it is not always possible to create the specified number of sample locations in the study area. The tool will try various area values and use the area that creates a sample count closest to the specified value. The area (in the unit of the output coordinate system) and the resulting number of sample locations will be returned as geoprocessing messages.
If the specified parameters do not create any sample locations (such as using an output extent that does not intersect the study area), an error will be returned.
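The effect of a large minimum distance can be reproduced with a simple rejection scheme. The tool's actual point placement algorithm is not described here, so this sketch is an assumption: candidate points are drawn uniformly in a rectangle and kept only when they are far enough from all accepted points. The function name and the `max_tries` limit are hypothetical.

```python
import math
import random

def sample_with_min_distance(width, height, n, min_dist, max_tries=2000, seed=None):
    """Draw up to n random points that are pairwise at least min_dist apart."""
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        tries += 1
        candidate = (rng.uniform(0, width), rng.uniform(0, height))
        if all(math.dist(candidate, p) >= min_dist for p in points):
            points.append(candidate)
    return points

requested = 100
points = sample_with_min_distance(10, 10, requested, min_dist=5, seed=1)
if len(points) < requested:
    print(f"Warning: only {len(points)} of {requested} locations could be created")
```

A 10 by 10 area cannot hold 100 points that are all 5 units apart, so the sketch returns fewer points than requested, which mirrors the warning case described above.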
For systematic and cluster sampling and any bin shape except H3 hexagons, the centroid of the first polygon of the tessellation is created at the lower left corner of the output extent. For H3 hexagons, the hexagons are at fixed locations. For all bin shapes, you can use the Spatial Relationship parameter to return the polygons that intersect, are completely within, or have centroids that are within the study area.
If you stratify by strata ID field and use a population field (equal or proportional), the population of each stratum will be the sum of the population field values of every polygon or raster category in the stratum.
If you stratify by contiguous raster region, you cannot use a population field. This is because each population field value represents the total population of a raster category even if the category is composed of multiple disjoint regions. To use population fields while stratifying by contiguous raster region, use the Raster To Polygon tool to convert the raster to polygons and assign population values to each polygon (for example, by allocating the population of each category proportionally to the number of cells in each of its regions).
For stratified sampling with strata sample count proportional to area or a population field, the Largest Remainder Method is used to ensure that the overall sample count is not altered due to rounding.
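As an illustration of the Largest Remainder Method, the following sketch allocates a total sample count to strata in proportion to a weight (area or population). It shows the standard form of the method: each stratum first receives the whole-number part of its proportional share, and the remaining samples go to the strata with the largest fractional parts. The function name, the example strata, and the tie-breaking order are assumptions, not details of the tool.

```python
import math

def largest_remainder(weights, total):
    """Split total into integer counts proportional to weights."""
    weight_sum = sum(weights.values())
    quotas = {k: total * w / weight_sum for k, w in weights.items()}
    counts = {k: math.floor(q) for k, q in quotas.items()}
    leftover = total - sum(counts.values())
    # Hand out the leftover samples by descending fractional remainder.
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts

areas = {"forest": 50.0, "grassland": 30.0, "wetland": 20.0}
allocation = largest_remainder(areas, 7)
```

Here the proportional shares are 3.5, 2.1, and 1.4. Rounding each share down gives 6 samples, and the one leftover sample goes to the stratum with the largest remainder (forest), so the counts still sum to the requested 7.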

Parameters

Label	Explanation	Data Type
Input Study Area	The input study area where sample locations will be created. The study area must be polygons or an integer (categorical) raster. For rasters, cells with null values will not be included in the study area.	Feature Layer; Raster Layer
Output Features	The output features representing the sample locations. For simple random and stratified sampling, the output features will be points. For cluster sampling, the output will be polygons. For systematic sampling, the output can be points or polygons.	Feature Class
Sampling Method (Optional)	Specifies the sampling method that will be used to create the sample locations. Simple random—Points will be randomly created in the study area, and all locations have the same likelihood of being sampled. All boundaries between individual polygons or raster regions will be ignored. This is the default. Stratified by individual polygon—Each polygon will be a different stratum, and points will be randomly and independently created in each polygon. The input study area must be polygons. Stratified by contiguous raster region—Each region of a categorical raster will be a stratum, and sample points will be randomly and independently created in each region. A raster region is a contiguous block of cells with the same value that are connected by shared cell edges. If two regions have the same value but are not connected by shared edges, they will be different strata. The input study area must be a raster. Stratified by strata ID field—Each polygon or raster region with the same strata ID field value will be a stratum, and sample points will be randomly and independently created in each stratum. The polygons or raster cells are not required to be contiguous to be in the same stratum. Systematic—Sample locations will be created using a gridded tessellation in the study area. The sample locations can be created as polygons or as points (centroids of the tessellated polygons). Cluster—Sample polygons will be created by randomly selecting polygons from a tessellation of the study area.	String
Strata ID Field (Optional)	For stratified sampling by strata ID field, the strata ID field defining the strata.	Field
Strata Sample Count Allocation Method (Optional)	For stratified sampling, specifies the method that will be used to determine the number of sample locations that will be created in each stratum. Equal count in each stratum—The same number of sample locations will be created in each stratum. Provide the value in the Number of Samples Per Stratum parameter. This is the default. Count proportional to stratum area—The number of sample locations in each stratum will be proportional to the area of the stratum. Provide the total number of samples in the Number of Samples parameter. Count equal to population field—The number of sample locations in each stratum will be equal to the values of a population field. Provide the field in the Population Field parameter. This option is not available when stratifying by contiguous raster region. Count proportional to population field—The number of sample locations in each stratum will be proportional to the values of a population field. Provide the field in the Population Field parameter and the total number of samples in the Number of Samples parameter. This option is not available when stratifying by contiguous raster region.	String
Bin Shape (Optional)	For systematic and cluster sampling, specifies the shape of each polygon in the gridded tessellation. Hexagon—Hexagon-shaped features will be generated. The top and bottom side of each hexagon will be parallel with the x-axis of the coordinate system (the top and bottom are flat). Transverse hexagon—Transverse hexagon-shaped features will be generated. The right and left side of each hexagon will be parallel with the y-axis of the dataset's coordinate system (the top and bottom are pointed). Square—Square-shaped features will be generated. The top and bottom side of each square will be parallel with the x-axis of the coordinate system, and the right and left sides will be parallel with the y-axis of the coordinate system. Diamond—Diamond-shaped features will be generated. The sides of each polygon will be rotated 45 degrees away from the x-axis and y-axis of the coordinate system. Triangle—Triangular-shaped features will be generated. Each triangle will be a regular three-sided equilateral polygon. H3 hexagon—Hexagon-shaped features will be generated based on the H3 Hexagonal hierarchical geospatial indexing system.	String
Bin Size [count or area] (Optional)	For systematic and cluster sampling, the size of each polygon in the tessellation. The value can be provided as a count (the total number of tessellated polygons created in the study area) or as an area (the area of each tessellated polygon). For count input, the default is 100. For area input, a value must be provided. If a count is provided, the tool will attempt to create the specified number of sample locations. If the exact number cannot be created, a warning will be returned.	Areal Unit; Long
H3 Resolution (Optional)	For systematic or cluster sampling with H3 hexagon bins, specifies the H3 resolution of the hexagons. With each increasing resolution value, the area of the polygons will be one seventh the size. 0—Hexagons will be created at the H3 resolution of 0, with an average area of 4,357,449.416078381 square kilometers. 1—Hexagons will be created at the H3 resolution of 1, with an average area of 609,788.441794133 square kilometers. 2—Hexagons will be created at the H3 resolution of 2, with an average area of 86,801.780398997 square kilometers. 3—Hexagons will be created at the H3 resolution of 3, with an average area of 12,393.434655088 square kilometers. 4—Hexagons will be created at the H3 resolution of 4, with an average area of 1,770.347654491 square kilometers. 5—Hexagons will be created at the H3 resolution of 5, with an average area of 252.903858182 square kilometers. 6—Hexagons will be created at the H3 resolution of 6, with an average area of 36.129062164 square kilometers. 7—Hexagons will be created at the H3 resolution of 7, with an average area of 5.161293360 square kilometers. This is the default. 8—Hexagons will be created at the H3 resolution of 8, with an average area of 0.737327598 square kilometers. 9—Hexagons will be created at the H3 resolution of 9, with an average area of 0.105332513 square kilometers. 10—Hexagons will be created at the H3 resolution of 10, with an average area of 0.015047502 square kilometers. 11—Hexagons will be created at the H3 resolution of 11, with an average area of 0.002149643 square kilometers. 12—Hexagons will be created at the H3 resolution of 12, with an average area of 0.000307092 square kilometers. 13—Hexagons will be created at the H3 resolution of 13, with an average area of 0.000043870 square kilometers. 14—Hexagons will be created at the H3 resolution of 14, with an average area of 0.000006267 square kilometers. 15—Hexagons will be created at the H3 resolution of 15, with an average area of 0.000000895 square kilometers.	Long
Number of Samples (Optional)	The number of sample locations that will be created. This parameter always applies to simple random and cluster sampling. For stratified sampling, this parameter applies when the sample count will be proportional to stratum area or proportional to a population field. For simple random and stratified sampling, the default is 100. For cluster sampling, the default is 10.	Long
Number of Samples Per Stratum (Optional)	For stratified sampling with equal sample count in each stratum, the number of sample locations created within each stratum. The total number of samples will be this value multiplied by the number of strata. The default is 100.	Long
Population Field (Optional)	The population field for stratified sampling when the sample count is equal or proportional to a population field.	Field
Output Geometry Type (Optional)	For systematic sampling, specifies whether the sample locations will be tessellated polygons or centroids (points) of the tessellated polygons. Point—Centroids of the tessellated polygons will be created as sample locations. This is the default. Polygon—Tessellated polygons will be created as sampling locations.	String
Minimum Distance Between Sample Points (Optional)	For simple random and stratified sampling, the smallest allowed distance between sample locations. For simple random sampling, all points will be at least this distance apart. For stratified sampling, points within the same stratum will be at least this distance apart, but points in neighboring strata may be closer than this distance. For large distances, fewer sample locations than were expected may be created to keep the locations sufficiently far apart. In this case, a warning message will be returned.	Linear Unit
Spatial Relationship (Optional)	Specifies which polygons from a background tessellation will be included as sampling locations. This parameter applies to cluster sampling and to systematic sampling when the output geometry type is polygon. Have their center in—The centroids of the polygons must be within the study area to be included. This is the default. Completely within—The polygons must be completely within the study area to be included. Intersect—The polygons must intersect the study area to be included.	String
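
The average areas listed for the H3 Resolution parameter can be used to check the one-seventh relationship and to choose a resolution for a target bin area. The areas below are copied from the table above. The helper names (`area_ratios`, `coarsest_resolution_at_most`) are hypothetical and are not part of the tool.

```python
# Average hexagon area in square kilometers for each H3 resolution.
H3_AVG_AREA_KM2 = {
    0: 4357449.416078381,
    1: 609788.441794133,
    2: 86801.780398997,
    3: 12393.434655088,
    4: 1770.347654491,
    5: 252.903858182,
    6: 36.129062164,
    7: 5.161293360,
    8: 0.737327598,
    9: 0.105332513,
    10: 0.015047502,
    11: 0.002149643,
    12: 0.000307092,
    13: 0.000043870,
    14: 0.000006267,
    15: 0.000000895,
}

def area_ratios(areas):
    """Ratio of each resolution's average area to the next finer one."""
    return [areas[r] / areas[r + 1] for r in range(len(areas) - 1)]

def coarsest_resolution_at_most(target_km2, areas=H3_AVG_AREA_KM2):
    """Lowest resolution whose average area does not exceed the target, or None."""
    for res in sorted(areas):
        if areas[res] <= target_km2:
            return res
    return None

ratios = area_ratios(H3_AVG_AREA_KM2)
resolution_for_10_km2 = coarsest_resolution_at_most(10.0)
```

The ratios are close to 7 rather than exactly 7, with the largest deviation between resolutions 0 and 1, so the one-seventh rule is best read as an approximation of the average areas.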

arcpy.management.CreateSpatialSamplingLocations(in_study_area, out_features, {sampling_method}, {strata_id_field}, {strata_count_method}, {bin_shape}, {bin_size}, {h3_resolution}, {num_samples}, {num_samples_per_strata}, {population_field}, {geometry_type}, {min_distance}, {spatial_relationship})

Name	Explanation	Data Type
in_study_area	The input study area where sample locations will be created. The study area must be polygons or an integer (categorical) raster. For rasters, cells with null values will not be included in the study area.	Feature Layer; Raster Layer
out_features	The output features representing the sample locations. For simple random and stratified sampling, the output features will be points. For cluster sampling, the output will be polygons. For systematic sampling, the output can be points or polygons.	Feature Class
sampling_method (Optional)	Specifies the sampling method that will be used to create the sample locations. RANDOM—Points will be randomly created in the study area, and all locations have the same likelihood of being sampled. All boundaries between individual polygons or raster regions will be ignored. This is the default. STRAT_POLY—Each polygon will be a different stratum, and points will be randomly and independently created in each polygon. The input study area must be polygons. STRAT_RAST—Each region of a categorical raster will be a stratum, and sample points will be randomly and independently created in each region. A raster region is a contiguous block of cells with the same value that are connected by shared cell edges. If two regions have the same value but are not connected by shared edges, they will be different strata. The input study area must be a raster. STRAT_ID—Each polygon or raster region with the same strata ID field value will be a stratum, and sample points will be randomly and independently created in each stratum. The polygons or raster cells are not required to be contiguous to be in the same stratum. SYSTEMATIC—Sample locations will be created using a gridded tessellation in the study area. The sample locations can be created as polygons or as points (centroids of the tessellated polygons). CLUSTER—Sample polygons will be created by randomly selecting polygons from a tessellation of the study area.	String
strata_id_field (Optional)	For stratified sampling by strata ID field, the strata ID field defining the strata.	Field
strata_count_method (Optional)	For stratified sampling, specifies the method that will be used to determine the number of sample locations that will be created in each stratum. EQUAL—The same number of sample locations will be created in each stratum. Provide the value in the num_samples_per_strata parameter. This is the default. PROP_AREA—The number of sample locations in each stratum will be proportional to the area of the stratum. Provide the total number of samples in the num_samples parameter. FIELD—The number of sample locations in each stratum will be equal to the values of a population field. Provide the field in the population_field parameter. This option is not available when stratifying by contiguous raster region. PROP_FIELD—The number of sample locations in each stratum will be proportional to the values of a population field. Provide the field in the population_field parameter and the total number of samples in the num_samples parameter. This option is not available when stratifying by contiguous raster region.	String
bin_shape (Optional)	For systematic and cluster sampling, specifies the shape of each polygon in the gridded tessellation. HEXAGON—Hexagon-shaped features will be generated. The top and bottom side of each hexagon will be parallel with the x-axis of the coordinate system (the top and bottom are flat). TRANSVERSE_HEXAGON—Transverse hexagon-shaped features will be generated. The right and left side of each hexagon will be parallel with the y-axis of the dataset's coordinate system (the top and bottom are pointed). SQUARE—Square-shaped features will be generated. The top and bottom side of each square will be parallel with the x-axis of the coordinate system, and the right and left sides will be parallel with the y-axis of the coordinate system. DIAMOND—Diamond-shaped features will be generated. The sides of each polygon will be rotated 45 degrees away from the x-axis and y-axis of the coordinate system. TRIANGLE—Triangular-shaped features will be generated. Each triangle will be a regular three-sided equilateral polygon. H3_HEXAGON—Hexagon-shaped features will be generated based on the H3 Hexagonal hierarchical geospatial indexing system.	String
bin_size (Optional)	For systematic and cluster sampling, the size of each polygon in the tessellation. The value can be provided as a count (the total number of tessellated polygons created in the study area) or as an area (the area of each tessellated polygon). For count input, the default is 100. For area input, a value must be provided. If a count is provided, the tool will attempt to create the specified number of sample locations. If the exact number cannot be created, a warning will be returned.	Areal Unit; Long
h3_resolution (Optional)	For systematic or cluster sampling with H3 hexagon bins, specifies the H3 resolution of the hexagons. With each increasing resolution value, the area of the polygons will be one seventh the size. 0—Hexagons will be created at the H3 resolution of 0, with an average area of 4,357,449.416078381 square kilometers. 1—Hexagons will be created at the H3 resolution of 1, with an average area of 609,788.441794133 square kilometers. 2—Hexagons will be created at the H3 resolution of 2, with an average area of 86,801.780398997 square kilometers. 3—Hexagons will be created at the H3 resolution of 3, with an average area of 12,393.434655088 square kilometers. 4—Hexagons will be created at the H3 resolution of 4, with an average area of 1,770.347654491 square kilometers. 5—Hexagons will be created at the H3 resolution of 5, with an average area of 252.903858182 square kilometers. 6—Hexagons will be created at the H3 resolution of 6, with an average area of 36.129062164 square kilometers. 7—Hexagons will be created at the H3 resolution of 7, with an average area of 5.161293360 square kilometers. This is the default. 8—Hexagons will be created at the H3 resolution of 8, with an average area of 0.737327598 square kilometers. 9—Hexagons will be created at the H3 resolution of 9, with an average area of 0.105332513 square kilometers. 10—Hexagons will be created at the H3 resolution of 10, with an average area of 0.015047502 square kilometers. 11—Hexagons will be created at the H3 resolution of 11, with an average area of 0.002149643 square kilometers. 12—Hexagons will be created at the H3 resolution of 12, with an average area of 0.000307092 square kilometers. 13—Hexagons will be created at the H3 resolution of 13, with an average area of 0.000043870 square kilometers. 14—Hexagons will be created at the H3 resolution of 14, with an average area of 0.000006267 square kilometers. 15—Hexagons will be created at the H3 resolution of 15, with an average area of 0.000000895 square kilometers.	Long
num_samples (Optional)	The number of sample locations that will be created. This parameter always applies to simple random and cluster sampling. For stratified sampling, this parameter applies when the sample count will be proportional to stratum area or proportional to a population field. For simple random and stratified sampling, the default is 100. For cluster sampling, the default is 10.	Long
num_samples_per_strata (Optional)	For stratified sampling with equal sample count in each stratum, the number of sample locations created within each stratum. The total number of samples will be this value multiplied by the number of strata. The default is 100.	Long
population_field (Optional)	The population field for stratified sampling when the sample count is equal or proportional to a population field.	Field
geometry_type (Optional)	For systematic sampling, specifies whether the sample locations will be tessellated polygons or centroids (points) of the tessellated polygons. POINT—Centroids of the tessellated polygons will be created as sample locations. This is the default. POLYGON—Tessellated polygons will be created as sampling locations.	String
min_distance (Optional)	For simple random and stratified sampling, the smallest allowed distance between sample locations. For simple random sampling, all points will be at least this distance apart. For stratified sampling, points within the same stratum will be at least this distance apart, but points in neighboring strata may be closer than this distance. For large distances, fewer sample locations than were expected may be created to keep the locations sufficiently far apart. In this case, a warning message will be returned.	Linear Unit
spatial_relationship (Optional)	Specifies which polygons from a background tessellation will be included as sampling locations. This parameter applies to cluster sampling and to systematic sampling when the output geometry type is polygon. HAVE_THEIR_CENTER_IN—The centroids of the polygons must be within the study area to be included. This is the default. COMPLETELY_WITHIN—The polygons must be completely within the study area to be included. INTERSECT—The polygons must intersect the study area to be included.	String
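
The following sketch shows one way the Python parameters might be assembled for stratified sampling by strata ID field with counts proportional to a population field. The parameter names and keyword values come from the syntax and table above. The dataset and field names and the helper functions (`stratified_by_id_params`, `check_params`, `create_samples`) are hypothetical. The pre-check only mirrors two restrictions stated in this document and is not a substitute for the tool's own validation. Because `arcpy` requires an ArcGIS Pro Python environment, it is imported only inside the function that runs the tool.

```python
def stratified_by_id_params(study_area, out_features, strata_field,
                            population_field, total_samples):
    """Build keyword arguments for STRAT_ID sampling with PROP_FIELD allocation."""
    return {
        "in_study_area": study_area,
        "out_features": out_features,
        "sampling_method": "STRAT_ID",
        "strata_id_field": strata_field,
        "strata_count_method": "PROP_FIELD",
        "num_samples": total_samples,
        "population_field": population_field,
    }

def check_params(params):
    """Reject combinations that this document describes as unavailable."""
    method = params.get("sampling_method", "RANDOM")
    count_method = params.get("strata_count_method", "EQUAL")
    if method == "STRAT_RAST" and count_method in ("FIELD", "PROP_FIELD"):
        raise ValueError("Population fields cannot be used with STRAT_RAST.")
    if method == "STRAT_ID" and not params.get("strata_id_field"):
        raise ValueError("STRAT_ID requires strata_id_field.")
    return params

def create_samples(params):
    """Run the tool (assumes arcpy is available)."""
    import arcpy
    return arcpy.management.CreateSpatialSamplingLocations(**check_params(params))

# Hypothetical dataset and field names.
params = stratified_by_id_params(
    "land_use", "land_use_samples", "LU_CLASS", "TREE_COUNT", 200)
```

Calling `create_samples(params)` in an ArcGIS Pro session would run the tool with these settings. Optional parameters that are omitted keep their documented defaults.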