How different input data formats are handled in Geostatistical Analyst

Available with Geostatistical Analyst license.

The input for interpolation methods in Geostatistical Analyst usually comes from point features with an associated field to be interpolated. However, various input data formats and options are supported, as described in the following sections.

Vector data

Geostatistical Analyst interpolation methods can accept points and polygons as input data.

  • Point data is read as x,y coordinates and an attribute value at each location. When two or more points have the same x,y coordinates, the attribute values can be treated in different ways. The Geostatistical Wizard prompts you to choose an option. For geoprocessing tools, the option must be set using the coincident points environment variable. See How coincident data is handled for more information.
  • Polygon data is generally used by computing the x,y coordinates of each polygon centroid and reading the data in as points. The exceptions are polygon declustering (see Adjusting for preferential sampling by declustering the data), use in conjunction with some of the kriging methods, and polygons that represent barriers. In these cases, the polygons are not altered.
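
The two behaviors described above can be illustrated with a minimal sketch in plain Python. It is not the software's implementation: the function names, the `how` keywords, and the choice of an area-weighted centroid (shoelace formula) are assumptions made for illustration, and the options the Geostatistical Wizard actually offers for coincident points may differ.

```python
from collections import defaultdict
from statistics import mean


def collapse_coincident(points, how="mean"):
    """Merge points that share identical x,y into one value per location.

    points: iterable of (x, y, value) tuples.
    how: "mean", "min", or "max" (illustrative names, not tool keywords).
    """
    reducers = {"mean": mean, "min": min, "max": max}
    groups = defaultdict(list)
    for x, y, value in points:
        groups[(x, y)].append(value)
    # Dicts keep insertion order, so locations come out in first-seen order.
    return [(x, y, reducers[how](vals)) for (x, y), vals in groups.items()]


def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon ring (shoelace formula).

    ring: list of (x, y) vertices, without repeating the first vertex.
    Assumption: the centroid definition used by the software may differ.
    """
    area2 = cx = cy = 0.0  # area2 accumulates twice the signed area
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3.0 * area2), cy / (3.0 * area2)
```

Here, `collapse_coincident` reduces repeated measurements at one location to a single value, and `polygon_centroid` gives the point that would stand in for a polygon when it is read as point data.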

Missing values

Feature datasets with missing values should be treated with care. If missing values are represented by a code (for example, -99), the code is treated as a real measurement, and any statistical analysis based on that data will be wrong unless one of the following two options is used:

  • Import the data into a geodatabase and recode all the missing values as <Null> values. Null values will be ignored in all the computations done within Geostatistical Analyst.
  • Write a definition query to exclude the missing values from the dataset. Note that a definition query excludes entire rows from the attribute table, so you may need copies of the layer if you want to exclude different rows based on different attributes. For example, suppose two attributes were supposed to be measured at each point, but one measurement is missing at a few points. One layer would have a definition query that excludes missing values of attribute one, and a copy of the layer would have a definition query that excludes missing values of attribute two. For more information on using definition queries, refer to Build an SQL query.
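
As a rough illustration of both options, the following sketch uses plain Python dictionaries in place of an attribute table. The field names `ozone` and `no2`, the sample values, and the helper functions are hypothetical; in practice the recoding would be done in the geodatabase and the filtering with a definition query on each layer.

```python
MISSING_CODE = -99  # example code from the text; your data may use another


def recode_missing(rows, fields, code=MISSING_CODE):
    """Return copies of rows with `code` replaced by None in the given fields."""
    return [
        {k: (None if k in fields and v == code else v) for k, v in row.items()}
        for row in rows
    ]


def exclude_missing(rows, field):
    """Keep only rows where `field` has a value, like a per-attribute query."""
    return [row for row in rows if row[field] is not None]


# Hypothetical table: two attributes measured at each point, some missing.
rows = [
    {"id": 1, "ozone": 0.04, "no2": 12.0},
    {"id": 2, "ozone": -99, "no2": 15.0},
    {"id": 3, "ozone": 0.06, "no2": -99},
]

clean = recode_missing(rows, fields={"ozone", "no2"})

# One filtered "layer" per attribute, because each filter drops whole rows.
ozone_layer = exclude_missing(clean, "ozone")
no2_layer = exclude_missing(clean, "no2")
```

Each filtered list keeps a different subset of rows, which is why a separate copy of the layer is needed for each attribute.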