Exploratory Interpolation (Geostatistical Analyst)

Summary

Generates various interpolation results from input point features and a field. The interpolation results are then compared and ranked using customizable criteria based on cross validation statistics.

Interpolation results can be ranked based on a single criterion (such as highest prediction accuracy or lowest bias), weighted average ranks of multiple criteria, or hierarchical sorting of multiple criteria (in which ties by each of the criteria are broken by subsequent criteria in the hierarchy). Exclusion criteria can also be used to exclude interpolation results from the comparison that do not meet minimal quality standards. The output is a table summarizing the cross validation statistics and ranks for each interpolation result. Optionally, you can output a geostatistical layer of the interpolation result with highest rank to be used in further workflows.

Illustration

Exploratory Interpolation tool illustration
Various interpolation methods are generated, compared, and ranked.

Usage

  • Cross validation is a leave-one-out method for evaluating interpolation results. The method sequentially removes each point in the dataset and uses all remaining points to predict the value of the excluded point. The cross validation prediction is then compared to the true value of the hidden point, and the difference between the two is the cross validation error (the error can be positive or negative). The reasoning behind cross validation is that if the interpolation result is effective at predicting the values of the hidden points, it should also be effective at predicting unknown values at new locations, which is the goal of interpolation. All criteria used by this tool are based on summary statistics of the cross validation results.

    While assessing interpolation results using cross validation summary statistics is a convenient and effective way to compare multiple interpolation results, it does not replace expert knowledge of the data and interactive investigation of the results. Reviewing charts and individual cross validation errors often reveals patterns in the results that are not obvious from the summary statistics. For example, there are often spatial patterns in the cross validation errors where some areas are underestimated and other areas are overestimated; patterns such as this may not be represented by summary statistics.

    Learn more about using cross validation to assess interpolation results
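The leave-one-out procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tool's implementation: the predictor is a plain inverse distance weighting function, and the names idw_predict, loo_cross_validation, and summarize, as well as the sample points, are hypothetical. The error is taken as predicted minus measured, which is consistent with positive mean errors indicating overprediction.

```python
import math


def idw_predict(known, x, y, power=2.0):
    """Predict a value at (x, y) from (xi, yi, value) tuples using inverse distance weighting."""
    numerator = 0.0
    denominator = 0.0
    for xi, yi, vi in known:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return vi  # coincident point: use its value directly
        weight = 1.0 / distance ** power
        numerator += weight * vi
        denominator += weight
    return numerator / denominator


def loo_cross_validation(points, power=2.0):
    """Hide each point in turn, predict it from the rest, and return predicted - measured."""
    errors = []
    for i, (x, y, measured) in enumerate(points):
        remaining = points[:i] + points[i + 1:]
        errors.append(idw_predict(remaining, x, y, power) - measured)
    return errors


def summarize(errors):
    """Summary statistics of the cross validation errors."""
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "max_abs_error": max(abs(e) for e in errors),
    }


# Hypothetical sample data: (x, y, measured value)
points = [(0, 0, 10.0), (1, 0, 12.0), (0, 1, 11.0), (1, 1, 15.0), (2, 2, 20.0)]
errors = loo_cross_validation(points)
stats = summarize(errors)
print(stats)
```

The summary statistics hide where the individual errors occur, which is why reviewing the per-point errors remains worthwhile.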

  • The Comparison method parameter has three options for comparing the cross validation statistics of the interpolation results. Each option has advantages and disadvantages:

    • Single criterion—A single criterion is used to compare and rank results. You can rank results by highest prediction accuracy, lowest bias, lowest worst-case error, highest standard error accuracy, or highest precision. The criterion is provided in the Criterion parameter.
      • Advantages—This option is a simple and common method for comparing interpolation results that are known to be stable and consistent. It is also useful for choosing between results that are all very similar.
      • Disadvantages—Interpolation results frequently perform well by some criteria but not others, for example, by having high prediction accuracy but also high bias. In this case, ranking by a single criterion will assign high ranks to results that are unstable or misleading. When ranking by a single criterion, it is recommended that you use various options of the Exclusion criteria parameter to ensure that unstable or misleading results are removed prior to the comparison.
    • Hierarchical sorting with tolerances—Hierarchical sorting is used to compare and rank results. Multiple criteria are specified in priority order (highest priority first) in the Criteria hierarchy parameter. The interpolation results are ranked by the first criterion, and any ties are broken by the second criterion. Ties in the second criterion are broken by the third criterion, and so on. This process is modeled after Custom Sort and hierarchical sorting in spreadsheet software (sort by A, then by B, then by C, and so on). However, cross validation statistics are continuous values and generally do not have exact ties, so tolerances (percent or absolute) can be specified to create ties in each of the criteria.
      • Advantages—This option uses multiple criteria, and it takes into account the relative differences of the cross validation statistics. For example, if one interpolation result is much better than the rest by the highest priority criterion, that interpolation result will receive the highest rank regardless of the subsequent criteria in the hierarchy.
      • Disadvantages—The effectiveness of hierarchical sorting depends on the provided tolerance values. If tolerances are too small, some criteria may not be used because there are no ties to break. If tolerances are too large, there may be many ties in the rankings due to many results being within the tolerances of each other.
    • Weighted average rank—The weighted average rank of multiple criteria is used to compare and rank results. Multiple criteria and associated weights are specified in the Weighted criteria parameter. The interpolation results are ranked independently by each of the criteria, and a weighted average of the ranks is used to determine the final ranks. Criteria with larger weights will have more influence on the final ranks, so they can be used to indicate preference for certain criteria over others.
      • Advantages—This option uses multiple criteria, allows for preferences of some criteria over others, and always uses all criteria in the comparison.
      • Disadvantages—The relative differences in the values of the cross validation statistics are ignored. For example, all root mean square error values may be within a very small tolerance of each other (indicating that all results have approximately equal prediction accuracy), but they will still be ranked 1 through N by prediction accuracy (for N interpolation results). However, the mean error values may vary by large amounts between the results (indicating the results have large differences in their biases), but they will also be ranked 1 through N by the bias criterion. The weighted average uses only the ranks of the criteria, so the relative differences in the cross validation statistics are ignored in the ranking.
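The weighted average rank option and its main disadvantage can be illustrated with a short sketch. The function names and scores below are hypothetical, and dividing by the sum of the weights is an assumption about how the average is normalized; the sketch is not the tool's internal code. Scores follow the convention that smaller is better.

```python
def competition_ranks(scores):
    """Rank scores ascending (smaller is better); tied scores share the best rank."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 for s in scores]


def weighted_average_rank(score_table, weights):
    """Combine per-criterion ranks into weighted average ranks and final ranks.

    score_table maps a criterion name to a list of scores (one per result).
    weights maps the same criterion names to their weights.
    """
    n_results = len(next(iter(score_table.values())))
    weighted_sums = [0.0] * n_results
    for criterion, scores in score_table.items():
        for i, rank in enumerate(competition_ranks(scores)):
            weighted_sums[i] += weights[criterion] * rank
    total_weight = sum(weights.values())  # assumption: normalize by the sum of weights
    averages = [s / total_weight for s in weighted_sums]
    return averages, competition_ranks(averages)


# Three hypothetical results: nearly identical accuracy, very different bias
scores = {
    "accuracy": [10.00, 10.01, 10.02],  # root mean square error
    "bias": [5.0, 0.1, 2.0],            # absolute value of mean error
}
weights = {"accuracy": 1, "bias": 1}

averages, final_ranks = weighted_average_rank(scores, weights)
print(averages)     # [2.0, 1.5, 2.5]
print(final_ranks)  # [2, 1, 3]
```

The accuracy scores differ by a negligible amount, yet they contribute ranks 1 through 3 just as the widely varying bias scores do, because only the ranks enter the average.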

  • The output is a table that summarizes the cross validation statistics, descriptions of the interpolation results, and rankings; it can be included in a presentation or report. Cross validation statistics will only be included in the table if they apply to at least one interpolation result. For example, if only inverse distance weighting and radial basis functions are used, the output table will not contain a field of average standard error values because these methods do not calculate standard errors. If a statistic applies to some interpolation results but not others, the value will be null for results to which the statistic does not apply. Additionally, if Empirical Bayesian Kriging is chosen for the Interpolation methods parameter, several cross validation statistics will be included in the table that are not used by any criteria in this tool; these are included for informational purposes and will have null values for all other interpolation methods. If weighted average rank is used, the ranks for all criteria and their weighted average will also be included in the table.

    Optionally, you can use the Output geostatistical layer with highest rank parameter to create a geostatistical layer of the interpolation result with the highest rank. This allows you to map the best interpolation result and use it in other workflows.

  • While the tool is running, geoprocessing messages and progress bar messages display the current interpolation result being calculated. After all results are calculated and compared, the ranks are printed as geoprocessing messages. The ranks are also available in the output cross validation table.

  • The Compare Geostatistical Layers tool performs the same cross validation comparisons as this tool, but it performs the comparisons on previously created interpolation results (geostatistical layers).

  • The following table lists the available criteria, the cross validation statistics that measure them, and the formulas used to assign a score to each interpolation result (smaller scores are better). Ranks for the criteria are determined by sorting the scores of each interpolation result.

    Note:

    For three of the criteria, the score is equal to the cross validation statistic.

    Criteria | Cross validation statistic | Score formula
    Highest prediction accuracy | Root mean square error | Results are ranked by smallest root mean square error. Score = RootMeanSquareError
    Lowest bias | Mean error | Results are ranked by mean error closest to zero. Score = AbsoluteValue( MeanError )
    Lowest worst-case error | Maximum absolute error | Results are ranked by smallest maximum absolute error. Score = MaximumAbsoluteError
    Highest standard error accuracy | Root mean square standardized error | Results are ranked by root mean square standardized error closest to one. Score = AbsoluteValue( RMSStdError - 1 )
    Highest precision | Average standard error | Results are ranked by smallest average standard error. Score = AverageStandardError
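The score formulas in the table above translate directly into code. The following is a minimal sketch; the function name criterion_scores and the dictionary keys are hypothetical, and the handling of missing standard error statistics (skipping the corresponding criteria) is an assumption made for illustration.

```python
def criterion_scores(stats):
    """Convert cross validation statistics into criterion scores (smaller is better)."""
    scores = {
        "accuracy": stats["rmse"],                # Score = RootMeanSquareError
        "bias": abs(stats["mean_error"]),         # Score = AbsoluteValue( MeanError )
        "worst_case": stats["max_abs_error"],     # Score = MaximumAbsoluteError
    }
    # The two standard error criteria only apply to methods that estimate standard errors.
    if stats.get("rmse_standardized") is not None:
        # Score = AbsoluteValue( RMSStdError - 1 )
        scores["standard_error_accuracy"] = abs(stats["rmse_standardized"] - 1)
    if stats.get("average_standard_error") is not None:
        scores["precision"] = stats["average_standard_error"]  # Score = AverageStandardError
    return scores


# Hypothetical statistics for a kriging result and for a result without standard errors
kriging = criterion_scores({
    "rmse": 200.0,
    "mean_error": -4.0,
    "max_abs_error": 650.0,
    "rmse_standardized": 0.2,
    "average_standard_error": 180.0,
})
no_standard_errors = criterion_scores({
    "rmse": 210.0,
    "mean_error": 6.0,
    "max_abs_error": 700.0,
})
print(kriging)
print(no_standard_errors)
```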

  • If there are ties in any criteria, all tying results receive the same rank, equal to the highest of the ranks shared between them (where a higher rank means a smaller rank number). For example, ordered from best to worst, the root mean square error values (12, 14, 14, 15, 16, 16, 18) will receive ranks (1, 2, 2, 4, 5, 5, 7) by the prediction accuracy criterion. Ranks 3 and 6 are skipped due to the tying values.

    Ties can occur at various stages of the comparisons. Ties are most common when using hierarchical sorting because all results within the tolerance are considered ties to each other, and all results outside the tolerance are also considered ties to each other. Ties are also common in weighted average rank when the interpolation results have varying ranks by different criteria, which can result in equal weighted averages of the ranks. While uncommon, ties can also occur in single criterion comparisons (for example, if all points have a constant value). Ties by a single criterion will also affect weighted average rank if that criterion is used in the weighted average.
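The tie rule described above, in which tying results share the best rank and the following ranks are skipped, can be sketched as follows. The function name competition_ranks is hypothetical; the values are the root mean square error example from the text.

```python
def competition_ranks(scores):
    """Rank scores ascending (smaller is better).

    Tied scores all receive the best rank they share, and the ranks
    that would otherwise follow are skipped.
    """
    ordered = sorted(scores)
    # list.index returns the first position of a value, so ties share that position
    return [ordered.index(s) + 1 for s in scores]


rmse_values = [12, 14, 14, 15, 16, 16, 18]
ranks = competition_ranks(rmse_values)
print(ranks)  # [1, 2, 2, 4, 5, 5, 7]
```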

  • In hierarchical sorting, provide the tolerances relative to the score of the criterion rather than the cross validation statistic. For the criteria where the score is equal to the statistic (highest prediction accuracy, lowest worst-case error, and highest precision), appropriate tolerance values are usually clear. For example, if the lowest root mean square error value of the interpolation results is 200, then a 10 percent tolerance will include all results with root mean square error values less than or equal to 220: 200 + (10/100) x 200 = 220. Similarly, an absolute tolerance of 15 will include all results with root mean square error values less than or equal to 215: 200 + 15 = 215.

    However, for the criteria where the score is not equal to the value of the statistic (lowest bias and highest standard error accuracy), appropriate tolerance values are less clear. For the mean error statistic, bias is scored by the absolute value of the mean error. This means, for example, that mean error values -4 and 6 have a relative difference of 50 percent because they are 50 percent different in absolute value: ABS(-4) + (50/100) x ABS(-4) = ABS(6). Similarly, their absolute difference is 2: ABS(-4) + 2 = ABS(6).

    For the root mean square standardized error statistic, standard error accuracy is scored by the absolute difference between the root mean square standardized error value and the ideal value of 1. This means, for example, that root mean square standardized error values 0.2 and 2.4 have a 75 percent relative difference. To see why, note that the two values are 0.8 and 1.4 away from the ideal value of 1, respectively, so the latter is 1.75 times farther away (a 75 percent increase) than the former: ABS(0.2 - 1) + (75/100) x ABS(0.2 - 1) = ABS(2.4 - 1). Similarly, their absolute difference is 0.6: ABS(0.2 - 1) + 0.6 = ABS(2.4 - 1).
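The tolerance arithmetic above can be checked with a short sketch. The function names tolerance_limit and tied_with_best are hypothetical, and treating a result as tied when its score is less than or equal to the limit follows the worked examples in the text rather than any documented internal logic. Tolerances are applied to scores, not to the raw statistics.

```python
def tolerance_limit(best_score, tolerance, kind="percent"):
    """Largest score that still counts as a tie with the best score."""
    if kind == "percent":
        return best_score + best_score * tolerance / 100.0
    if kind == "absolute":
        return best_score + tolerance
    raise ValueError("kind must be 'percent' or 'absolute'")


def tied_with_best(scores, tolerance, kind="percent"):
    """Flag the results whose scores fall within the tolerance of the best (smallest) score."""
    limit = tolerance_limit(min(scores), tolerance, kind)
    return [score <= limit for score in scores]


# Root mean square error: the score equals the statistic
print(tolerance_limit(200, 10, "percent"))   # 220.0
print(tolerance_limit(200, 15, "absolute"))  # 215

# Mean error: the score is the absolute value, so -4 and 6 differ by 50 percent
print(tolerance_limit(abs(-4), 50, "percent"))  # 6.0

# Root mean square standardized error: the score is the distance from 1
limit = tolerance_limit(abs(0.2 - 1), 75, "percent")  # approximately ABS(2.4 - 1)

# Hypothetical root mean square error values with a 10 percent tolerance
flags = tied_with_best([200, 212, 220.5, 260], 10, "percent")
print(flags)  # [True, True, False, False]
```

Scores that fall exactly on the limit may be affected by floating-point rounding, so this sketch is best used with values that are clearly inside or outside the tolerance.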

  • Various criteria require all interpolation results to support the standard error output type. With the default Interpolation methods parameter options, all options are available for all other parameters. However, if the Inverse Distance Weighting, Radial Basis Functions, or Global Polynomial Interpolation option is specified, various options of several parameters will become unavailable because these methods cannot calculate standard errors of predictions. The options that are unavailable are related to standard error accuracy, precision, the root mean square standardized error statistic, or the average standard error statistic.

    Learn more about which interpolation methods can calculate standard errors of predictions

  • The Minimum percent error reduction option of the Exclusion criteria parameter is particularly useful when you do not know the values or range of the points being interpolated (for example, in an automated environment). This option excludes interpolation results that are not sufficiently more accurate than a baseline nonspatial model that predicts the global average value at all locations in the map. This relative accuracy is measured by comparing the root mean square error value to the standard deviation of the values of the points being interpolated, and the root mean square error must be at least the specified percent less than the standard deviation to be included in the comparison. For example, a value of 10 means that the root mean square error must be at least 10 percent lower than the standard deviation to be included in the comparison and ranking.

    Different disciplines have different standards for acceptable error reductions in interpolation results. In physical sciences with measurements that are densely sampled, errors often reduce by more than 90 percent. In social sciences, however, error reductions of only 10 to 20 percent are often significant to researchers.
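The Minimum percent error reduction check can be sketched as a comparison between the root mean square error and the standard deviation of the input values. The function names and sample values below are hypothetical, and the use of the population standard deviation is an assumption; the text does not state whether the tool uses the population or sample formula.

```python
import statistics


def percent_error_reduction(rmse, values):
    """Percent by which the RMSE is lower than the standard deviation of the values."""
    # Assumption: population standard deviation as the baseline error of a
    # nonspatial model that predicts the global average everywhere.
    baseline = statistics.pstdev(values)
    return 100.0 * (1.0 - rmse / baseline)


def passes_minimum_reduction(rmse, values, minimum_percent):
    """True if the result would be kept in the comparison under this sketch."""
    return percent_error_reduction(rmse, values) >= minimum_percent


# Hypothetical measured values with a standard deviation of 2.0
values = [2, 4, 4, 4, 5, 5, 7, 9]

kept = passes_minimum_reduction(1.7, values, 10)      # about 15 percent reduction
excluded = passes_minimum_reduction(1.9, values, 10)  # about 5 percent reduction
print(kept, excluded)
```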

  • Each Interpolation methods parameter option generates between 1 and 5 interpolation results. By default, 11 results are generated. If all options are chosen, 20 results will be generated. The following table shows the 20 possible values of the Description field of the Output cross validation table value along with details about the result. To further investigate any of the results, the third column provides steps to create a geostatistical layer of the result using the Geostatistical Wizard.

    Note:

    In the instructions for creating the interpolation result, it is assumed that you have opened the Geostatistical Wizard, chosen the interpolation method in the pane on the left, and provided the points and field in the pane on the right. For simple, ordinary, and universal kriging, the kriging type is specified on the second page of the wizard; on the first page, use Kriging/CoKriging for all three types. If the instructions start on a particular page of the wizard, click Next to get to that page without changing any parameters. At the end of the instructions, click Finish and click OK to add the interpolation result to the map.

    Field value | Description | Creation

    Simple Kriging - Default

    A simple kriging model with default parameters. By default, simple kriging uses a transformation.

    No changes needed.

    Simple Kriging - Optimized

    A simple kriging model with optimized parameters.

    On the semivariogram page (page 4), click the Optimize model button.

    Simple Kriging - Trend

    A simple kriging model with trend removal and no transformation.

    On the second page, change Transformation type to None, and change Order of trend removal to First.

    Simple Kriging - Trend and transformation

    A simple kriging model with trend removal and a transformation.

    On the second page, change Order of trend removal to First.

    Ordinary Kriging - Default

    An ordinary kriging model with default parameters.

    No changes needed.

    Ordinary Kriging - Optimized

    An ordinary kriging model with optimized parameters.

    On the semivariogram page (page 4), click the Optimize model button.

    Universal Kriging - Default

    A universal kriging model with first order trend removal and default parameters.

    On the second page, change Order of trend removal to First.

    Universal Kriging - Optimized

    A universal kriging model with first order trend removal and optimized parameters.

    On the second page, change Order of trend removal to First. On the semivariogram page (page 4), click the Optimize model button.

    Empirical Bayesian Kriging - Default

    An empirical Bayesian kriging model with default parameters.

    No changes needed.

    Empirical Bayesian Kriging - Advanced

    An advanced empirical Bayesian kriging model using larger subsets, detrending, and more overlapping and simulations.

    On the second page, change the following parameters to the values shown:

    • Subset Size—200
    • Overlap Factor—3
    • Number of Simulations—200
    • Transformation—Empirical
    • Semivariogram Type—K-Bessel Detrended

    Kernel (Local Polynomial) Interpolation

    A kernel (local polynomial) interpolation model with default parameters.

    No changes needed. Both Kernel Interpolation and Local Polynomial Interpolation are available on the first page. This tool uses kernel interpolation, but you should expect similar results from local polynomial interpolation because the methods are similar.

    Inverse Distance Weighting - Default

    An inverse distance weighting model with a power value equal to 2 (default).

    No changes needed.

    Inverse Distance Weighting - Optimized

    An inverse distance weighting model with an optimized power value.

    On the second page, click the Optimize button that appears next to the Power parameter.

    Radial Basis Functions - Completely regularized spline

    A radial basis functions model using a completely regularized spline kernel function.

    No changes needed. Completely regularized spline is the default kernel function for radial basis functions.

    Radial Basis Functions - Spline with tension

    A radial basis functions model using a spline with tension kernel function.

    On the second page, change the Kernel Function parameter to Spline with tension.

    Radial Basis Functions - Multiquadric

    A radial basis functions model using a multiquadric kernel function.

    On the second page, change the Kernel Function parameter to Multiquadric.

    Radial Basis Functions - Inverse multiquadric

    A radial basis functions model using an inverse multiquadric kernel function.

    On the second page, change the Kernel Function parameter to Inverse multiquadric.

    Radial Basis Functions - Thin plate spline

    A radial basis functions model using a thin plate spline kernel function.

    On the second page, change the Kernel Function parameter to Thin plate spline.

    Global Polynomial Interpolation - Second order

    A global polynomial interpolation model with second order (quadratic) trend.

    On the second page, change the Order of polynomial parameter to 2.

    Global Polynomial Interpolation - Third order

    A global polynomial interpolation model with third order (cubic) trend.

    On the second page, change the Order of polynomial parameter to 3.

Parameters

Label | Explanation | Data Type
Input features

The input points representing locations of points to be interpolated.

Feature Layer
Value field

The field containing the values to be interpolated.

Field
Output cross validation table

The output table containing cross validation statistics and ranks for each interpolation result. The final ranks of the interpolation results are stored in the RANK field.

Table
Output geostatistical layer with highest rank
(Optional)

The output geostatistical layer of the interpolation result with highest rank. This interpolation result will have the value 1 in the RANK field of the output cross validation table. If there are ties for the interpolation result with highest rank or all results are excluded by exclusion criteria, the layer will not be created even if a value is provided. Warning messages will be returned by the tool if this occurs.

Geostatistical Layer
Interpolation methods
(Optional)

Specifies the interpolation methods that will be performed on the input features and value field. For each method specified, 1 to 5 interpolation results will be generated. By default, results will be generated for all methods except inverse distance weighting, radial basis functions, and global polynomial interpolation (because these methods cannot calculate standard errors of predictions), for a total of 11 interpolation results. If all options are specified, 20 interpolation results will be generated.

  • Simple Kriging—Four simple kriging results will be generated: default, optimized, trend removal, and transformation with trend removal.
  • Ordinary Kriging—Two ordinary kriging results will be generated: default and optimized.
  • Universal Kriging—Two universal kriging results will be generated: default and optimized.
  • Empirical Bayesian Kriging—Two empirical Bayesian kriging results will be generated: default and advanced.
  • Kernel (Local Polynomial) Interpolation—One default kernel (local polynomial) interpolation result will be generated.
  • Inverse Distance Weighting—Two inverse distance weighting results will be generated: default and optimized.
  • Radial Basis Functions—Five radial basis functions results will be generated, one for each of the five kernel functions.
  • Global Polynomial Interpolation—Two global polynomial interpolation results will be generated: quadratic (second order) and cubic (third order) trend.
String
Comparison method
(Optional)

Specifies the method that will be used to compare and rank the interpolation results.

  • Single criterion—A single criterion will be used to compare and rank results, such as highest prediction accuracy or lowest bias. The criterion from the Criterion parameter is used.
  • Hierarchical sorting with tolerances—Hierarchical sorting will be used to compare results. Multiple criteria are specified in priority order (highest priority first) in the Criteria hierarchy parameter. The interpolation results are ranked by the first criterion, and any ties are broken by the second criterion. Ties in the second criterion are broken by the third criterion, and so on. Cross validation statistics are continuous values and generally do not have exact ties, so tolerances (percent or absolute) can be specified to create ties in each of the criteria.
  • Weighted average rank—The weighted average rank of multiple criteria will be used to compare results. Multiple criteria and associated weights are specified using the Weighted criteria parameter. The interpolation results are ranked independently by each of the criteria, and a weighted average of the ranks is used to determine the final ranks. Criteria with larger weights will have more influence on the final ranks, so weights can be used to indicate preference for certain criteria over others.
String
Criterion
(Optional)

Specifies the criterion that will be used to rank the interpolation results.

  • Highest prediction accuracy—Results will be ranked by lowest root mean square error. This option measures how closely the cross validation predictions match the true values, on average. This is the default.
  • Lowest bias—Results will be ranked by mean error closest to zero. This option measures how much the cross validation predictions overpredict or underpredict the true values, on average. Interpolation results with positive mean errors systematically overpredict the true values (positive bias), and results with negative mean errors systematically underpredict the true values (negative bias).
  • Lowest worst-case error—Results will be ranked by lowest maximum absolute error. This option measures only the single least accurate cross validation prediction (positive or negative). This is useful when you are most concerned about worst-case scenarios rather than the accuracy in typical conditions.
  • Highest standard error accuracy—Results will be ranked by root mean square standardized error closest to one. This option measures how closely the variability of the cross validation predictions matches the estimated standard errors. This is useful if you intend to create confidence intervals or margins of error for the predictions.
  • Highest precision—Results will be ranked by lowest average standard error. When creating confidence intervals or margins of error for the predicted values, results with higher precision will have narrower intervals around the predictions. This option does not measure whether the standard errors are estimated accurately, only that the standard errors are small. When using this option, it is recommended that you include minimum and maximum root mean square standardized error values as exclusion criteria to ensure that the standard errors are both accurate and precise.
String
Criteria hierarchy
(Optional)

The hierarchy of criteria that will be used for hierarchical sorting with tolerances. Provide multiple criteria in priority order with the first being most important. The interpolation results are ranked by the first criterion, and any ties are broken by the second criterion. Ties in the second criterion are broken by the third criterion, and so on. Cross validation statistics are continuous values and generally do not have exact ties, so tolerances are used to induce ties in the criteria. For each row, specify a criterion in the first column, a tolerance type (percent or absolute) in the second column, and a tolerance value in the third column. If no tolerance value is provided, no tolerance will be used; this is most useful for the final row so that there will be no ties for the interpolation result with highest rank.

For each row (level of the hierarchy), the following criteria are available:

  • Root mean square error (Accuracy)—Results will be ranked by highest accuracy.
  • Mean error (Bias)—Results will be ranked by lowest bias.
  • Maximum absolute error (Worst-case error)—Results will be ranked by lowest worst-case error.
  • Standardized RMSE (Standard error accuracy)—Results will be ranked by highest standard error accuracy.
  • Average standard error (Precision)—Results will be ranked by highest precision.

For example, you can specify a Root mean square error (Accuracy) value with a 5 percent tolerance in the first row and a Mean error (Bias) value with no tolerance in the second row. These options will first rank the interpolation results by lowest root mean square error (highest prediction accuracy), and all interpolation results whose root mean square error values are within 5 percent of the most accurate result will be considered ties by prediction accuracy. Among the tying results, the result with a mean error closest to zero (lowest bias) will receive the highest rank.

Value Table
Weighted criteria
(Optional)

The multiple criteria with weights that will be used to rank interpolation results. For each row, provide a criterion and a weight. The interpolation results will be ranked independently by each of the criteria, and a weighted average of the ranks will be used to determine the final ranks of the interpolation results.

  • Highest prediction accuracy—Results will be ranked by lowest root mean square error.
  • Lowest bias—Results will be ranked by mean error closest to zero.
  • Lowest worst-case error—Results will be ranked by lowest maximum absolute error.
  • Highest standard error accuracy—Results will be ranked by root mean square standardized error closest to one.
  • Highest precision—Results will be ranked by lowest average standard error.

Value Table
Exclusion criteria
(Optional)

The criteria and associated values that will be used to exclude interpolation results from the comparison. Excluded results will not receive ranks and will have the value No in the Included field of the output cross validation table.

  • Maximum root mean square error—Results will be excluded if the root mean square error exceeds the specified value. The value cannot be negative. This option measures prediction accuracy.
  • Maximum absolute error—Results will be excluded if the maximum absolute error exceeds the specified value. The value cannot be negative. This option measures the worst-case error.
  • Maximum root mean square standardized error—Results will be excluded if the root mean square standardized error exceeds the specified value. The value must be greater than or equal to 1. This option measures standard error accuracy.
  • Minimum root mean square standardized error—Results will be excluded if the root mean square standardized error does not exceed the specified value. The value must be between 0 and 1. This option measures standard error accuracy.
  • Maximum mean error—Results will be excluded if the mean error exceeds the specified value. The value cannot be negative. This option measures bias.
  • Minimum mean error—Results will be excluded if the mean error does not exceed the specified value. The value cannot be positive. This option measures bias.
  • Maximum average standard error—Results will be excluded if the average standard error exceeds the specified value. The value cannot be negative. This option measures precision.
  • Minimum percent error reduction—Results will be excluded if the interpolation result is not sufficiently more accurate than a baseline nonspatial model that predicts the global average value at all locations in the map. This relative accuracy is measured by comparing the root mean square error value to the standard deviation of the values of the points being interpolated, and the root mean square error must be at least the specified percent less than the standard deviation to be included in the comparison. For example, a value of 10 means that the root mean square error must be at least 10 percent lower than the standard deviation to be included in the comparison and ranking. The value must be between 0 and 100. This option measures prediction accuracy.

Value Table

arcpy.ga.ExploratoryInterpolation(in_features, value_field, out_cv_table, {out_geostat_layer}, {interp_methods}, {comparison_method}, {criterion}, {criteria_hierarchy}, {weighted_criteria}, {exclusion_criteria})
Name | Explanation | Data Type
in_features

The input points representing locations of points to be interpolated.

Feature Layer
value_field

The field containing the values to be interpolated.

Field
out_cv_table

The output table containing cross validation statistics and ranks for each interpolation result. The final ranks of the interpolation results are stored in the RANK field.

Table
out_geostat_layer
(Optional)

The output geostatistical layer of the interpolation result with highest rank. This interpolation result will have the value 1 in the RANK field of the output cross validation table. If there are ties for the interpolation result with highest rank or all results are excluded by exclusion criteria, the layer will not be created even if a value is provided. Warning messages will be returned by the tool if this occurs.

Geostatistical Layer
interp_methods
[interp_methods,...]
(Optional)

Specifies the interpolation methods that will be performed on the input features and value field. For each method specified, 1 to 5 interpolation results will be generated. By default, results will be generated for all methods except inverse distance weighting, radial basis functions, and global polynomial interpolation (because these methods cannot calculate standard errors of predictions), for a total of 11 interpolation results. If all options are specified, 20 interpolation results will be generated.

  • SIMPLE_KRIGING—Four simple kriging results will be generated: default, optimized, trend removal, and transformation with trend removal.
  • ORDINARY_KRIGING—Two ordinary kriging results will be generated: default and optimized.
  • UNIVERSAL_KRIGING—Two universal kriging results will be generated: default and optimized.
  • EBK—Two empirical Bayesian kriging results will be generated: default and advanced.
  • KERNEL_INTERPOLATION—One default kernel (local polynomial) interpolation result will be generated.
  • IDW—Two inverse distance weighting results will be generated: default and optimized.
  • RBF—Five radial basis functions results will be generated, one for each of the five kernel functions.
  • GPI—Two global polynomial interpolation results will be generated: linear (first order) and quadratic (second order) trend.
String
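
The result counts listed above can be checked with a small sketch. The dictionary and the function name are hypothetical helpers written for illustration only; they are not part of arcpy, and the counts are copied from the option descriptions above.

```python
# Number of interpolation results generated per method, as listed above.
RESULTS_PER_METHOD = {
    "SIMPLE_KRIGING": 4,
    "ORDINARY_KRIGING": 2,
    "UNIVERSAL_KRIGING": 2,
    "EBK": 2,
    "KERNEL_INTERPOLATION": 1,
    "IDW": 2,
    "RBF": 5,
    "GPI": 2,
}

# Methods left out by default because they cannot create standard errors.
NO_STANDARD_ERRORS = {"IDW", "RBF", "GPI"}


def count_results(methods):
    """Return how many interpolation results a list of methods would produce."""
    return sum(RESULTS_PER_METHOD[m] for m in methods)


default_methods = [m for m in RESULTS_PER_METHOD if m not in NO_STANDARD_ERRORS]
print(count_results(default_methods))           # 11
print(count_results(list(RESULTS_PER_METHOD)))  # 20
```

A quick count like this can help estimate how many rows to expect in the output cross validation table before running the tool.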
comparison_method
(Optional)

Specifies the method that will be used to compare and rank the interpolation results.

  • SINGLE—A single cross validation statistic will be used to compare and rank results, such as highest prediction accuracy or lowest bias. The criterion from the criterion parameter is used.
  • SORTING—Hierarchical sorting will be used to compare results. Multiple criteria are specified in priority order (highest priority first) in the criteria_hierarchy parameter. The interpolation results are ranked by the first criterion, and any ties are broken by the second criterion. Ties in the second criterion are broken by the third criterion, and so on. Cross validation statistics are continuous values and generally do not have exact ties, so tolerances (percent or absolute) can be specified to create ties in each of the criteria.
  • AVERAGE_RANK—The weighted average rank of multiple criteria will be used to compare results. Multiple criteria and associated weights are specified in the weighted_criteria parameter. The interpolation results are ranked independently by each of the criteria, and a weighted average of the ranks is used to determine the final ranks. Criteria with larger weights will have more influence on the final ranks, so weights can be used to indicate preference for certain criteria over others.
String
criterion
(Optional)

Specifies the criterion that will be used to rank the interpolation results.

  • ACCURACY—Results will be ranked by lowest root mean square error. This option measures how closely the cross validation predictions match the true values, on average. This is the default.
  • BIAS—Results will be ranked by mean error closest to zero. This option measures how much the cross validation predictions overpredict or underpredict the true values, on average. Interpolation results with positive mean errors systematically overpredict the true values (positive bias), and results with negative mean errors systematically underpredict the true values (negative bias).
  • WORST_CASE—Results will be ranked by lowest maximum absolute error. This option measures only the single least accurate cross validation prediction (positive or negative). This is useful when you are most concerned about worst-case scenarios rather than the accuracy in typical conditions.
  • STANDARD_ERROR—Results will be ranked by root mean square standardized error closest to one. This option measures how closely the variability of the cross validation predictions matches the estimated standard errors. This is useful if you intend to create confidence intervals or margins of error for the predictions.
  • PRECISION—Results will be ranked by lowest average standard error. When creating confidence intervals or margins of error for the predicted values, results with higher precision will have narrower intervals around the predictions. It does not measure whether the standard errors are estimated accurately, only that the standard errors are small. When using this option, it is recommended that you include minimum and maximum root mean square standardized error values as exclusion criteria to ensure that the standard errors are both accurate and precise.
String
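
To make the five criteria concrete, the following is a minimal sketch of how the underlying statistics could be computed from cross validation results. It uses common textbook definitions and is not the tool's internal code. The function name, the sample numbers, and the choice of a plain arithmetic mean for the average standard error are all assumptions; the tool may average the standard errors differently.

```python
import math


def cv_summary(predicted, measured, std_errors):
    """Summarize cross validation results (hypothetical helper, not arcpy).

    Errors are taken as predicted minus measured, so a positive mean
    error indicates overprediction, as described for the BIAS option.
    """
    errors = [p - m for p, m in zip(predicted, measured)]
    n = len(errors)
    return {
        # ACCURACY: root mean square error (lower is better)
        "ACCURACY": math.sqrt(sum(e * e for e in errors) / n),
        # BIAS: mean error (closer to zero is better)
        "BIAS": sum(errors) / n,
        # WORST_CASE: maximum absolute error (lower is better)
        "WORST_CASE": max(abs(e) for e in errors),
        # STANDARD_ERROR: root mean square standardized error (closer to one is better)
        "STANDARD_ERROR": math.sqrt(
            sum((e / s) ** 2 for e, s in zip(errors, std_errors)) / n
        ),
        # PRECISION: average standard error (lower is better);
        # assumption: plain arithmetic mean of the standard errors
        "PRECISION": sum(std_errors) / n,
    }


# Made-up illustrative values for four hidden points
summary = cv_summary(
    predicted=[10, 12, 9, 15],
    measured=[11, 10, 9, 13],
    std_errors=[2, 2, 1, 2],
)
print(summary)
```

With these made-up inputs, the errors are -1, 2, 0, and 2, giving a root mean square error of 1.5 and a mean error of 0.75 (a slight positive bias).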
criteria_hierarchy
[[criteria1, tol_type1, tol_val1], [criteria2, tol_type2, tol_val2],...]
(Optional)

The hierarchy of criteria that will be used for hierarchical sorting with tolerances. Provide multiple criteria in priority order with the first being most important. The interpolation results are ranked by the first criterion, and any ties are broken by the second criterion. Ties in the second criterion are broken by the third criterion, and so on. Cross validation statistics are continuous values and generally do not have exact ties, so tolerances are used to induce ties in the criteria. For each row, specify a criterion in the first column, a tolerance type (percent or absolute) in the second column, and a tolerance value in the third column. If no tolerance value is provided, no tolerance will be used; this is most useful for the final row so that there will be no ties for the interpolation result with highest rank.

For each row (level of the hierarchy), the following criteria are available:

  • ACCURACY—Results will be ranked by highest accuracy.
  • BIAS—Results will be ranked by lowest bias.
  • WORST_CASE—Results will be ranked by lowest worst-case error.
  • STANDARD_ERROR—Results will be ranked by highest standard error accuracy.
  • PRECISION—Results will be ranked by highest precision.

For example, you can specify an ACCURACY value with a 5 percent tolerance in the first row and a BIAS value with no tolerance in the second row. These options will first rank the interpolation results by lowest root mean square error (highest prediction accuracy), and all interpolation results whose root mean square error values are within 5 percent of the most accurate result will be considered ties by prediction accuracy. Among the tying results, the result with a mean error closest to zero (lowest bias) will receive the highest rank.

Value Table
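
The tie-breaking logic described above can be sketched as follows. This is an illustration of the idea only, not the tool's implementation. It finds only the top-ranked result, and it assumes every criterion has already been converted to a score where lower is better (for example, root mean square error for ACCURACY and the absolute value of the mean error for BIAS). The function name, the sample scores, and the "ABSOLUTE" keyword for the tolerance type are assumptions.

```python
def pick_top_result(scores, hierarchy):
    """Return the names still tied for first place after each level.

    scores: {result_name: {criterion: score}}, lower score is better.
    hierarchy: list of (criterion, tol_type, tol_val) in priority order;
               tol_val of None means no tolerance at that level.
    """
    candidates = list(scores)
    for criterion, tol_type, tol_val in hierarchy:
        best = min(scores[name][criterion] for name in candidates)
        if tol_val is None:
            cutoff = best                          # only exact ties survive
        elif tol_type == "PERCENT":
            cutoff = best * (1 + tol_val / 100)    # within tol_val percent of best
        else:                                      # assumed "ABSOLUTE" tolerance
            cutoff = best + tol_val
        candidates = [n for n in candidates if scores[n][criterion] <= cutoff]
        if len(candidates) == 1:
            break
    return candidates


# Made-up scores: RMSE for ACCURACY, absolute mean error for BIAS
scores = {
    "A": {"ACCURACY": 1.00, "BIAS": 0.30},
    "B": {"ACCURACY": 1.04, "BIAS": 0.05},
    "C": {"ACCURACY": 1.20, "BIAS": 0.01},
}
hierarchy = [("ACCURACY", "PERCENT", 5), ("BIAS", "PERCENT", None)]
print(pick_top_result(scores, hierarchy))  # ['B']
```

Here A and B tie on accuracy because B is within 5 percent of A, and B wins on bias. Result C has the lowest bias overall but is never considered because it falls outside the accuracy tolerance.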
weighted_criteria
[[criteria1, weight1], [criteria2, weight2],...]
(Optional)

The multiple criteria with weights that will be used to rank interpolation results. For each row, provide a criterion and a weight. The interpolation results will be ranked independently by each of the criteria, and a weighted average of the ranks will be used to determine the final ranks of the interpolation results.

  • ACCURACY—Results will be ranked by lowest root mean square error.
  • BIAS—Results will be ranked by mean error closest to zero.
  • WORST_CASE—Results will be ranked by lowest maximum absolute error.
  • STANDARD_ERROR—Results will be ranked by root mean square standardized error closest to one.
  • PRECISION—Results will be ranked by lowest average standard error.

Value Table
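
A weighted average of ranks can be sketched in a few lines. As before, this is a simplified illustration rather than the tool's code. It assumes scores where lower is better and does not handle tied scores; the function name and sample values are hypothetical.

```python
def weighted_average_ranks(scores, weights):
    """Rank results independently per criterion, then average the ranks.

    scores: {result_name: {criterion: score}}, lower score is better.
    weights: {criterion: weight}.
    Returns (average_rank_by_name, final_rank_by_name).
    """
    names = list(scores)
    rank_by_criterion = {}
    for criterion in weights:
        ordered = sorted(names, key=lambda n: scores[n][criterion])
        rank_by_criterion[criterion] = {n: i + 1 for i, n in enumerate(ordered)}

    total_weight = sum(weights.values())
    average = {
        n: sum(weights[c] * rank_by_criterion[c][n] for c in weights) / total_weight
        for n in names
    }
    final_order = sorted(names, key=lambda n: average[n])
    final = {n: i + 1 for i, n in enumerate(final_order)}
    return average, final


# Made-up scores: RMSE for ACCURACY, absolute mean error for BIAS
scores = {
    "A": {"ACCURACY": 1.00, "BIAS": 0.30},
    "B": {"ACCURACY": 1.04, "BIAS": 0.05},
    "C": {"ACCURACY": 1.20, "BIAS": 0.01},
}
average, final = weighted_average_ranks(scores, {"ACCURACY": 3, "BIAS": 1})
print(average)  # {'A': 1.5, 'B': 2.0, 'C': 2.5}
print(final)    # {'A': 1, 'B': 2, 'C': 3}
```

Because ACCURACY carries three times the weight of BIAS, result A stays in first place even though it ranks last by bias.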
exclusion_criteria
[[criteria1, value1], [criteria2, value2],...]
(Optional)

The criteria and associated values that will be used to exclude interpolation results from the comparison. Excluded results will not receive ranks and will have the value No in the Included field of the output cross validation table.

  • MAX_RMSE—Results will be excluded if the root mean square error exceeds the specified value. The value cannot be negative. This option measures prediction accuracy.
  • MAX_WORST_CASE—Results will be excluded if the maximum absolute error exceeds the specified value. The value cannot be negative. This option measures the worst-case error.
  • MAX_STD_RMSE—Results will be excluded if the root mean square standardized error exceeds the specified value. The value must be greater than or equal to 1. This option measures standard error accuracy.
  • MIN_STD_RMSE—Results will be excluded if the root mean square standardized error does not exceed the specified value. The value must be between 0 and 1. This option measures standard error accuracy.
  • MAX_MEAN_ERROR—Results will be excluded if the mean error exceeds the specified value. The value cannot be negative. This option measures bias.
  • MIN_MEAN_ERROR—Results will be excluded if the mean error does not exceed the specified value. The value cannot be positive. This option measures bias.
  • MAX_ASE—Results will be excluded if the average standard error exceeds the specified value. The value cannot be negative. This option measures precision.
  • MIN_PERC_ERROR—Results will be excluded if the interpolation result is not sufficiently more accurate than a baseline nonspatial model that predicts the global average value at all locations in the map. This relative accuracy is measured by comparing the root mean square error value to the standard deviation of the values of the points being interpolated, and the root mean square error must be at least the specified percent less than the standard deviation to be included in the comparison. For example, a value of 10 means that the root mean square error must be at least 10 percent lower than the standard deviation to be included in the comparison and ranking. The value must be between 0 and 100. This option measures prediction accuracy.

Value Table
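
The exclusion rules above can be read as simple threshold checks. The sketch below shows one way to express them, under stated assumptions: the function name and the keys of the stats dictionary are hypothetical, the baseline for MIN_PERC_ERROR is taken as the population standard deviation (the tool may use the sample standard deviation), and the sample numbers are made up.

```python
import statistics


def is_included(stats, values, exclusion_criteria):
    """Return True if a result survives every exclusion criterion.

    stats: dict with hypothetical keys 'rmse', 'max_abs_error', 'rmsse',
           'mean_error', and 'avg_std_error'.
    values: the measured values of the points being interpolated.
    exclusion_criteria: list of [criterion, value] pairs.
    """
    # Assumption: population standard deviation as the nonspatial baseline
    baseline = statistics.pstdev(values)
    for criterion, limit in exclusion_criteria:
        if criterion == "MAX_RMSE":
            excluded = stats["rmse"] > limit
        elif criterion == "MAX_WORST_CASE":
            excluded = stats["max_abs_error"] > limit
        elif criterion == "MAX_STD_RMSE":
            excluded = stats["rmsse"] > limit
        elif criterion == "MIN_STD_RMSE":
            excluded = not stats["rmsse"] > limit
        elif criterion == "MAX_MEAN_ERROR":
            excluded = stats["mean_error"] > limit
        elif criterion == "MIN_MEAN_ERROR":
            excluded = not stats["mean_error"] > limit
        elif criterion == "MAX_ASE":
            excluded = stats["avg_std_error"] > limit
        elif criterion == "MIN_PERC_ERROR":
            # Percent by which the RMSE is lower than the standard deviation
            reduction = 100 * (1 - stats["rmse"] / baseline)
            excluded = reduction < limit
        else:
            raise ValueError(f"Unknown criterion: {criterion}")
        if excluded:
            return False
    return True


values = [10, 12, 9, 15, 14]  # made-up measured values
stats = {"rmse": 1.5, "max_abs_error": 2.0, "rmsse": 0.75,
         "mean_error": 0.75, "avg_std_error": 1.75}

print(is_included(stats, values, [["MIN_PERC_ERROR", 25]]))  # True
print(is_included(stats, values, [["MIN_PERC_ERROR", 50]]))  # False
print(is_included(stats, values, [["MIN_STD_RMSE", 0.8], ["MAX_STD_RMSE", 1.2]]))  # False
```

In this example the standard deviation of the values is about 2.28, so a root mean square error of 1.5 is roughly a 34 percent reduction. That passes a 25 percent threshold but fails a 50 percent one.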

Code sample

ExploratoryInterpolation example 1 (Python window)

The following Python script demonstrates how to use the ExploratoryInterpolation function.

# Interpolate points using Simple Kriging, Universal Kriging, and EBK
# Rank results by highest prediction accuracy
# Exclude results with error reductions under 25%

inPoints = "myPoints"
inField = "myField"
outTable = "outCVtable"
outGALayer = "Result With Highest Rank"
interpMethods = ["SIMPLE_KRIGING", "UNIVERSAL_KRIGING", "EBK"]
compMethod = "SINGLE"
criterion = "ACCURACY"
exclCrit = [["MIN_PERC_ERROR", 25]]
arcpy.ga.ExploratoryInterpolation(inPoints, inField, outTable, outGALayer,
         interpMethods, compMethod, criterion, None, None, exclCrit)

ExploratoryInterpolation example 2 (stand-alone script)

The following Python script demonstrates how to use the ExploratoryInterpolation function.

# Interpolate points and a field using various interpolation methods
# Rank results by highest weighted average rank
# Rank same results by hierarchical sorting

# Import system modules
import arcpy

# Check out the ArcGIS Geostatistical Analyst extension license
arcpy.CheckOutExtension("GeoStats")

# Allow overwriting output
arcpy.env.overwriteOutput = True

### Set shared parameters
# Set input and output locations
directory = "C:/data/"
ingdb = directory + "data.gdb/"
outgdb = directory + "out.gdb/"
arcpy.env.workspace = directory
# Input points
inPoints = ingdb + "myPoints"
# Input field
inField = "myField"
# List of interpolation methods
interpMethods = ["SIMPLE_KRIGING", "UNIVERSAL_KRIGING", "EBK"]
# Exclude results with error reductions under 25%
exclCrit = [["MIN_PERC_ERROR", 25]]
# Output geostatistical layer with highest rank
outGALayer = "Result With Highest Rank"

### Set weighted average rank parameters
# Output table of ranks and cross validation results
outTable = outgdb + "outWeightedAverageTable"
# Use weighted average rank
compMethod = "AVERAGE_RANK"
# Use all criteria with highest weight to prediction accuracy
weightedCrit = [
    ["ACCURACY", 3],
    ["BIAS", 1],
    ["WORST_CASE", 1],
    ["STANDARD_ERROR", 1],
    ["PRECISION", 1]
]

# Compare using weighted average rank
arcpy.ga.ExploratoryInterpolation(inPoints, inField, outTable, outGALayer,
         interpMethods, compMethod, None, None, weightedCrit, exclCrit)



### Set hierarchical sorting parameters
# Output table of ranks and cross validation results
outTable = outgdb + "outHierSortTable"
# Use hierarchical sorting with tolerances
compMethod = "SORTING"
# Compare using highest prediction accuracy with a 10% tolerance
# Break ties by lowest bias
hierCrit = [
    ["ACCURACY", "PERCENT", 10],
    ["BIAS", "PERCENT", None]
]

# Compare using hierarchical sorting with tolerances
arcpy.ga.ExploratoryInterpolation(inPoints, inField, outTable, outGALayer,
         interpMethods, compMethod, None, hierCrit, None, exclCrit)

Licensing information

  • Basic: Requires Geostatistical Analyst
  • Standard: Requires Geostatistical Analyst
  • Advanced: Requires Geostatistical Analyst

Related topics