Label | Description | Data type |
Input features
| The input points representing locations of points to be interpolated. | Feature Layer |
Value field
| The field containing the values to be interpolated. | Field |
Output cross validation table
| The output table containing cross validation statistics and ranks for each interpolation result. The final ranks of the interpolation results are stored in the RANK field. | Table |
Output geostatistical layer with highest rank
(Optional) | The output geostatistical layer of the interpolation result with the highest rank. This interpolation result will have the value 1 in the RANK field of the output cross validation table. If there are ties for the highest rank, or if all results are excluded by the exclusion criteria, the layer will not be created even if a value is provided. The tool returns warning messages if this occurs. | Geostatistical Layer |
Interpolation methods
(Optional) | Specifies the interpolation methods that will be performed on the input features and value field. For each method specified, 1 to 5 interpolation results will be generated. By default, all methods are used except inverse distance weighting, radial basis functions, and global polynomial interpolation (because these methods cannot create standard errors of predictions), which generates 11 interpolation results. If all options are specified, 20 interpolation results will be generated.
| String |
Comparison method
(Optional) | Specifies the method that will be used to compare and rank the interpolation results.
| String |
Criterion
(Optional) | Specifies the criterion that will be used to rank the interpolation results.
| String |
Criteria hierarchy
(Optional) | The hierarchy of criteria that will be used for hierarchical sorting with tolerances. Provide multiple criteria in priority order, with the most important first. The interpolation results are ranked by the first criterion, and any ties are broken by the second criterion. Ties in the second criterion are broken by the third criterion, and so on. Cross validation statistics are continuous values and generally do not have exact ties, so tolerances are used to induce ties in the criteria. For each row, specify a criterion in the first column, a tolerance type (percent or absolute) in the second column, and a tolerance value in the third column. If no tolerance value is provided, no tolerance will be used; this is most useful for the final row so that there will be no ties for the interpolation result with the highest rank. For each row (level of the hierarchy), you can specify any of the five criteria described in the Usage section: prediction accuracy, bias, worst-case error, standard error accuracy, or precision.
For example, you can specify a Root mean square error (Accuracy) value with a 5 percent tolerance in the first row and a Mean error (Bias) value with no tolerance in the second row. These options will first rank the interpolation results by lowest root mean square error (highest prediction accuracy), and all interpolation results whose root mean square error values are within 5 percent of the most accurate result will be considered ties by prediction accuracy. Among the tying results, the result with a mean error closest to zero (lowest bias) will receive the highest rank. | Value Table |
Weighted criteria
(Optional) | The multiple criteria with weights that will be used to rank interpolation results. For each row, provide a criterion and a weight. The interpolation results will be ranked independently by each of the criteria, and a weighted average of the ranks will be used to determine the final ranks of the interpolation results.
| Value Table |
Exclusion criteria
(Optional) | The criteria and associated values that will be used to exclude interpolation results from the comparison. Excluded results will not receive ranks and will have the value No in the Included field of the output cross validation table.
| Value Table |
Summary
Generates various interpolation results from input point features and a field. The interpolation results are then compared and ranked using customizable criteria based on cross validation statistics.
Interpolation results can be ranked based on a single criterion (such as highest prediction accuracy or lowest bias), weighted average ranks of multiple criteria, or hierarchical sorting of multiple criteria (in which ties by each of the criteria are broken by subsequent criteria in the hierarchy). Exclusion criteria can also be used to exclude interpolation results from the comparison that do not meet minimal quality standards. The output is a table summarizing the cross validation statistics and ranks for each interpolation result. Optionally, you can output a geostatistical layer of the interpolation result with highest rank to be used in further workflows.
Illustration
Usage
Cross validation is a leave-one-out method for evaluating interpolation results. The method sequentially removes each point in the dataset and uses all remaining points to predict the value of the excluded point. The cross validation prediction is then compared to the true value of the hidden point, and the difference between the two is the cross validation error (the error can be positive or negative). The reasoning behind cross validation is that if the interpolation result is effective at predicting the values of the hidden points, it should also be effective at predicting unknown values at new locations, which is the goal of interpolation. All criteria used by this tool are based on summary statistics of the cross validation results.
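The leave-one-out procedure described above can be sketched in plain Python. This is an illustration only, not the tool's implementation: it uses a basic inverse distance weighting predictor, the function names `idw_predict`, `loo_cross_validation`, and `summarize` are hypothetical, and the sign convention for the error (predicted minus measured) is an assumption.

```python
import math

def idw_predict(known, x, y, power=2.0):
    """Predict a value at (x, y) from (x, y, value) tuples using inverse distance weighting."""
    numerator = 0.0
    denominator = 0.0
    for kx, ky, kv in known:
        distance = math.hypot(kx - x, ky - y)
        if distance == 0.0:
            return kv  # the location coincides with a known point
        weight = 1.0 / distance ** power
        numerator += weight * kv
        denominator += weight
    return numerator / denominator

def loo_cross_validation(points, predict=idw_predict):
    """Hide each point in turn, predict it from the remaining points, and return the errors."""
    errors = []
    for i, (x, y, measured) in enumerate(points):
        remaining = points[:i] + points[i + 1:]
        errors.append(predict(remaining, x, y) - measured)  # assumed sign convention
    return errors

def summarize(errors):
    """Summary statistics of the cross validation errors."""
    n = len(errors)
    return {
        "mean_error": sum(errors) / n,
        "root_mean_square_error": math.sqrt(sum(e * e for e in errors) / n),
        "maximum_absolute_error": max(abs(e) for e in errors),
    }

# Hypothetical sample data: (x, y, value)
points = [(0, 0, 10.0), (1, 0, 12.0), (0, 1, 11.0), (1, 1, 15.0), (2, 1, 18.0)]
errors = loo_cross_validation(points)
stats = summarize(errors)
print(stats)
```

Each entry in `errors` corresponds to one hidden point, so individual errors can still be inspected after the summary statistics are computed.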
While assessing interpolation results using cross validation summary statistics is a convenient and effective way to compare multiple interpolation results, it does not replace expert knowledge of the data and interactive investigation of the results. Reviewing charts and individual cross validation errors often reveals patterns in the results that are not obvious from the summary statistics. For example, there are often spatial patterns in the cross validation errors where some areas are underestimated and other areas are overestimated; patterns such as this may not be represented by summary statistics.
Learn more about using cross validation to assess interpolation results
The Comparison method parameter has three options for comparing the cross validation statistics of the interpolation results. Each option has advantages and disadvantages:
- Single criterion—A single criterion is used to compare and rank results. You can rank results by highest prediction accuracy, lowest bias, lowest worst-case error, highest standard error accuracy, or highest precision. The criterion is provided in the Criterion parameter.
- Advantages—This option is a simple and common method for comparing interpolation results that are known to be stable and consistent. It is also useful for choosing between results that are all very similar.
- Disadvantages—Interpolation results frequently perform well by some criteria but not others, for example, by having high prediction accuracy but also high bias. In this case, ranking by a single criterion will assign high ranks to results that are unstable or misleading. When ranking by a single criterion, it is recommended that you use various options of the Exclusion criteria parameter to ensure that unstable or misleading results are removed prior to the comparison.
- Hierarchical sorting with tolerances—Hierarchical sorting is used to compare and rank results. Multiple criteria are specified in priority order (highest priority first) in the Criteria hierarchy parameter. The interpolation results are ranked by the first criterion, and any ties are broken by the second criterion. Ties in the second criterion are broken by the third criterion, and so on. This process is modeled after Custom Sort and hierarchical sorting in spreadsheet software (sort by A, then by B, then by C, and so on). However, cross validation statistics are continuous values and generally do not have exact ties, so tolerances (percent or absolute) can be specified to create ties in each of the criteria.
- Advantages—This option uses multiple criteria, and it takes into account the relative differences of the cross validation statistics. For example, if one interpolation result is much better than the rest by the highest priority criterion, that interpolation result will receive the highest rank regardless of the subsequent criteria in the hierarchy.
- Disadvantages—The effectiveness of hierarchical sorting depends on the provided tolerance values. If tolerances are too small, some criteria may not be used because there are no ties to break. If tolerances are too large, there may be many ties in the rankings due to many results being within the tolerances of each other.
- Weighted average rank—The weighted average rank of multiple criteria is used to compare and rank results. Multiple criteria and associated weights are specified in the Weighted criteria parameter. The interpolation results are ranked independently by each of the criteria, and a weighted average of the ranks is used to determine the final ranks. Criteria with larger weights will have more influence on the final ranks, so they can be used to indicate preference for certain criteria over others.
- Advantages—This option uses multiple criteria, allows for preferences of some criteria over others, and always uses all criteria in the comparison.
- Disadvantages—The relative differences in the values of the cross validation statistics are ignored. For example, all root mean square error values may be within a very small tolerance of each other (indicating that all results have approximately equal prediction accuracy), but they will still be ranked 1 through N by prediction accuracy (for N interpolation results). However, the mean error values may vary by large amounts between the results (indicating the results have large differences in their biases), but they will also be ranked 1 through N by the bias criterion. The weighted average uses only the ranks of the criteria, so the relative differences in the cross validation statistics are ignored in the ranking.
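The weighted average rank option can be illustrated with a small Python sketch. The function names and the sample scores are hypothetical, the scores are assumed to be "smaller is better", and the sketch assumes that the final ranks are obtained by ranking the weighted averages; the tool's internal handling may differ.

```python
def ranks_smaller_is_better(scores):
    """Rank scores so that the smallest score gets rank 1; ties share the best rank."""
    return [1 + sum(1 for other in scores if other < s) for s in scores]

def weighted_average_ranks(score_table, weights):
    """score_table maps criterion -> list of scores (one per interpolation result)."""
    criteria = list(score_table)
    n_results = len(score_table[criteria[0]])
    total_weight = sum(weights[c] for c in criteria)
    averages = [0.0] * n_results
    for c in criteria:
        for i, rank in enumerate(ranks_smaller_is_better(score_table[c])):
            averages[i] += weights[c] * rank / total_weight
    return averages, ranks_smaller_is_better(averages)

# Hypothetical scores for three interpolation results
score_table = {
    "ACCURACY": [10.0, 10.1, 12.0],  # nearly tied on accuracy, but still ranked 1, 2, 3
    "BIAS": [0.9, 0.1, 0.5],
}
averages, final_ranks = weighted_average_ranks(score_table, {"ACCURACY": 3, "BIAS": 1})
print(averages, final_ranks)
```

With a weight of 3 on accuracy, the first result ranks highest even though its accuracy advantage is very small; with equal weights, the second result would rank highest. This shows both the effect of the weights and the disadvantage noted above: only ranks, not the sizes of the differences, enter the average.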
The output is a table summarizing the cross validation statistics, descriptions of the interpolation results, and ranks. The table can be included in a presentation or report. Cross validation statistics will only be included in the table if they apply to at least one interpolation result. For example, if only inverse distance weighting and radial basis functions are used, the output table will not contain a field of average standard error values because these methods do not calculate standard errors. If a statistic applies to some interpolation results but not others, the value will be null for results to which the statistic does not apply. Additionally, if Empirical Bayesian Kriging is chosen for the Interpolation methods parameter, the table will include several cross validation statistics that are not used by any criteria in this tool; these are included for informational purposes and will have null values for all other interpolation methods. If weighted average rank is used, the ranks for all criteria and their weighted average will also be included in the table.
Optionally, you can use the Output geostatistical layer with highest rank parameter to create a geostatistical layer of the interpolation result with the highest rank. This allows you to map the best interpolation result and use it in other workflows.
While the tool is running, geoprocessing messages and progress bar messages display the current interpolation result being calculated. After all results are calculated and compared, the ranks are printed as geoprocessing messages. The ranks are also available in the output cross validation table.
The Compare Geostatistical Layers tool performs the same cross validation comparisons as this tool, but it performs the comparisons on previously created interpolation results (geostatistical layers).
The following table lists the available criteria, the cross validation statistics that measure them, and the formulas used to assign a score to each interpolation result (smaller scores are better). Ranks for the criteria are determined by sorting the scores of each interpolation result.
Note:
For three of the criteria, the score is equal to the cross validation statistic.
Criteria | Cross validation statistic | Score formula |
Highest prediction accuracy | Root mean square error. Results are ranked by smallest root mean square error. | Score = RootMeanSquareError |
Lowest bias | Mean error. Results are ranked by mean error closest to zero. | Score = AbsoluteValue( MeanError ) |
Lowest worst-case error | Maximum absolute error. Results are ranked by smallest maximum absolute error. | Score = MaximumAbsoluteError |
Highest standard error accuracy | Root mean square standardized error. Results are ranked by root mean square standardized error closest to one. | Score = AbsoluteValue( RMSStdError - 1 ) |
Highest precision | Average standard error. Results are ranked by smallest average standard error. | Score = AverageStandardError |
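The score formulas in the table can be written as a short Python function. This is a sketch for illustration: the dictionary keys used for the statistics are hypothetical names, not field names from the output table.

```python
def criterion_scores(stats):
    """Convert cross validation statistics to criterion scores (smaller is better)."""
    scores = {
        "Highest prediction accuracy": stats["root_mean_square_error"],
        "Lowest bias": abs(stats["mean_error"]),
        "Lowest worst-case error": stats["maximum_absolute_error"],
    }
    # The two statistics below exist only for methods that calculate standard errors.
    rms_std = stats.get("root_mean_square_standardized_error")
    if rms_std is not None:
        scores["Highest standard error accuracy"] = abs(rms_std - 1)
    avg_se = stats.get("average_standard_error")
    if avg_se is not None:
        scores["Highest precision"] = avg_se
    return scores

# Hypothetical statistics for one kriging result
kriging_scores = criterion_scores({
    "root_mean_square_error": 200.0,
    "mean_error": -4.0,
    "maximum_absolute_error": 950.0,
    "root_mean_square_standardized_error": 2.4,
    "average_standard_error": 180.0,
})
print(kriging_scores)
```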
If there are ties in any criteria, all tying results receive the same rank, equal to the highest of the ranks shared between them (where a higher rank means a smaller rank number). For example, ordered from best to worst, the root mean square error values (12, 14, 14, 15, 16, 16, 18) will receive ranks (1, 2, 2, 4, 5, 5, 7) by the prediction accuracy criterion. Ranks 3 and 6 are skipped due to the tying values.
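The tie rule above corresponds to what is often called standard competition ranking. A minimal Python sketch (the function name is hypothetical) reproduces the example:

```python
def competition_ranks(scores):
    """Rank scores (smaller is better); tied scores share the best rank and later ranks are skipped."""
    return [1 + sum(1 for other in scores if other < s) for s in scores]

rmse_values = [12, 14, 14, 15, 16, 16, 18]
ranks = competition_ranks(rmse_values)
print(ranks)  # [1, 2, 2, 4, 5, 5, 7]
```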
Ties can occur at various stages of the comparisons. Ties are most common when using hierarchical sorting because all results within the tolerance are considered ties to each other, and all results outside the tolerance are also considered ties to each other. Ties are also common in weighted average rank when the interpolation results have varying ranks by different criteria, which can result in equal weighted averages of the ranks. While uncommon, ties can also occur in single criterion comparisons (for example, if all points have a constant value). Ties in a single criterion will also affect weighted average rank if that criterion is used in the weighted average.
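To see how tolerances create ties that later criteria break, the following Python sketch finds only the highest-ranked result of a hierarchy. It is a simplification under stated assumptions: it does not reproduce the tool's full ranking of all results, the function names are hypothetical, the inputs are criterion scores (smaller is better), and the keyword `"ABSOLUTE"` for the tolerance type is assumed.

```python
def within_tolerance(score, best, tol_type, tol_value):
    """Return True if score ties with the best score under the given tolerance."""
    if tol_value is None:
        return score == best  # no tolerance: only exact ties remain
    if tol_type == "PERCENT":
        return score <= best + (tol_value / 100.0) * best
    return score <= best + tol_value  # assumed "ABSOLUTE" tolerance type

def top_by_hierarchy(results, hierarchy):
    """results: name -> {criterion: score}; hierarchy: list of (criterion, tol_type, tol_value)."""
    candidates = list(results)
    for criterion, tol_type, tol_value in hierarchy:
        best = min(results[name][criterion] for name in candidates)
        candidates = [
            name for name in candidates
            if within_tolerance(results[name][criterion], best, tol_type, tol_value)
        ]
        if len(candidates) == 1:
            break  # no ties left to break
    return candidates

# Hypothetical scores for three interpolation results
results = {
    "A": {"ACCURACY": 200.0, "BIAS": 6.0},
    "B": {"ACCURACY": 208.0, "BIAS": 1.0},
    "C": {"ACCURACY": 230.0, "BIAS": 0.5},
}
hierarchy = [("ACCURACY", "PERCENT", 5), ("BIAS", "PERCENT", None)]
print(top_by_hierarchy(results, hierarchy))  # ['B']
```

Results A and B are within 5 percent of the best accuracy score, so they tie at the first level, and B wins on bias. Result C has the lowest bias but is never considered because it falls outside the accuracy tolerance.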
In hierarchical sorting, provide the tolerances relative to the score of the criterion rather than the cross validation statistic. For the criteria where the score is equal to the statistic (highest prediction accuracy, lowest worst-case error, and highest precision), appropriate tolerance values are usually clear. For example, if the lowest root mean square error value of the interpolation results is 200, then a 10 percent tolerance will include all results with root mean square error values less than or equal to 220: 200 + (10/100) x 200 = 220. Similarly, an absolute tolerance of 15 will include all results with root mean square error values less than or equal to 215: 200 + 15 = 215.
However, for the criteria where the score is not equal to the value of the statistic (lowest bias and highest standard error accuracy), appropriate tolerance values are less clear. For the mean error statistic, bias is scored by the absolute value of the mean error. This means, for example, that mean error values -4 and 6 have a relative difference of 50 percent because they are 50 percent different in absolute value: ABS(-4) + (50/100) x ABS(-4) = ABS(6). Similarly, their absolute difference is 2: ABS(-4) + 2 = ABS(6).
For the root mean square standardized error statistic, standard error accuracy is scored by the absolute difference between the root mean square standardized error value and the ideal value of 1. This means, for example, that root mean square standardized error values 0.2 and 2.4 have a 75 percent relative difference. The reason is that the value 2.4 is 1.75 times farther from the ideal value of 1 than the value 0.2 (absolute differences of 1.4 and 0.8, respectively), which is a 75 percent increase: ABS(0.2 - 1) + (75/100) x ABS(0.2 - 1) = ABS(2.4 - 1). Similarly, their absolute difference is 0.6: ABS(0.2 - 1) + 0.6 = ABS(2.4 - 1).
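The tolerance arithmetic in the preceding paragraphs can be checked with a few lines of Python. The helper names are hypothetical, and values are rounded for display because of floating-point arithmetic.

```python
def bias_score(mean_error):
    """Score for the lowest bias criterion."""
    return abs(mean_error)

def standard_error_accuracy_score(rms_standardized_error):
    """Score for the highest standard error accuracy criterion."""
    return abs(rms_standardized_error - 1)

def percent_difference(best_score, other_score):
    """Percent by which other_score exceeds best_score (best_score must be nonzero)."""
    return (other_score - best_score) / best_score * 100

def absolute_difference(best_score, other_score):
    return other_score - best_score

# Mean error values -4 and 6 are compared by their scores, 4 and 6
bias_pct = round(percent_difference(bias_score(-4), bias_score(6)), 6)   # 50.0
bias_abs = round(absolute_difference(bias_score(-4), bias_score(6)), 6)  # 2

# Root mean square standardized error values 0.2 and 2.4 are compared by their scores, 0.8 and 1.4
se_best = standard_error_accuracy_score(0.2)
se_other = standard_error_accuracy_score(2.4)
se_pct = round(percent_difference(se_best, se_other), 6)   # 75.0
se_abs = round(absolute_difference(se_best, se_other), 6)  # 0.6

print(bias_pct, bias_abs, se_pct, se_abs)
```

Computing the differences on the scores rather than on the raw statistics is what makes a tolerance of 50 percent (or an absolute tolerance of 2) just large enough to tie mean errors of -4 and 6.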
Various criteria require all interpolation results to support the standard error output type. With the default options of the Interpolation methods parameter, all options of all other parameters are available. However, if the Inverse Distance Weighting, Radial Basis Functions, or Global Polynomial Interpolation option is specified, various options of several parameters will become unavailable because these methods cannot calculate standard errors of predictions. The options that are unavailable are related to standard error accuracy, precision, the root mean square standardized error statistic, or the average standard error statistic.
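The dependency between methods and criteria can be expressed as a simple lookup. The following Python sketch uses the descriptive names from this page rather than tool keywords, and the function name `available_criteria` is hypothetical.

```python
ALL_CRITERIA = [
    "Highest prediction accuracy",
    "Lowest bias",
    "Lowest worst-case error",
    "Highest standard error accuracy",
    "Highest precision",
]

# Methods that cannot calculate standard errors of predictions
METHODS_WITHOUT_STANDARD_ERRORS = {
    "Inverse Distance Weighting",
    "Radial Basis Functions",
    "Global Polynomial Interpolation",
}

# Criteria that are based on standard error statistics
CRITERIA_REQUIRING_STANDARD_ERRORS = {
    "Highest standard error accuracy",
    "Highest precision",
}

def available_criteria(methods):
    """Return the criteria that can be used for the chosen interpolation methods."""
    if any(m in METHODS_WITHOUT_STANDARD_ERRORS for m in methods):
        return [c for c in ALL_CRITERIA if c not in CRITERIA_REQUIRING_STANDARD_ERRORS]
    return list(ALL_CRITERIA)

print(available_criteria(["Simple Kriging", "Inverse Distance Weighting"]))
```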
Learn more about which interpolation methods can calculate standard errors of predictions
The Minimum percent error reduction option of the Exclusion criteria parameter is particularly useful when you do not know the values or range of the points being interpolated (for example, in an automated environment). This option excludes interpolation results that are not sufficiently more accurate than a baseline nonspatial model that predicts the global average value at all locations in the map. This relative accuracy is measured by comparing the root mean square error value to the standard deviation of the values of the points being interpolated, and the root mean square error must be at least the specified percent less than the standard deviation to be included in the comparison. For example, a value of 10 means that the root mean square error must be at least 10 percent lower than the standard deviation to be included in the comparison and ranking.
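The percent error reduction can be sketched in Python as follows. The function names are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, because this page does not state which form of the standard deviation the tool uses.

```python
import statistics

def percent_error_reduction(rmse, standard_deviation):
    """Percent by which the root mean square error is lower than the standard deviation."""
    return (1 - rmse / standard_deviation) * 100

def passes_minimum_reduction(rmse, values, minimum_percent):
    """Return True if a result with this RMSE would be kept in the comparison."""
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return percent_error_reduction(rmse, sd) >= minimum_percent

# Hypothetical measured values and a hypothetical cross validation RMSE
values = [10.0, 12.0, 11.0, 15.0, 18.0]
rmse = 2.5
reduction = percent_error_reduction(rmse, statistics.stdev(values))
print(round(reduction, 1))
print(passes_minimum_reduction(rmse, values, 10), passes_minimum_reduction(rmse, values, 25))
```

In this hypothetical case the error reduction is between 10 and 25 percent, so the result would be kept with a minimum of 10 and excluded with a minimum of 25.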
Different disciplines have different standards for acceptable error reductions in interpolation results. In physical sciences with measurements that are densely sampled, errors often reduce by more than 90 percent. In social sciences, however, error reductions of only 10 to 20 percent are often significant to researchers.
Each Interpolation methods parameter option generates between 1 and 5 interpolation results. By default, 11 results are generated. If all options are chosen, 20 results will be generated. The following table shows the 20 possible values of the Description field of the Output cross validation table value along with details about the result. To further investigate any of the results, the third column provides steps to create a geostatistical layer of the result using the Geostatistical Wizard.
Note:
In the instructions for creating the interpolation result, it is assumed that you have opened the Geostatistical Wizard, chosen the interpolation method in the pane on the left, and provided the points and field in the pane on the right. For simple, ordinary, and universal kriging, the kriging type is specified on the second page of the wizard; on the first page, use Kriging/CoKriging for all three types. If the instructions start on a particular page of the wizard, click Next to get to that page without changing any parameters. At the end of the instructions, click Finish and click OK to add the interpolation result to the map.
Field value | Description | Creation |
Simple Kriging - Default | A simple kriging model with default parameters. By default, simple kriging uses a transformation. | No changes needed. |
Simple Kriging - Optimized | A simple kriging model with optimized parameters. | On the semivariogram page (page 4), click the Optimize model button. |
Simple Kriging - Trend | A simple kriging model with trend removal and no transformation. | On the second page, change Transformation type to None, and change Order of trend removal to First. |
Simple Kriging - Trend and transformation | A simple kriging model with trend removal and a transformation. | On the second page, change Order of trend removal to First. |
Ordinary Kriging - Default | An ordinary kriging model with default parameters. | No changes needed. |
Ordinary Kriging - Optimized | An ordinary kriging model with optimized parameters. | On the semivariogram page (page 4), click the Optimize model button. |
Universal Kriging - Default | A universal kriging model with first order trend removal and default parameters. | On the second page, change Order of trend removal to First. |
Universal Kriging - Optimized | A universal kriging model with first order trend removal and optimized parameters. | On the second page, change Order of trend removal to First. On the semivariogram page (page 4), click the Optimize model button. |
Empirical Bayesian Kriging - Default | An empirical Bayesian kriging model with default parameters. | No changes needed. |
Empirical Bayesian Kriging - Advanced | An advanced empirical Bayesian kriging model using larger subsets, detrending, and more overlapping and simulations. | On the second page, change the following parameters to the values shown: Subset Size: 200; Overlap Factor: 2; Number of Simulations: 300; Transformation: Empirical; Semivariogram Type: K-Bessel Detrended. |
Kernel (Local Polynomial) Interpolation | A kernel (local polynomial) interpolation model with default parameters. | No changes needed. Both Kernel Interpolation and Local Polynomial Interpolation are available on the first page. The result is created with kernel interpolation, but you should expect similar results from local polynomial interpolation because the methods are similar. |
Inverse Distance Weighting - Default | An inverse distance weighting model with a power value equal to 2 (default). | No changes needed. |
Inverse Distance Weighting - Optimized | An inverse distance weighting model with an optimized power value. | On the second page, click the Optimize button that appears next to the Power parameter. |
Radial Basis Functions - Completely regularized spline | A radial basis functions model using a completely regularized spline kernel function. | No changes needed. Completely regularized spline is the default kernel function for radial basis functions. |
Radial Basis Functions - Spline with tension | A radial basis functions model using a spline with tension kernel function. | On the second page, change the Kernel Function parameter to Spline with tension. |
Radial Basis Functions - Multiquadric | A radial basis functions model using a multiquadric kernel function. | On the second page, change the Kernel Function parameter to Multiquadric. |
Radial Basis Functions - Inverse multiquadric | A radial basis functions model using an inverse multiquadric kernel function. | On the second page, change the Kernel Function parameter to Inverse multiquadric. |
Radial Basis Functions - Thin plate spline | A radial basis functions model using a thin plate spline kernel function. | On the second page, change the Kernel Function parameter to Thin plate spline. |
Global Polynomial Interpolation - Second order | A global polynomial interpolation model with second order (quadratic) trend. | On the second page, change the Order of polynomial parameter to 2. |
Global Polynomial Interpolation - Third order | A global polynomial interpolation model with third order (cubic) trend. | On the second page, change the Order of polynomial parameter to 3. |
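The number of results generated for a given choice of methods follows directly from the table. The short Python sketch below tallies the rows per method; the dictionary and function names are hypothetical, and the method names are the descriptive names from this page rather than tool keywords.

```python
# Number of rows in the table above for each interpolation method
RESULTS_PER_METHOD = {
    "Simple Kriging": 4,
    "Ordinary Kriging": 2,
    "Universal Kriging": 2,
    "Empirical Bayesian Kriging": 2,
    "Kernel (Local Polynomial) Interpolation": 1,
    "Inverse Distance Weighting": 2,
    "Radial Basis Functions": 5,
    "Global Polynomial Interpolation": 2,
}

# Methods used by default (those that can calculate standard errors)
DEFAULT_METHODS = [
    "Simple Kriging",
    "Ordinary Kriging",
    "Universal Kriging",
    "Empirical Bayesian Kriging",
    "Kernel (Local Polynomial) Interpolation",
]

def count_results(methods):
    """Return how many interpolation results the chosen methods generate."""
    return sum(RESULTS_PER_METHOD[m] for m in methods)

print(count_results(DEFAULT_METHODS))          # 11
print(count_results(list(RESULTS_PER_METHOD)))  # 20
```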
Parameters
arcpy.ga.ExploratoryInterpolation(in_features, value_field, out_cv_table, {out_geostat_layer}, {interp_methods}, {comparison_method}, {criterion}, {criteria_hierarchy}, {weighted_criteria}, {exclusion_criteria})
Name | Description | Data type |
in_features | The input points representing locations of points to be interpolated. | Feature Layer |
value_field | The field containing the values to be interpolated. | Field |
out_cv_table | The output table containing cross validation statistics and ranks for each interpolation result. The final ranks of the interpolation results are stored in the RANK field. | Table |
out_geostat_layer (Optional) | The output geostatistical layer of the interpolation result with the highest rank. This interpolation result will have the value 1 in the RANK field of the output cross validation table. If there are ties for the highest rank, or if all results are excluded by the exclusion criteria, the layer will not be created even if a value is provided. The tool returns warning messages if this occurs. | Geostatistical Layer |
interp_methods [interp_methods,...] (Optional) | Specifies the interpolation methods that will be performed on the input features and value field. For each method specified, 1 to 5 interpolation results will be generated. By default, all methods are used except inverse distance weighting, radial basis functions, and global polynomial interpolation (because these methods cannot create standard errors of predictions), which generates 11 interpolation results. If all options are specified, 20 interpolation results will be generated.
| String |
comparison_method (Optional) | Specifies the method that will be used to compare and rank the interpolation results.
| String |
criterion (Optional) | Specifies the criterion that will be used to rank the interpolation results.
| String |
criteria_hierarchy [[criteria1, tol_type1, tol_val1], [criteria2, tol_type2, tol_val2],...] (Optional) | The hierarchy of criteria that will be used for hierarchical sorting with tolerances. Provide multiple criteria in priority order, with the most important first. The interpolation results are ranked by the first criterion, and any ties are broken by the second criterion. Ties in the second criterion are broken by the third criterion, and so on. Cross validation statistics are continuous values and generally do not have exact ties, so tolerances are used to induce ties in the criteria. For each row, specify a criterion in the first column, a tolerance type (percent or absolute) in the second column, and a tolerance value in the third column. If no tolerance value is provided, no tolerance will be used; this is most useful for the final row so that there will be no ties for the interpolation result with the highest rank. For each row (level of the hierarchy), specify one of the criterion keywords used in the code samples: ACCURACY, BIAS, WORST_CASE, STANDARD_ERROR, or PRECISION.
For example, you can specify an ACCURACY value with a 5 percent tolerance in the first row and a BIAS value with no tolerance in the second row. These options will first rank the interpolation results by lowest root mean square error (highest prediction accuracy), and all interpolation results whose root mean square error values are within 5 percent of the most accurate result will be considered ties by prediction accuracy. Among the tying results, the result with a mean error closest to zero (lowest bias) will receive the highest rank. | Value Table |
weighted_criteria [[criteria1, weight1], [criteria2, weight2],...] (Optional) | The multiple criteria with weights that will be used to rank interpolation results. For each row, provide a criterion and a weight. The interpolation results will be ranked independently by each of the criteria, and a weighted average of the ranks will be used to determine the final ranks of the interpolation results.
| Value Table |
exclusion_criteria [[criteria1, value1], [criteria2, value2],...] (Optional) | The criteria and associated values that will be used to exclude interpolation results from the comparison. Excluded results will not receive ranks and will have the value No in the Included field of the output cross validation table.
| Value Table |
Code sample
The following Python script demonstrates how to use the ExploratoryInterpolation function.
# Interpolate points using Simple Kriging, Universal Kriging, and EBK
# Rank results by highest prediction accuracy
# Exclude results with error reductions under 25%
import arcpy

inPoints = "myPoints"
inField = "myField"
outTable = "outCVtable"
outGALayer = "Result With Highest Rank"
interpMethods = ["SIMPLE_KRIGING", "UNIVERSAL_KRIGING", "EBK"]
compMethod = "SINGLE"
criterion = "ACCURACY"
exclCrit = [["MIN_PERC_ERROR", 25]]
arcpy.ga.ExploratoryInterpolation(inPoints, inField, outTable, outGALayer,
                                  interpMethods, compMethod, criterion, None, None, exclCrit)
The following stand-alone Python script demonstrates how to use the ExploratoryInterpolation function with weighted average rank and with hierarchical sorting.
# Interpolate points and a field using various interpolation methods
# Rank results by highest weighted average rank
# Rank same results by hierarchical sorting
# Import system modules
import arcpy
# Check out the ArcGIS Geostatistical Analyst extension license
arcpy.CheckOutExtension("GeoStats")
# Allow overwriting output
arcpy.env.overwriteOutput = True
### Set shared parameters
# Set input and output locations
directory = "C:/data/"
ingdb = directory + "data.gdb/"
outgdb = directory + "out.gdb/"
arcpy.env.workspace = directory
# Input points
inPoints = ingdb + "myPoints"
# Input field
inField = "myField"
# List of interpolation methods
interpMethods = ["SIMPLE_KRIGING", "UNIVERSAL_KRIGING", "EBK"]
# Exclude results with error reductions under 25%
exclCrit = [["MIN_PERC_ERROR", 25]]
# Output geostatistical layer with highest rank
outGALayer = "Result With Highest Rank"
### Set weighted average rank parameters
# Output table of ranks and cross validation results
outTable = outgdb + "outWeightedAverageTable"
# Use weighted average rank
compMethod = "AVERAGE_RANK"
# Use all criteria with highest weight to prediction accuracy
weightedCrit = [
    ["ACCURACY", 3],
    ["BIAS", 1],
    ["WORST_CASE", 1],
    ["STANDARD_ERROR", 1],
    ["PRECISION", 1]
]
# Compare using weighted average rank
arcpy.ga.ExploratoryInterpolation(inPoints, inField, outTable, outGALayer,
                                  interpMethods, compMethod, None, None, weightedCrit, exclCrit)
### Set hierarchical sorting parameters
# Output table of ranks and cross validation results
outTable = outgdb + "outHierSortTable"
# Use hierarchical sorting with tolerances
compMethod = "SORTING"
# Compare using highest prediction accuracy with a 10% tolerance
# Break ties by lowest bias
hierCrit = [
    ["ACCURACY", "PERCENT", 10],
    ["BIAS", "PERCENT", None]
]
# Compare using hierarchical sorting with tolerances
arcpy.ga.ExploratoryInterpolation(inPoints, inField, outTable, outGALayer,
                                  interpMethods, compMethod, None, hierCrit, None, exclCrit)
Licensing information
- Basic: Requires Geostatistical Analyst
- Standard: Requires Geostatistical Analyst
- Advanced: Requires Geostatistical Analyst