Subset Features (Geostatistical Analyst)

Available with Geostatistical Analyst license.

Summary

Divides the original dataset into two parts: one part to be used to model the spatial structure and produce a surface, the other to be used to compare and validate the output surface.
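
The role of the two parts can be illustrated with a minimal hold-out validation sketch. It is not the tool's or the extension's method: the sample points are made up, and a simple nearest-neighbor predictor (the hypothetical `predict_nearest` helper) stands in for a real geostatistical model. The training points supply the predictions, and the test points are used only to measure the error.

```python
import math

# Made-up sample points for illustration: (x, y, measured value).
training = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (0.0, 1.0, 14.0), (1.0, 1.0, 16.0)]
test = [(0.1, 0.1, 11.0), (0.9, 0.8, 15.0)]


def predict_nearest(x, y, known_points):
    """Return the value of the known point closest to (x, y)."""
    nearest = min(known_points, key=lambda p: math.hypot(p[0] - x, p[1] - y))
    return nearest[2]


def rmse(test_points, known_points):
    """Root-mean-square error of the predictions at the held-out test points."""
    squared = [
        (predict_nearest(x, y, known_points) - value) ** 2
        for x, y, value in test_points
    ]
    return math.sqrt(sum(squared) / len(squared))


error = rmse(test, training)
print(f"RMSE on test points: {error:.3f}")
```

A lower error on the test points suggests that the surface built from the training points generalizes better to locations it has not seen.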

Usage

  • If multipart features are used as input, the output will be a subset of multipart features and not individual features.

  • To make the random sequence used to create the subsets repeatable, specify a nonzero seed value in the Random number generator environment.

    Note:

    Only the Mersenne Twister random number generator type is supported; if ACM collected algorithm 599 or Standard C Rand is chosen, Mersenne Twister will be used instead.

  • The test feature class is often used in validation of a model created using the training feature class.
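
The effect of a nonzero seed can be sketched in plain Python, whose `random` module also uses the Mersenne Twister generator. This is an illustration of repeatability only, not the tool's internal algorithm. The `split_features` helper, its rounding rule, and the handling of seed 0 as "not repeatable" are assumptions made for the example.

```python
import random


def split_features(feature_ids, training_fraction=0.5, seed=0):
    """Illustrative split of feature IDs into training and test lists."""
    # random.Random is backed by the Mersenne Twister generator.
    # Assumption: seed 0 means "not repeatable", so the generator is
    # seeded from system entropy in that case.
    rng = random.Random(seed if seed != 0 else None)
    shuffled = list(feature_ids)
    rng.shuffle(shuffled)
    # Assumption: the training count is rounded to the nearest integer.
    n_train = round(len(shuffled) * training_fraction)
    return sorted(shuffled[:n_train]), sorted(shuffled[n_train:])


ids = list(range(1, 11))
train_a, test_a = split_features(ids, 0.5, seed=42)
train_b, test_b = split_features(ids, 0.5, seed=42)

# The same nonzero seed yields the same subsets on every run.
print(train_a == train_b and test_a == test_b)
```

Every input feature ends up in exactly one of the two lists, so the training and test subsets never overlap.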

Parameters

Label | Explanation | Data Type
Input features

Point, line, or polygon features, or a table, from which to create a subset.

Table View
Output training feature class

The subset of training features to be created.

Feature Class; Table
Output test feature class
(Optional)

The subset of test features to be created.

Feature Class; Table
Size of training feature subset
(Optional)

The size of the output training feature class, entered either as a percentage of the input features or as an absolute number of features.

Double
Subset size units
(Optional)

Specifies whether the subset size is interpreted as a percentage or as a feature count.

  • Percentage of input: The percentage of the input features that will be in the training dataset.
  • Absolute value: The number of features that will be in the training dataset.
Boolean

arcpy.ga.SubsetFeatures(in_features, out_training_feature_class, {out_test_feature_class}, {size_of_training_dataset}, {subset_size_units})
Name | Explanation | Data Type
in_features

Point, line, or polygon features, or a table, from which to create a subset.

Table View
out_training_feature_class

The subset of training features to be created.

Feature Class; Table
out_test_feature_class
(Optional)

The subset of test features to be created.

Feature Class; Table
size_of_training_dataset
(Optional)

The size of the output training feature class, entered either as a percentage of the input features or as an absolute number of features.

Double
subset_size_units
(Optional)

Specifies whether the subset size is interpreted as a percentage or as a feature count.

  • PERCENTAGE_OF_INPUT: The percentage of the input features that will be in the training dataset.
  • ABSOLUTE_VALUE: The number of features that will be in the training dataset.
Boolean
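
How the size and unit parameters combine can be sketched as follows. The `training_count` helper is hypothetical and is not part of arcpy. The range checks and the truncation to a whole number are assumptions, since the tool's exact rounding behavior is not stated here.

```python
def training_count(n_input, size, units="PERCENTAGE_OF_INPUT"):
    """Return the number of training features for a given size and unit."""
    if units == "PERCENTAGE_OF_INPUT":
        if not 0 <= size <= 100:
            raise ValueError("percentage must be between 0 and 100")
        # Assumption: fractional feature counts are truncated.
        return int(n_input * size / 100)
    if units == "ABSOLUTE_VALUE":
        if not 0 <= size <= n_input:
            raise ValueError("absolute size cannot exceed the input count")
        return int(size)
    raise ValueError(f"unknown subset size units: {units}")


n_features = 200
pct_count = training_count(n_features, 75)                    # 75 percent of 200
abs_count = training_count(n_features, 75, "ABSOLUTE_VALUE")  # exactly 75 features
print(pct_count, abs_count)
```

In both cases the remaining features (here, `n_features` minus the training count) would make up the test subset.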

Code sample

SubsetFeatures example 1 (Python window)

Randomly split the features into two feature classes.

import arcpy
arcpy.env.workspace = "C:/gapyexamples/data"
arcpy.SubsetFeatures_ga("ca_ozone_pts", "C:/gapyexamples/output/training", 
                        "", "", "PERCENTAGE_OF_INPUT")
SubsetFeatures example 2 (stand-alone script)

Randomly split the features into two feature classes.

# Name: SubsetFeatures_Example_02.py
# Description: Randomly split the features into two feature classes.
# Requirements: Geostatistical Analyst Extension

# Import system modules
import arcpy

# Check out the Geostatistical Analyst extension license
arcpy.CheckOutExtension("GeoStats")

# Set environment settings
arcpy.env.workspace = "C:/gapyexamples/data"

# Set local variables
inPointFeatures = "ca_ozone_pts.shp"
outtrainPoints = "C:/gapyexamples/output/training.shp"
# Empty strings leave the optional test output and subset size unspecified
outtestPoints = ""
trainData = ""
subsizeUnits = "PERCENTAGE_OF_INPUT"

# Execute SubsetFeatures
arcpy.SubsetFeatures_ga(inPointFeatures, outtrainPoints, outtestPoints, 
                        trainData, subsizeUnits)

Licensing information

  • Basic: Requires Geostatistical Analyst
  • Standard: Requires Geostatistical Analyst
  • Advanced: Requires Geostatistical Analyst
