| Label | Explanation | Data Type | 
| Input Raster | The input source imagery, typically multispectral imagery. Examples of the types of input source imagery include multispectral satellite, drone, aerial, and National Agriculture Imagery Program (NAIP). The input can be a folder of images. | Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder | 
| Output Folder | The folder where the output image chips and metadata will be stored. The folder can also be a folder URL that uses a cloud storage connection file (*.acs). | Folder | 
| Input Feature Class Or Classified Raster Or Table | The training sample data in either vector or raster form. Vector inputs should follow the training sample format generated using the Training Samples Manager pane. Raster inputs should follow a classified raster format generated by the Classify Raster tool. The raster input can also be from a folder of classified rasters. Classified raster inputs require a corresponding raster attribute table. Input tables should follow a training sample format generated by the Label Objects for Deep Learning button in the Training Samples Manager pane. Following the proper training sample format will produce optimal results with the statistical information; however, the input can also be a point feature class without a class value field or an integer raster without class information. | Feature Class; Feature Layer; Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Table; Folder | 
| Image Format | Specifies the raster format that will be used for the image chip outputs. The PNG and JPEG formats support up to three bands. | String | 
| Tile Size X (Optional) | The size of the image chips for the x dimension. | Long | 
| Tile Size Y (Optional) | The size of the image chips for the y dimension. | Long | 
| Stride X (Optional) | The distance to move in the x direction when creating the next image chips. When stride is equal to tile size, there will be no overlap. When stride is equal to half the tile size, there will be 50 percent overlap. | Long | 
| Stride Y (Optional) | The distance to move in the y direction when creating the next image chips. When stride is equal to tile size, there will be no overlap. When stride is equal to half the tile size, there will be 50 percent overlap. | Long | 
| Output No Feature Tiles (Optional) | Specifies whether image chips that do not capture training samples will be exported. If checked, image chips that do not capture labeled data will also be exported; if not checked, they will not be exported. | Boolean | 
| Metadata Format (Optional) | Specifies the format that will be used for the output metadata labels. If the input training sample data is a feature class layer, such as a building layer or a standard classification training sample file, use the KITTI Labels or PASCAL Visual Object Classes option (KITTI_rectangles or PASCAL_VOC_rectangles in Python). The output metadata is a .txt file or an .xml file containing the training sample data contained in the minimum bounding rectangle. The name of the metadata file matches the input source image name. If the input training sample data is a class map, use the Classified Tiles option (Classified_Tiles in Python) as the output metadata format. For the KITTI metadata format, 15 columns are created, but only 5 of them are used in the tool. The first column is the class value. The next 3 columns are skipped. Columns 5 through 8 define the minimum bounding rectangle, which is composed of four image coordinate locations: left, top, right, and bottom pixels. The minimum bounding rectangle encompasses the training chip used in the deep learning classifier. The remaining columns are not used. | String | 
| Start Index (Optional) | Legacy: This parameter has been deprecated. | Long | 
| Class Value Field (Optional) | The field that contains the class values. If no field is specified, the system searches for a value or classvalue field. The field should be numeric, usually an integer. If the feature does not contain a class field, the system determines that all records belong to one class. | Field | 
| Buffer Radius (Optional) | The radius of a buffer around each training sample that will be used to delineate a training sample area. This allows you to create circular polygon training samples from points. The linear unit of the Input Feature Class Or Classified Raster Or Table parameter value's spatial reference is used. | Double | 
| Input Mask Polygons (Optional) | A polygon feature class that delineates the area where image chips will be created. Only image chips that fall completely within the polygons will be created. | Feature Layer | 
| Rotation Angle (Optional) | The rotation angle that will be used to generate image chips. An image chip will first be generated with no rotation. It will then be rotated at the specified angle to create additional image chips. The image will be rotated and have a chip created, until it has been fully rotated. For example, if you specify a rotation angle of 45 degrees, the tool will create eight image chips. The eight image chips will be created at the following angles: 0, 45, 90, 135, 180, 225, 270, and 315. The default rotation angle is 0, which creates one default image chip. | Double | 
| Reference System (Optional) | Specifies the type of reference system that will be used to interpret the input image. The reference system specified must match the reference system used to train the deep learning model. | String | 
| Processing Mode (Optional) | Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service. | String | 
| Blacken Around Feature (Optional) | Specifies whether the pixels around each object or feature in each image tile will be masked out. This parameter only applies when the Metadata Format parameter is set to Labeled Tiles and an input feature class or classified raster has been specified. | Boolean | 
| Crop Mode (Optional) | Specifies whether the exported tiles will be cropped so that they are all the same size. This parameter only applies when the Metadata Format parameter is set to either Labeled Tiles or Imagenet, and an input feature class or classified raster has been specified. | String | 
| Additional Input Raster (Optional) | An additional input imagery source that will be used for image translation methods. This parameter is valid when the Metadata Format parameter is set to Classified Tiles, Export Tiles, or CycleGAN. | Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder | 
| Instance Feature Class (Optional) | The training sample data collected that contains classes for instance segmentation. The input can also be a point feature class without a class value field or an integer raster without class information. This parameter is only valid when the Metadata Format parameter is set to Panoptic Segmentation. | Feature Class; Feature Layer; Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Table; Folder | 
| Instance Class Value Field (Optional) | The field that contains the class values for instance segmentation. If no field is specified, the tool will use a value or class value field if one is present. If the feature does not contain a class field, the tool will determine that all records belong to one class. This parameter is only valid when the Metadata Format parameter is set to Panoptic Segmentation. | Field | 
| Minimum Polygon Overlap Ratio (Optional) | The minimum overlap percentage for a feature to be included in the training data. If the percentage overlap is less than the value specified, the feature will be excluded from the training chip, and will not be added to the label file. The percent value is expressed as a decimal. For example, to specify an overlap of 20 percent, use a value of 0.2. The default value is 0, which means that all features will be included. This parameter improves the performance of the tool and also improves inferencing. The speed is improved because fewer training chips are created. Inferencing is improved because the model is trained to detect only large patches of objects and to ignore small corners of features. This means fewer false positives will be detected, and fewer false positives will be removed by the Non Maximum Suppression tool. This parameter is active when the Input Feature Class Or Classified Raster Or Table parameter value is a feature class. | Double | 
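The Rotation Angle behavior described in the table above can be illustrated with a short sketch. The helper `rotation_angles` is hypothetical and is not part of arcpy; it assumes that chips are generated at successive multiples of the angle, starting at 0 and stopping before a full 360-degree turn. How the tool treats angles that do not divide 360 evenly is not stated here, so that case is an assumption as well.

```python
def rotation_angles(rotation_angle):
    """Return the angles (in degrees) at which a chip would be generated.

    Assumption: rotation proceeds in steps of rotation_angle, starting
    at 0 and stopping before a full 360-degree turn is reached.
    """
    if rotation_angle < 0 or rotation_angle >= 360:
        raise ValueError("rotation_angle must be in the range [0, 360)")
    if rotation_angle == 0:
        # Default case: a single, unrotated chip.
        return [0]
    angles = []
    step = 0
    while step * rotation_angle < 360:
        angles.append(step * rotation_angle)
        step += 1
    return angles


if __name__ == "__main__":
    # A 45-degree rotation angle yields eight chips per location.
    print(rotation_angles(45))
```

Under these assumptions, a rotation angle of 45 produces eight angles and the default of 0 produces one.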
Available with Spatial Analyst license.
Available with Image Analyst license.
Summary
Converts labeled vector or raster data into deep learning training datasets using a remote sensing image. The output will be a folder of image chips and a folder of metadata files in the specified format.
Usage
- This tool will create training datasets to support third-party deep learning applications, such as Google TensorFlow, Keras, PyTorch, Microsoft CNTK, and others. 
- Deep learning class training samples are based on small subimages, called image chips, that contain the feature or class of interest. 
- Use your existing classification training sample data or GIS feature class data, such as a building footprint layer, to generate image chips containing the class sample from the source image. Image chips are often 256 pixel rows by 256 pixel columns, unless the training sample size is larger. Each image chip can contain one or more objects. If the Labeled Tiles metadata format is used, there can be only one object per image chip. 
- By specifying the Reference System parameter value, training data can be exported in map space or pixel space (raw image space) to use for deep learning model training. 
- This tool supports exporting training data from a collection of images. You can add an image folder as the Input Raster value. If the Input Raster value is a mosaic dataset or an image service, you can also use the Processing Mode parameter to process the mosaic as one input or to process each raster item separately. 
- The cell size and extent can be adjusted using the geoprocessing environment settings. 
- This tool honors the Parallel Processing Factor environment setting. By default, Parallel Processing Factor is not enabled; consequently, the tool will run on a single core. When large datasets are used, enable Parallel Processing Factor by specifying the number of cores the tool can use to distribute the workload. 
- For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions. 
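The relationship between the Tile Size and Stride parameters can be sketched in plain Python. The functions `chip_origins` and `overlap_fraction` are hypothetical helpers, not part of arcpy, and the sketch assumes that only chips fitting entirely inside the image are kept; the tool's actual handling of image edges may differ.

```python
def chip_origins(image_size, tile_size=256, stride=128):
    """Return the starting pixel offsets of chips along one image dimension.

    Assumption: a chip is kept only if it fits entirely within the image.
    """
    if tile_size <= 0 or stride <= 0:
        raise ValueError("tile_size and stride must be positive")
    return list(range(0, image_size - tile_size + 1, stride))


def overlap_fraction(tile_size, stride):
    """Return the fraction of a chip shared with the next chip."""
    if tile_size <= 0 or stride <= 0:
        raise ValueError("tile_size and stride must be positive")
    return max(0.0, (tile_size - stride) / tile_size)


if __name__ == "__main__":
    # Stride equal to half the tile size: 50 percent overlap.
    print(chip_origins(1024, 256, 128), overlap_fraction(256, 128))
    # Stride equal to the tile size: no overlap.
    print(chip_origins(1024, 256, 256), overlap_fraction(256, 256))
```

Halving the stride roughly doubles the number of chips along each dimension, so the total chip count grows about fourfold when both Stride X and Stride Y are halved.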
Parameters
ExportTrainingDataForDeepLearning(in_raster, out_folder, in_class_data, image_chip_format, {tile_size_x}, {tile_size_y}, {stride_x}, {stride_y}, {output_nofeature_tiles}, {metadata_format}, {start_index}, {class_value_field}, {buffer_radius}, {in_mask_polygons}, {rotation_angle}, {reference_system}, {processing_mode}, {blacken_around_feature}, {crop_mode}, {in_raster2}, {in_instance_data}, {instance_class_value_field}, {min_polygon_overlap_ratio})
| Name | Explanation | Data Type | 
| in_raster | The input source imagery, typically multispectral imagery. Examples of the types of input source imagery include multispectral satellite, drone, aerial, and National Agriculture Imagery Program (NAIP). The input can be a folder of images. | Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder | 
| out_folder | The folder where the output image chips and metadata will be stored. The folder can also be a folder URL that uses a cloud storage connection file (*.acs). | Folder | 
| in_class_data | The training sample data in either vector or raster form. Vector inputs should follow the training sample format generated using the Training Samples Manager pane. Raster inputs should follow a classified raster format generated by the Classify Raster tool. The raster input can also be from a folder of classified rasters. Classified raster inputs require a corresponding raster attribute table. Input tables should follow a training sample format generated by the Label Objects for Deep Learning button in the Training Samples Manager pane. Following the proper training sample format will produce optimal results with the statistical information; however, the input can also be a point feature class without a class value field or an integer raster without class information. | Feature Class; Feature Layer; Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Table; Folder | 
| image_chip_format | Specifies the raster format that will be used for the image chip outputs. The PNG and JPEG formats support up to three bands. | String | 
| tile_size_x (Optional) | The size of the image chips for the x dimension. | Long | 
| tile_size_y (Optional) | The size of the image chips for the y dimension. | Long | 
| stride_x (Optional) | The distance to move in the x direction when creating the next image chips. When stride is equal to tile size, there will be no overlap. When stride is equal to half the tile size, there will be 50 percent overlap. | Long | 
| stride_y (Optional) | The distance to move in the y direction when creating the next image chips. When stride is equal to tile size, there will be no overlap. When stride is equal to half the tile size, there will be 50 percent overlap. | Long | 
| output_nofeature_tiles (Optional) | Specifies whether image chips that do not capture training samples will be exported. | Boolean | 
| metadata_format (Optional) | Specifies the format that will be used for the output metadata labels. If the input training sample data is a feature class layer, such as a building layer or a standard classification training sample file, use the KITTI Labels or PASCAL Visual Object Classes option (KITTI_rectangles or PASCAL_VOC_rectangles in Python). The output metadata is a .txt file or an .xml file containing the training sample data contained in the minimum bounding rectangle. The name of the metadata file matches the input source image name. If the input training sample data is a class map, use the Classified Tiles option (Classified_Tiles in Python) as the output metadata format. For the KITTI metadata format, 15 columns are created, but only 5 of them are used in the tool. The first column is the class value. The next 3 columns are skipped. Columns 5 through 8 define the minimum bounding rectangle, which is composed of four image coordinate locations: left, top, right, and bottom pixels. The minimum bounding rectangle encompasses the training chip used in the deep learning classifier. The remaining columns are not used. The following is an example of the PASCAL_VOC_rectangles option: For more information, see the Microsoft PASCAL Visual Object Classes (VOC) Challenge paper. | String | 
| start_index (Optional) | Legacy: This parameter has been deprecated. Use a value of 0 or # in Python. | Long | 
| class_value_field (Optional) | The field that contains the class values. If no field is specified, the system searches for a value or classvalue field. The field should be numeric, usually an integer. If the feature does not contain a class field, the system determines that all records belong to one class. | Field | 
| buffer_radius (Optional) | The radius of a buffer around each training sample that will be used to delineate a training sample area. This allows you to create circular polygon training samples from points. The linear unit of the in_class_data parameter value's spatial reference is used. | Double | 
| in_mask_polygons (Optional) | A polygon feature class that delineates the area where image chips will be created. Only image chips that fall completely within the polygons will be created. | Feature Layer | 
| rotation_angle (Optional) | The rotation angle that will be used to generate image chips. An image chip will first be generated with no rotation. It will then be rotated at the specified angle to create additional image chips. The image will be rotated and have a chip created, until it has been fully rotated. For example, if you specify a rotation angle of 45 degrees, the tool will create eight image chips. The eight image chips will be created at the following angles: 0, 45, 90, 135, 180, 225, 270, and 315. The default rotation angle is 0, which creates one default image chip. | Double | 
| reference_system (Optional) | Specifies the type of reference system that will be used to interpret the input image. The reference system specified must match the reference system used to train the deep learning model. | String | 
| processing_mode (Optional) | Specifies how all raster items in a mosaic dataset or an image service will be processed. This parameter is applied when the input raster is a mosaic dataset or an image service. | String | 
| blacken_around_feature (Optional) | Specifies whether the pixels around each object or feature in each image tile will be masked out. This parameter only applies when the metadata_format parameter is set to Labeled_Tiles and an input feature class or classified raster has been specified. | Boolean | 
| crop_mode (Optional) | Specifies whether the exported tiles will be cropped so that they are all the same size. This parameter only applies when the metadata_format parameter is set to either Labeled_Tiles or Imagenet, and an input feature class or classified raster has been specified. | String | 
| in_raster2 (Optional) | An additional input imagery source that will be used for image translation methods. This parameter is valid when the metadata_format parameter is set to Classified_Tiles, Export_Tiles, or CycleGAN. | Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Map Server; Map Server Layer; Internet Tiled Layer; Folder | 
| in_instance_data (Optional) | The training sample data collected that contains classes for instance segmentation. The input can also be a point feature class without a class value field or an integer raster without class information. This parameter is only valid when the metadata_format parameter is set to Panoptic_Segmentation. | Feature Class; Feature Layer; Raster Dataset; Raster Layer; Mosaic Layer; Image Service; Table; Folder | 
| instance_class_value_field (Optional) | The field that contains the class values for instance segmentation. If no field is specified, the tool will use a value or class value field if one is present. If the feature does not contain a class field, the tool will determine that all records belong to one class. This parameter is only valid when the metadata_format parameter is set to Panoptic_Segmentation. | Field | 
| min_polygon_overlap_ratio (Optional) | The minimum overlap percentage for a feature to be included in the training data. If the percentage overlap is less than the value specified, the feature will be excluded from the training chip, and will not be added to the label file. The percent value is expressed as a decimal. For example, to specify an overlap of 20 percent, use a value of 0.2. The default value is 0, which means that all features will be included. This parameter improves the performance of the tool and also improves inferencing. The speed is improved because fewer training chips are created. Inferencing is improved because the model is trained to detect only large patches of objects and to ignore small corners of features. This means fewer false positives will be detected, and fewer false positives will be removed by the Non Maximum Suppression tool. This parameter is enabled when the in_class_data parameter value is a feature class. | Double | 
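The KITTI column layout described for the metadata_format parameter can be read back with a few lines of Python. This is a minimal sketch rather than an official reader: the function `parse_kitti_line` and the sample label line are hypothetical, and the sketch assumes whitespace-separated columns with the class value in column 1 and the left, top, right, and bottom pixel coordinates in columns 5 through 8.

```python
def parse_kitti_line(line):
    """Extract the class value and bounding rectangle from one label line.

    Assumption: columns are separated by whitespace; column 1 is the
    class value and columns 5-8 are left, top, right, bottom in pixels.
    All other columns are ignored.
    """
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("expected at least 8 columns in a KITTI label line")
    left, top, right, bottom = (float(value) for value in fields[4:8])
    return {
        "class_value": fields[0],
        "left": left,
        "top": top,
        "right": right,
        "bottom": bottom,
    }


if __name__ == "__main__":
    # Hypothetical 15-column label line, for illustration only.
    sample = "1 0.0 0 0.0 10.00 20.00 110.00 95.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
    print(parse_kitti_line(sample))
```

The class value is kept as a string here so that the sketch does not assume a particular numeric type for it.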
Code sample
This example creates training samples for deep learning.
# Import system modules
import arcpy
from arcpy.sa import *
# Check out the ArcGIS Spatial Analyst extension license
arcpy.CheckOutExtension("Spatial")
# Export image chips and Labeled_Tiles metadata for the training features
ExportTrainingDataForDeepLearning("c:/test/image.tif", "c:/test/outfolder",
             "c:/test/training.shp", "TIFF", "256", "256", "128", "128",
             "ONLY_TILES_WITH_FEATURES", "Labeled_Tiles", 0, "Classvalue", 0,
             None, 0, "MAP_SPACE", "PROCESS_AS_MOSAICKED_IMAGE", "NO_BLACKEN",
             "FIXED_SIZE")
This example creates training samples for deep learning.
# Import system modules and check out the ArcGIS Spatial Analyst extension license
import arcpy
arcpy.CheckOutExtension("Spatial")
from arcpy.sa import *
# Set local variables
inRaster = "C:/test/InputRaster.tif"
out_folder = "c:/test/OutputFolder"
in_training = "c:/test/TrainingData.shp"
image_chip_format = "TIFF"
tile_size_x = "256"
tile_size_y = "256"
stride_x="128"
stride_y="128"
output_nofeature_tiles="ONLY_TILES_WITH_FEATURES"
metadata_format="Labeled_Tiles"
start_index = 0
classvalue_field = "Classvalue"
buffer_radius = 0
in_mask_polygons = "MaskPolygon"
rotation_angle = 0
reference_system = "MAP_SPACE"
processing_mode = "PROCESS_AS_MOSAICKED_IMAGE"
blacken_around_feature = "NO_BLACKEN"
crop_mode = "FIXED_SIZE"
# Execute 
ExportTrainingDataForDeepLearning(inRaster, out_folder, in_training,
             image_chip_format, tile_size_x, tile_size_y, stride_x,
             stride_y, output_nofeature_tiles, metadata_format, start_index,
             classvalue_field, buffer_radius, in_mask_polygons, rotation_angle,
             reference_system, processing_mode, blacken_around_feature, crop_mode)
Environments
Licensing information
- Basic: Requires Spatial Analyst or Image Analyst
- Standard: Requires Spatial Analyst or Image Analyst
- Advanced: Requires Spatial Analyst or Image Analyst