Label objects for deep learning—ArcGIS Pro

Available with Image Analyst license.

Available with Spatial Analyst license.

All supervised deep learning tasks depend on labeled datasets, which means humans must apply their knowledge to train the neural network on what it is working to identify. The labeled objects will be used by the neural network to train a model that can be used to perform inferencing on data.

Image annotation, or labeling, is vital for deep learning tasks such as computer vision and learning. A large amount of labeled data is required to train a good deep learning model. When the right training data is available, deep learning systems can be highly accurate in feature extraction, pattern recognition, and complex problem solving. The Label Objects for Deep Learning pane can be used to quickly and accurately label data.

The Label Objects for Deep Learning button is found in the Classification Tools drop-down menu, in the Image Classification group on the Imagery tab. The pane is divided into two parts. The top part of the pane is used for managing classes, and the bottom part of the pane is used for managing the collection of the samples and for exporting the training data for the deep learning frameworks.

Create classes and label objects

The top portion of the pane allows you to manage object classes and manually create the objects used for training the deep learning model. There are many tools available to help you create labeled objects.


Tool	Function
Rectangle	Create a labeled object by drawing a rectangle around a feature or object in the raster.
Polygon	Create a labeled object by drawing a polygon around a feature or object in the raster.
Circle	Create a labeled object by drawing a circle around a feature or object in the raster.
Freehand	Create a labeled object by drawing a freehand shape around a feature or object in the raster.
Segment Picker	Create a feature by selecting a segment from a segmented layer. This option is only available if there is a segmented layer in the Contents pane. Activate the Segment Picker by highlighting the segmented layer in the Contents pane, then select the layer from the Segment Picker drop-down list.
Label Image	Assign the selected class to the current image. This is only available in Image Collection mode.
	Select and edit a labeled object.
	Create a classification schema.
	Select a classification schema option: browse to an existing schema, generate a new schema from an existing training sample feature class, generate a new schema from an existing classified raster, or use the default 2011 National Land Cover Database schema.
	Save changes to the schema.
	Save a new copy of the schema.
	Add a class category to the schema. Select the name of the schema first to create a parent class at the highest level. Select the name of an existing class to create a subclass.
	Remove the selected class or subclass category from the schema.

1. Click one of the sketch tools, such as Rectangle, Polygon, Circle, or Freehand, to begin collecting object samples.
2. Using a sketch tool, delineate the image feature representing the object on the map.
   a. If you are creating a feature without a class specified, the Define Class dialog box appears. For more information about this dialog box, see the Define Class section.
3. Continue to create and label objects as specified in the steps above.
4. You can use the Labeled Objects tab (at the bottom of the pane) to delete and organize your labeled object samples.
5. Once you are satisfied with all your labeled objects, save your samples by clicking the Save button on the Labeled Objects tab.

Now that you have manually labeled a representative sample of objects, these can be used to export your training data.

Define Class

The Define Class dialog box allows you to create a new class or define an existing class. If you choose Use Existing Class, select the appropriate Class Name option for that object. If you choose Add New Class, you can optionally edit the information and click OK to create the new class.

Label image collections

If you have an image collection, or want to label individual images within a mosaic dataset, use the Image Collection tab. For more information about image collections, see Mosaic datasets.

Using the mosaic layer, you can label each of the images. Using the Image Collection tab, you can access the list of images in the drop-down list. The selected image will draw in the map. You can then label the image with the appropriate class. Use the arrow buttons to choose the next image you want to view and label.

When your image is in an Image Coordinate System (ICS), the image may be in an unusual orientation, especially when dealing with oblique or perspective imagery. To view your image in pixel space, check the Label in pixel space check box. This will draw the image in an orientation more conducive to intuitive image interpretation.

Label the entire image

When you don't want to draw a boundary around an object, you can use the Label Image button to label the entire image with the selected class, irrespective of the spatial aspect of the object.

Labeled Objects

The Labeled Objects tab is located in the bottom section of the pane and manages the training samples you have collected for each class. Collect representative sites, or training samples, for each class in the image. A training sample has location information (polygon) and an associated class. The image classification algorithm uses the training samples, saved as a feature class, to identify the land cover classes in the entire image.

You can view and manage training samples by adding, grouping, or removing them. When you select a training sample, it is selected on the map. Double-click a training sample in the table to zoom to it on the map.


Tool	Function
	Open an existing training samples feature class.
	Save edits made to the current labeled objects feature class.
	Save the current labeled objects as a new feature class.
	Delete the selected labeled objects.
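
As described in this section, a labeled object pairs a polygon with a class. The following is a minimal sketch, not the pane's internal data model, of how such samples could be represented and counted per class to check that every class is represented. The `LabeledObject` type, its field names, and the sample coordinates are hypothetical and used for illustration only.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledObject:
    """Hypothetical stand-in for one training sample: a class plus a polygon."""
    class_name: str
    polygon: tuple  # ring of (x, y) vertices outlining the object


def samples_per_class(samples):
    """Count how many labeled objects were collected for each class."""
    return Counter(sample.class_name for sample in samples)


# Made-up samples for illustration only.
samples = [
    LabeledObject("building", ((0, 0), (0, 10), (10, 10), (10, 0))),
    LabeledObject("building", ((20, 0), (20, 5), (25, 5), (25, 0))),
    LabeledObject("tree", ((40, 40), (40, 44), (44, 44), (44, 40))),
]

counts = samples_per_class(samples)
for class_name, count in counts.most_common():
    print(f"{class_name}: {count}")
```

A class with a count far below the others may need more samples before the training data is exported.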

Export Training Data

Once samples have been collected, you can export them into training data by clicking the Export Training Data tab. The training data can then be used in a deep learning model. Once the parameters have been filled in, click Run to create the training data.


Parameter	Description
Output Folder	Choose the output folder where the training data will be saved.
Mask Polygon Features	A polygon feature class that delineates the area where image chips will be created. Only image chips that fall completely within the polygons will be created.
Image Format	Specifies the raster format for the image chip outputs: TIFF (the default), MRF (Meta Raster Format), PNG, or JPEG. The PNG and JPEG formats support up to three bands.
Tile Size X	The size of the image chips in the x dimension.
Tile Size Y	The size of the image chips in the y dimension.
Stride X	The distance to move in the x direction when creating the next image chips. When the stride is equal to the tile size, there is no overlap. When the stride is equal to half the tile size, there is a 50 percent overlap.
Stride Y	The distance to move in the y direction when creating the next image chips. When the stride is equal to the tile size, there is no overlap. When the stride is equal to half the tile size, there is a 50 percent overlap.
Rotation Angle	The rotation angle that will be used to generate additional image chips. An image chip will be generated with a rotation angle of 0, which means no rotation. It will then be rotated at the specified angle to create an additional image chip. The same training samples will be captured at multiple angles in multiple image chips for data augmentation. The default rotation angle is 0.
Output No Feature Tiles	Specifies whether image chips that do not capture training samples will be exported. Unchecked: Only image chips that capture training samples will be exported. This is the default. Checked: All image chips, including those that do not capture training samples, will be exported.
Metadata Format	Specifies the format that will be used for the output metadata labels. If the input training sample data is a feature class layer, such as a building footprint layer or a standard classification training sample file, use the KITTI Labels or PASCAL Visual Object Classes option (KITTI_rectangles or PASCAL_VOC_rectangles in Python). The output metadata is a .txt file or an .xml file containing the training sample data within the minimum bounding rectangle. The name of the metadata file matches the input source image name. If the input training sample data is a class map, use the Classified Tiles option (Classified_Tiles in Python) as the output metadata format.
	KITTI Labels: The metadata follows the same format as the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) Object Detection Evaluation dataset. The KITTI dataset is a vision benchmark suite. The label files are plain text files. All values, both numerical and strings, are separated by spaces, and each row corresponds to one object.
	PASCAL Visual Object Classes: The metadata follows the same format as the Pattern Analysis, Statistical Modeling and Computational Learning Visual Object Classes (PASCAL VOC) dataset. The PASCAL VOC dataset is a standardized image dataset for object class recognition. The label files are XML files and contain information about image name, class value, and bounding boxes. This is the default.
	Classified Tiles: The output will be one classified image chip per input image chip. No other metadata for each image chip is used. Only the statistics output has more information on the classes, such as class names, class values, and output statistics.
	RCNN Masks: The output will be image chips that have a mask on the areas where the sample exists. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It is based on a Feature Pyramid Network (FPN) and a ResNet101 backbone in the deep learning framework model.
	Labeled Tiles: Each output tile will be labeled with a specific class. If you choose this metadata format, you can additionally refine the Blacken Around Feature and Crop Mode parameters.
	Multi-labeled Tiles: Each output tile will be labeled with one or more classes. For example, a tile may be labeled agriculture and also cloudy. This format is used for object classification.
	Export Tiles: The output will be image chips with no label. This format is used for image translation techniques, such as Pix2Pix and Super Resolution.
	CycleGAN: The output will be image chips with no label. This format is used for the CycleGAN image translation technique, which is used to train on images that do not overlap.
	Imagenet: Each output tile will be labeled with a specific class. This format is used for object classification; however, it can also be used for object tracking when the Deep Sort model type is used during training.
	For the KITTI metadata format, 15 columns are created, but only 5 of them are used in the tool. The first column is the class value. The next 3 columns are skipped. Columns 5 through 8 define the minimum bounding rectangle, which consists of four image coordinate locations: left, top, right, and bottom pixels. The minimum bounding rectangle encompasses the training chip used in the deep learning classifier. The remaining columns are not used.
Blacken Around Feature	Specifies whether the pixels around each object or feature in each image tile will be masked out. Unchecked: Pixels surrounding objects or features will not be masked out. This is the default. Checked: Pixels surrounding objects or features will be masked out. This parameter only applies when the Metadata Format parameter is set to Labeled Tiles and an input feature class or classified raster has been specified.
Crop Mode	Specifies whether the exported tiles will be cropped so that they are all the same size. Fixed size: Exported tiles will be the same size and will center on the feature. This is the default. Bounding box: Exported tiles will be cropped so that the bounding geometry surrounds only the feature in the tile. This parameter only applies when the Metadata Format parameter is set to Labeled Tiles or Imagenet and an input feature class or classified raster has been specified.
Reference System	Specifies the type of reference system that will be used to interpret the input image. The reference system specified must match the reference system used to train the deep learning model. Map space: The input image is in a map-based coordinate system. This is the default. Pixel space: The input image is in image space (rows and columns), with no rotation and no distortion.
Additional Input Raster	An additional input imagery source for image translation methods. This parameter is valid when the Metadata Format parameter is set to Classified Tiles, Export Tiles, or CycleGAN.
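
The relationship between the tile size and stride parameters in the table above can be illustrated with a little arithmetic. The sketch below is illustrative only and is not the tool's implementation: the function names are hypothetical, and it assumes that only tiles fitting entirely inside the image are produced, which this topic does not specify.

```python
def overlap_fraction(tile_size, stride):
    """Fraction of a tile shared with the next tile along one axis."""
    if tile_size <= 0 or stride <= 0:
        raise ValueError("tile_size and stride must be positive")
    # A stride at or beyond the tile size leaves no shared pixels.
    return max(0.0, (tile_size - stride) / tile_size)


def tile_origins(image_size, tile_size, stride):
    """Start offsets, in pixels, of tiles along one axis.

    Assumption: a tile is kept only if it fits entirely inside the image.
    """
    if tile_size <= 0 or stride <= 0:
        raise ValueError("tile_size and stride must be positive")
    return list(range(0, image_size - tile_size + 1, stride))


print(overlap_fraction(256, 256))    # stride equal to tile size: 0.0
print(overlap_fraction(256, 128))    # stride equal to half the tile size: 0.5
print(tile_origins(1024, 256, 128))  # [0, 128, 256, 384, 512, 640, 768]
```

Under these assumptions, halving the stride roughly doubles the number of chips along each axis, so a smaller stride increases the volume of training data at the cost of more redundant pixels.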

The exported training data can now be used in a deep learning model.
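
If the KITTI Labels metadata format was chosen, the label files can be read with a few lines of code. The sketch below follows the column layout described for the Metadata Format parameter (15 space-separated columns, class value first, bounding rectangle in columns 5 through 8). The `KittiBox` type, the function names, and the sample row are hypothetical, and the sketch assumes class values contain no spaces.

```python
from typing import NamedTuple


class KittiBox(NamedTuple):
    """The five values the tool uses from one KITTI label row."""
    class_value: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_row(row):
    """Parse one space-separated KITTI label row into a KittiBox."""
    fields = row.split()
    if len(fields) != 15:
        raise ValueError(f"expected 15 columns, got {len(fields)}")
    # Column 1 is the class value, columns 2-4 are skipped, and
    # columns 5-8 hold the left, top, right, and bottom pixel coordinates.
    left, top, right, bottom = (float(value) for value in fields[4:8])
    return KittiBox(fields[0], left, top, right, bottom)


def parse_kitti_labels(text):
    """Parse every non-empty row of a label file; each row is one object."""
    return [parse_kitti_row(line) for line in text.splitlines() if line.strip()]


# Made-up row for illustration only; real files are written by the export step.
sample = "1 0.0 0 0.0 10.0 20.0 110.0 220.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
box = parse_kitti_row(sample)
print(box.class_value, box.right - box.left, box.bottom - box.top)
```

The class value is kept as a string because the label files may contain both numerical and string values.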
