Introduction to deep learning

Available with Image Analyst license.

Machine learning tools have been a core component of spatial analysis in GIS for decades. You have been able to use machine learning in ArcGIS to perform image classification, enrich data with clustering, or model spatial relationships. Machine learning is a branch of artificial intelligence in which structured data is processed with an algorithm to solve a problem. Traditional structured data requires a person to label the data, such as pictures of cats and dogs, so that the algorithm can learn the features specific to each animal type and use them to identify these animals in other pictures.

Deep learning is a subset of machine learning that uses several layers of algorithms in the form of neural networks. Input data is analyzed through different layers of the network, with each layer defining specific features and patterns in the data. For example, if you wanted to identify features such as buildings and roads, the deep learning model would be trained with images of different buildings and roads, processing the images through layers within the neural network, then finding the identifiers required to classify a building or road.
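
The layered processing described above can be sketched in a few lines of Python. This is a minimal illustration only, not how ArcGIS Pro implements its models: the function names, the weights, and the three input values are all hypothetical, and a real network learns its weights from training images rather than using hand-picked numbers.

```python
def dense_layer(inputs, weights, biases):
    """Apply one fully connected layer followed by a ReLU activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU keeps only positive responses
    return outputs


def forward(inputs, layers):
    """Pass the input through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical hand-picked weights: 3 inputs -> 2 hidden neurons -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.2]),
]

pixel_values = [0.2, 0.6, 0.9]  # hypothetical normalized band values
output = forward(pixel_values, layers)
print(output)
```

Each layer transforms the output of the previous one, which is the sense in which later layers respond to progressively more specific patterns.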

Esri has developed tools and workflows to utilize the latest innovations in deep learning to answer some of the challenging questions in GIS and remote sensing applications. Computer vision, or the ability of computers to gain understanding from digital images or videos, is an area that has been shifting from the traditional machine learning algorithms to deep learning methods. Before applying deep learning to imagery in ArcGIS Pro, it is important to understand the different applications of deep learning for computer vision.

Applications of deep learning for computer vision

There are many computer vision tasks that can be accomplished with deep learning neural networks. Esri has developed tools that allow you to perform image classification, object detection, semantic segmentation, and instance segmentation. All of these computer vision tasks are described below, each with a remote sensing example and a more general computer vision example.

Image classification in computer vision

Image classification involves assigning a label or class to an entire digital image. For example, the drone image on the left above might be labeled crowd, and the digital photo on the right would be labeled cat. This type of classification is also known as object classification or image recognition, and it can be used in GIS to categorize features in an image. In ArcGIS Pro, this process can be accomplished using the Classify Objects Using Deep Learning tool.
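
As a rough sketch of the final step of image classification, the following Python snippet turns one score per class into a single label for the whole image. It assumes a model has already produced the raw scores; the class names, the scores, and the function names are hypothetical, and this is not the internal logic of the Classify Objects Using Deep Learning tool.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract peak for stability
    total = sum(exps)
    return [e / total for e in exps]


def classify_image(scores, class_names):
    """Return the most probable class name and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities[best]


class_names = ["cat", "dog", "crowd"]
raw_scores = [2.1, 0.3, -1.0]  # hypothetical model output for one image
label, confidence = classify_image(raw_scores, class_names)
print(label, round(confidence, 3))
```

The key point is that the result is one label for the entire image, with no information about where in the image the feature appears.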

Object detection in computer vision

Object detection is the process of locating features within an image. For example, in the remote sensing image above, the neural network found the location of an airplane. In a more general computer vision use case, a model may be able to detect the location of different animals. This process typically involves drawing a bounding box around the features of interest, and it can be used in GIS to locate specific features in satellite, aerial, or drone imagery and to plot those features on a map. In ArcGIS Pro, you can use the Detect Objects Using Deep Learning tool to detect objects in an image.
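
To illustrate how a detected bounding box can be placed on a map, the sketch below converts a box from pixel coordinates to map coordinates. It assumes a north-up raster with square cells whose origin is the upper-left corner; the `Detection` class, the `bbox_to_map` function, and all coordinate values are hypothetical and are not part of the Detect Objects Using Deep Learning tool.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One detected object with its bounding box in pixel coordinates."""
    label: str
    confidence: float
    col_min: int
    row_min: int
    col_max: int
    row_max: int


def bbox_to_map(det, origin_x, origin_y, cell_size):
    """Return (xmin, ymin, xmax, ymax) in map units.

    Assumes a north-up raster: x grows with the column index,
    and y decreases as the row index grows.
    """
    xmin = origin_x + det.col_min * cell_size
    xmax = origin_x + det.col_max * cell_size
    ymax = origin_y - det.row_min * cell_size
    ymin = origin_y - det.row_max * cell_size
    return (xmin, ymin, xmax, ymax)


# Hypothetical detection and raster georeferencing values.
det = Detection("airplane", 0.93, col_min=120, row_min=40, col_max=180, row_max=90)
extent = bbox_to_map(det, origin_x=500000.0, origin_y=4100000.0, cell_size=0.5)
print(det.label, extent)
```

Unlike image classification, each result carries a location as well as a label, which is what allows it to be stored as a map feature.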

Semantic segmentation in computer vision

Semantic segmentation occurs when each pixel in an image is classified as belonging to a class. For example, in the image on the left above, road pixels are classified separately from nonroad pixels. On the right, pixels that make up a cat in a photo are classified as cat, while the other pixels in the image belong to other classes. In GIS, this is often referred to as pixel classification, image segmentation, or image classification, and it is often used to create land-use classification maps. In ArcGIS Pro, you can use the Classify Pixels Using Deep Learning tool for semantic segmentation.
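
The per-pixel nature of semantic segmentation can be sketched as follows. The snippet assumes a model has already produced one score per class for every pixel and assigns each pixel the highest-scoring class. The tiny 2 x 3 score grids and the `classify_pixels` function are hypothetical illustrations, not the behavior of the Classify Pixels Using Deep Learning tool.

```python
def classify_pixels(score_maps):
    """Assign each pixel the class with the highest score.

    score_maps maps a class name to a 2D list of per-pixel scores;
    all grids are assumed to have the same dimensions.
    """
    class_names = list(score_maps)
    first = score_maps[class_names[0]]
    rows, cols = len(first), len(first[0])
    return [
        [max(class_names, key=lambda name: score_maps[name][r][c])
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical per-pixel scores for a 2 x 3 image.
score_maps = {
    "road": [[0.9, 0.8, 0.1], [0.2, 0.7, 0.1]],
    "nonroad": [[0.1, 0.2, 0.9], [0.8, 0.3, 0.9]],
}
class_map = classify_pixels(score_maps)
for row in class_map:
    print(row)
```

The output has the same dimensions as the input image, which is why the result is typically stored as a classified raster.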

Instance segmentation in computer vision

Instance segmentation is a more precise object detection method in which the boundary of each object instance is drawn. This type of deep learning application is also known as object segmentation.
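
The difference between a class mask and per-instance output can be illustrated with a small sketch. The snippet below takes a binary mask (1 = building, 0 = background) and gives each connected group of pixels its own ID. This is only a stand-in to show what instance-level output looks like: an actual instance segmentation model predicts a mask for each object directly and can separate objects that touch, which connected-component labeling cannot. The mask and the `label_instances` function are hypothetical.

```python
from collections import deque


def label_instances(mask):
    """Give each 4-connected group of 1-pixels a unique positive ID."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or labels[r][c] != 0:
                continue  # background, or already assigned to an instance
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed pixel
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and mask[nr][nc] == 1 and labels[nr][nc] == 0:
                        labels[nr][nc] = next_id
                        queue.append((nr, nc))
    return labels, next_id


# Hypothetical binary building mask with two separate buildings.
building_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
instance_map, count = label_instances(building_mask)
print(count)
for row in instance_map:
    print(row)
```

In the semantic segmentation result both buildings share one class value, whereas here each building has its own ID and therefore its own boundary.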

Deep learning in ArcGIS Pro

To find out more about how to perform deep learning in ArcGIS Pro, see Deep Learning in ArcGIS Pro.

For a list of the geoprocessing tools available for deep learning, see An Overview of the Deep Learning Toolset.
