Classify Text Using Deep Learning (GeoAI)—ArcGIS Pro

Summary

Runs a trained text classification model on a text field in a feature class or table and updates each record with an assigned class or category label, along with a confidence value for each class.

Learn more about how Text Classification works

Usage

This tool requires that deep learning frameworks be installed. To set up your machine to use deep learning frameworks in ArcGIS Pro, see Install deep learning frameworks for ArcGIS.
This tool requires a model definition file containing trained model information. The model can be trained using the Train Text Classification Model tool. The Input Model Definition File parameter value can be an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk). The model files can be stored locally or hosted on ArcGIS Living Atlas.
This tool can run on a CPU or a GPU. However, deep learning is computationally intensive, and a GPU is recommended. To run this tool using a GPU, set the Processor Type environment to GPU. If you have more than one GPU, specify the GPU ID environment instead.
For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.
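The processor settings described above can also be applied from Python through the geoprocessing environment. The following is a minimal sketch, assuming the arcpy.env.processorType and arcpy.env.gpuId environment properties. The helper names processor_settings and apply_processor_settings are hypothetical and are not part of ArcPy. Setting both values when a GPU ID is given is a choice made for this sketch.

```python
def processor_settings(use_gpu, gpu_id=None):
    """Build the environment values to apply (hypothetical helper, not part of ArcPy)."""
    if not use_gpu:
        return {"processorType": "CPU"}
    settings = {"processorType": "GPU"}
    if gpu_id is not None:
        # Only relevant when the machine has more than one GPU.
        settings["gpuId"] = gpu_id
    return settings


def apply_processor_settings(settings):
    """Copy the values onto arcpy.env. Requires ArcGIS Pro."""
    import arcpy  # imported lazily so the helper above can be used without ArcGIS Pro

    arcpy.env.processorType = settings["processorType"]
    if "gpuId" in settings:
        arcpy.env.gpuId = settings["gpuId"]
```

For example, apply_processor_settings(processor_settings(True, gpu_id=0)) could be called before running the tool on a machine with several GPUs.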

Parameters

Label	Explanation	Data Type
Input Table	The input point, line, or polygon feature class, or table, containing the text that will be classified and labelled.	Feature Layer; Table View
Text Field	A text field in the input feature class or table that contains the text that will be classified.	Field
Input Model Definition File	The trained model that will be used for classification. The model definition file can be an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk) that is stored locally or hosted on ArcGIS Living Atlas (.dlpk_remote).	File
Class Label Field (Optional)	The name of the field that will contain the class or category label assigned by the model. The default field name is ClassLabel.	String
Model Arguments (Optional)	Additional arguments, such as sequence_length or confidence_threshold, that will be used to adjust the model's output. The names of the arguments will be populated by the tool. Note: The model argument confidence_threshold is only applicable for multilabel text classification.	Value Table
Get explanation for every prediction (Optional)	Specifies whether SHAP explanations will be generated. The time to generate an explanation will depend on the length of the input. Checked—A SHAP explanation will be generated for each row in the output table. Unchecked—No SHAP explanation will be generated. This is the default.	Boolean
Batch Size (Optional)	The number of samples that will be processed at one time. The default value is 4. Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used. If an out of memory error occurs, use a smaller batch size.	Double

Derived Output

Label	Explanation	Data Type
Updated Table	The output point, line, or polygon feature class, or table, containing the classified and labelled text derived from the input data along with the confidence value for each class.	Table View; Feature Layer

arcpy.geoai.ClassifyTextUsingDeepLearning(in_table, text_field, in_model_definition_file, {class_label_field}, {model_arguments}, {explain}, {batch_size})

Name	Explanation	Data Type
in_table	The input point, line, or polygon feature class, or table, containing the text that will be classified and labelled.	Feature Layer; Table View
text_field	A text field in the input feature class or table that contains the text that will be classified.	Field
in_model_definition_file	The trained model that will be used for classification. The model definition file can be an Esri model definition JSON file (.emd) or a deep learning model package (.dlpk) that is stored locally or hosted on ArcGIS Living Atlas (.dlpk_remote).	File
class_label_field (Optional)	The name of the field that will contain the class or category label assigned by the model. The default field name is ClassLabel.	String
model_arguments [model_arguments,...] (Optional)	Additional arguments, such as sequence_length or confidence_threshold, that will be used to adjust the model's output. The names of the arguments will be populated by the tool. Note: The model argument confidence_threshold is only applicable for multilabel text classification.	Value Table
explain (Optional)	Specifies whether SHAP explanations will be generated. The time to generate an explanation will depend on the length of the input. ENABLE_SHAP—A SHAP explanation will be generated for each row in the output table. DISABLE_SHAP—No SHAP explanation will be generated. This is the default.	Boolean
batch_size (Optional)	The number of samples that will be processed at one time. The default value is 4. Increasing the batch size can improve tool performance; however, as the batch size increases, more memory is used. If an out of memory error occurs, use a smaller batch size.	Double
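The optional parameters in the table above can be assembled as keyword arguments before the tool is called. This is a minimal sketch, not a definitive pattern: the helper names build_classify_arguments and classify_text are hypothetical, the sequence_length value of 128 is purely illustrative, and passing the model arguments value table as a list of [name, value] pairs is an assumption based on how ArcPy generally accepts value tables.

```python
def build_classify_arguments(in_table, text_field, model_file,
                             class_label_field="ClassLabel",
                             model_arguments=None, explain=False, batch_size=4):
    """Assemble keyword arguments for the tool (hypothetical helper, not part of ArcPy)."""
    kwargs = {
        "in_table": in_table,
        "text_field": text_field,
        "in_model_definition_file": model_file,
        "class_label_field": class_label_field,
        # The tool expects the keyword strings listed in the parameter table.
        "explain": "ENABLE_SHAP" if explain else "DISABLE_SHAP",
        "batch_size": batch_size,
    }
    if model_arguments:
        # Assumption: the value table is passed as [name, value] pairs.
        kwargs["model_arguments"] = [
            [name, str(value)] for name, value in model_arguments.items()
        ]
    return kwargs


def classify_text(**kwargs):
    """Run the tool with the assembled arguments. Requires ArcGIS Pro."""
    import arcpy  # imported lazily so the helper above can be used without ArcGIS Pro

    return arcpy.geoai.ClassifyTextUsingDeepLearning(**kwargs)


# Illustrative values only; the table, field, and model path mirror the code sample.
example = build_classify_arguments(
    "TextClassifierData", "Address", "c:\\classifydata\\TextClassifier.emd",
    model_arguments={"sequence_length": 128}, explain=True, batch_size=8)
```

Calling classify_text(**example) would then run the tool with SHAP explanations enabled and a batch size of 8. If an out of memory error occurs, a smaller batch_size value can be passed instead.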

Derived Output

Name	Explanation	Data Type
updated_table	The output point, line, or polygon feature class, or table, containing the classified and labelled text derived from the input data along with the confidence value for each class.	Table View; Feature Layer

Code sample

ClassifyTextUsingDeepLearning (Python window)

The following Python window script demonstrates how to use the ClassifyTextUsingDeepLearning function.

# Name: ClassifyText.py
# Description: Classify text into multiple classes
#
# Requirements: ArcGIS Pro Advanced license

# Import system modules
import arcpy

arcpy.env.workspace = "C:/textanalysisexamples/data"

# Set local variables
in_table = "TextClassifierData"
pretrained_model_path_emd = "c:\\classifydata\\TextClassifier.emd"

# Run Classify Text Using Deep Learning
arcpy.geoai.ClassifyTextUsingDeepLearning(in_table, "Address", pretrained_model_path_emd)
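After the tool has run, the assigned labels can be inspected in the updated table. The sketch below assumes the default ClassLabel field name was kept and reads it with arcpy.da.SearchCursor. The helper names count_labels and read_labels are hypothetical. The confidence fields are not read here because their names are not given in this topic.

```python
from collections import Counter


def count_labels(labels):
    """Count how many records received each class label, skipping empty values."""
    return Counter(label for label in labels if label is not None)


def read_labels(table, label_field="ClassLabel"):
    """Yield the class label of each record. Requires ArcGIS Pro."""
    import arcpy  # imported lazily so count_labels can be used without ArcGIS Pro

    with arcpy.da.SearchCursor(table, [label_field]) as cursor:
        for (label,) in cursor:
            yield label
```

For example, count_labels(read_labels("TextClassifierData")) would return a per-class tally of the records classified by the sample above.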

Environments

Processor Type, GPU ID

Licensing information

Basic: No
Standard: No
Advanced: Yes
