Prepare data—ArcGIS Pro

You can use data engineering tools to clean and prepare your data. A subset of geoprocessing tools is available in the Data Engineering view to help you prepare your data for use in a map or an analysis. These tools are grouped into the following categories:

Clean—Clean the data. For example, you can remove unnecessary fields. You can also modify the fields or fill missing values.
Construct—Create fields that are derived from existing fields or properties of the layer. For example, you can add and calculate a new field; standardize, transform, or reclassify an existing field; and add a field based on the input layer’s geometry.
Integrate—Integrate or add data from another data source to the input table or feature class. For example, you can join fields or add fields by enriching the data.
Format—Change the format of the fields or reorganize the fields in the table or feature class. For example, you can convert time fields, encode categorical fields, or reduce the dimensions of existing fields.

Note:

Some geoprocessing tools in the Data Engineering view are not available for a noneditable layer. In this case, make an editable copy of the layer and open a new Data Engineering view.

You can access these groups and tools in the Data Engineering view by doing one of the following:

Right-click a field in the fields panel to open its context menu.
Learn more about exploring fields
Right-click a field in the statistics panel to open its context menu.
Learn more about interacting with statistics
Click the tool on the Data Engineering ribbon.

Data Engineering ribbon

When the Data Engineering view is active, a contextual ribbon appears at the top of the application. The ribbon provides access to commands and tools for exploring and preparing data.


The Data group on the ribbon provides access to the fields view and attribute table of the layer associated with the active Data Engineering view. The Tools group offers four tool galleries: Clean, Construct, Integrate, and Format. Each tool gallery contains a subset of geoprocessing tools for the respective data engineering task. By default, the layer associated with the active Data Engineering view automatically populates the input features parameter of these tools. In the Spatial group, Display XY Data and Geocode Table convert nonspatial stand-alone tables to spatial data.

Data Engineering tools

The following tables describe all of the tools on the Data Engineering ribbon.

Note:

Some of the geoprocessing tools are not available for nonspatial data such as stand-alone tables.

Clean

The following tools are available in the Clean category:


Tool	Description
Delete Field	Deletes one or more fields from a table, feature class, feature layer, or raster dataset.
Alter Field	Renames fields and field aliases or alters field properties.
Project	Projects spatial data from one coordinate system to another.
Delete Rows	Deletes all or the selected subset of rows from the input.
Fill Missing Values	Replaces missing (null) values with estimated values based on spatial neighbors, space-time neighbors, time-series, or global statistic values.
Spatial Outlier Detection	Identifies global or local spatial outliers in point features.
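
To make the idea behind Fill Missing Values more concrete, the following minimal Python sketch replaces null values with a global statistic (here, the mean of the known values). It is only an illustration of the concept under that assumption, not the tool's implementation; the function name fill_missing_with_mean and the sample values are hypothetical, and the spatial, space-time, and time-series neighbor methods are not shown.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Return a copy of values with None entries replaced by the global mean.

    Hypothetical helper for illustration; None stands in for a null value.
    """
    known = [v for v in values if v is not None]
    if not known:
        # Nothing to estimate from, so leave the values unchanged.
        return list(values)
    fill = mean(known)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    # Sample field values with two nulls.
    print(fill_missing_with_mean([10.0, None, 14.0, None, 12.0]))
```

Because a global statistic ignores where each record is located, it is the simplest of the estimation approaches listed in the table.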

Construct

The following tools are available in the Construct category:


Tool	Description
Calculate Field	Calculates the values of a field for a feature class, feature layer, or raster.
Add Field	Adds a new field to a table or the table of a feature class or feature layer, as well as to rasters with attribute tables.
Calculate Rates	Calculates crude or smoothed rates. The global empirical Bayes rate method smooths the rates toward a global reference rate. The local empirical Bayes, locally weighted average, and locally weighted median rate methods use local neighbors to spatially smooth rates.
Calculate Geometry Attributes	Adds information to a feature's attribute fields representing the spatial or geometric characteristics and location of each feature, such as length or area and x-, y-, z-coordinates, and m-values.
Transform Field	Transforms continuous values in one or more fields by applying mathematical functions to each value and changing the shape of the distribution. The transformation methods in the tool include log, square root, Box-Cox, multiplicative inverse, square, exponential, and inverse Box-Cox.
Standardize Field	Standardizes values in fields by converting them to values that follow a specified scale. Standardization methods include z-score, minimum-maximum, absolute maximum, and robust standardization.
Dimension Reduction	Reduces the number of dimensions of a set of continuous variables by aggregating the highest possible amount of variance into fewer components using Principal Component Analysis (PCA) or Reduced-Rank Linear Discriminant Analysis (LDA).
Time Series Smoothing	Smooths time series data, which helps account for short-term fluctuations to expose long-term trends and cycles. The tool can smooth the numeric variable of one or more time series using centered, forward, and backward moving averages, as well as an adaptive method based on local linear regression.
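
Two of the standardization methods named for Standardize Field, z-score and minimum-maximum, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the use of the population standard deviation and a default 0 to 1 output range are assumptions rather than documented tool behavior.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale values to mean 0 and standard deviation 1.

    Assumption: the population standard deviation is used.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so every z-score is 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values to the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # No spread to rescale, so map everything to the lower bound.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


if __name__ == "__main__":
    sample = [2.0, 4.0, 6.0]
    print(z_score(sample))
    print(min_max(sample))
```

The z-score result is unbounded and centered on zero, while the minimum-maximum result is confined to a fixed range, which is why the two suit different downstream analyses.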

Integrate

The following tools are available in the Integrate category:


Tool	Description
Append	Appends to, or optionally updates, an existing target dataset with multiple input datasets. Input datasets can be feature classes, tables, shapefiles, rasters, or annotation or dimension feature classes.
Spatial Join	Joins attributes from one feature to another based on the spatial relationship. The target features and the joined attributes from the join features are written to the output feature class.
Join Field	Joins the contents of a table to another table based on a common attribute field. The input table is updated to contain the fields from the join table. You can select which fields from the join table will be added to the input table.
Near	Calculates distance and additional proximity information between the input features and the closest feature in another layer or feature class.
Summarize Within	Overlays a polygon layer with another layer to summarize the number of points, length of the lines, or area of the polygons within each polygon, and calculate attribute field statistics about the features within the polygons.
Summarize Nearby	Finds features that are within a specified distance of features in the input layer and calculates statistics for the nearby features.
Sample From Raster	Creates a table or a point feature class that shows the values of cells from a raster, or a set of rasters, for defined locations. The locations are defined by raster cells, points, polylines, or polygons.
Enrich	Enriches data by adding demographic and landscape facts about the people and places that surround or are inside data locations. The output is a duplicate of the input with additional attribute fields. This tool requires an ArcGIS Online organizational account or a locally installed Business Analyst dataset.
Apportion Polygon	Summarizes the attributes of an input polygon layer based on the spatial overlay of a target polygon layer and assigns the summarized attributes to the target polygons. The target polygons have summed numeric attributes that are derived from the input polygons that each target overlaps.
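
The attribute join described for Join Field can be illustrated with plain Python dictionaries. The sketch below is not the tool's implementation: the function name join_field, the sample rows, and the choices to keep the first matching join record and to write None when there is no match are all assumptions made for illustration.

```python
def join_field(input_rows, join_rows, key, fields):
    """Add the selected fields from join_rows to input_rows, matched on key.

    The input rows are updated in place, mirroring the description that the
    input table is updated. Assumptions: the first matching join row wins,
    and unmatched input rows receive None for the joined fields.
    """
    lookup = {}
    for row in join_rows:
        # setdefault keeps the first row seen for each key value.
        lookup.setdefault(row[key], row)
    for row in input_rows:
        match = lookup.get(row[key])
        for field in fields:
            row[field] = match[field] if match is not None else None
    return input_rows


if __name__ == "__main__":
    # Hypothetical sample tables sharing the "id" field.
    parcels = [{"id": 1, "zone": "A"}, {"id": 2, "zone": "B"}]
    owners = [{"id": 1, "owner": "Lee", "year": 2020}]
    print(join_field(parcels, owners, "id", ["owner"]))
```

Passing only "owner" in the field list reflects the option to select which fields from the join table are added to the input table.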

Format

The following tools are available in the Format category:


Tool	Description
Convert Temporal Field	Transfers temporal values stored in a field to another field. The tool can be used to convert between field types (text, numeric, or datetime fields) or to convert the values to a different format such as dd/MM/yy HH:mm:ss to yyyy-MM-dd.
Convert Time Zone	Converts time values recorded in a date field from one time zone to another time zone.
Pivot Table	Creates a table from the input table by reducing redundancy in records and flattening one-to-many relationships.
Transpose Fields	Switches data stored in fields or columns to rows in a new table or feature class.
Reclassify Field	Reclassifies values in a numerical or text field into classes based on bounds defined manually or using a reclassification method.
Encode Field	Converts categorical values (string, integer, or date) into multiple numerical fields, each representing a category. The encoded numerical fields can be used in most data science and statistical workflows including regression models.
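
Two of the Format operations lend themselves to a short Python sketch: reformatting a temporal value from dd/MM/yy HH:mm:ss to yyyy-MM-dd, as in the Convert Temporal Field example, and encoding a categorical field as one numeric field per category (commonly called one-hot encoding). The function names, the output field naming pattern, and the sample values are hypothetical, and the tools may behave differently in detail.

```python
from datetime import datetime


def convert_temporal(value, in_format="%d/%m/%y %H:%M:%S", out_format="%Y-%m-%d"):
    """Reformat a date-time string.

    The defaults are the Python equivalents of dd/MM/yy HH:mm:ss and yyyy-MM-dd.
    """
    return datetime.strptime(value, in_format).strftime(out_format)


def one_hot(values, prefix="cat"):
    """Return one 0/1 column per distinct category in values.

    Assumption: output fields are named prefix_category (hypothetical pattern).
    """
    categories = sorted(set(values))
    return {
        f"{prefix}_{c}": [1 if v == c else 0 for v in values]
        for c in categories
    }


if __name__ == "__main__":
    print(convert_temporal("25/12/23 08:30:00"))
    print(one_hot(["forest", "urban", "forest"], prefix="landuse"))
```

Each encoded column contains only 0 and 1, which is the numeric form that regression models and similar statistical workflows can consume directly.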

Note:

Most geoprocessing operations that modify the input data cannot be undone.
