Prepare data

You can use data engineering tools to clean and prepare your data. A subset of geoprocessing tools is available in the Data Engineering view to help you prepare your data for use in a map or an analysis. These tools are grouped into the following categories:

  • Clean—Clean the data. For example, you can remove unnecessary fields. You can also modify the fields or fill missing values.
  • Construct—Create fields that are derived from existing fields or properties of the layer. For example, you can add and calculate a new field; standardize, transform, or reclassify an existing field; and add a field based on the input layer’s geometry.
  • Integrate—Integrate or add data from another data source to the input table or feature class. For example, you can join fields or add fields by enriching the data.
  • Format—Change the format of the fields or reorganize the fields in the table or feature class. For example, you can convert time fields, encode categorical fields, or reduce the dimensions of existing fields.

Note:

Some geoprocessing tools in the Data Engineering view are not available for a noneditable layer. In this case, make an editable copy of the layer and open a new Data Engineering view.

You can access these groups and tools from the Data Engineering ribbon.

Data Engineering ribbon

When the Data Engineering view is active, a contextual ribbon appears at the top of the application. The ribbon provides access to commands and tools for exploring and preparing data.

The Data group on the ribbon provides access to the fields view and attribute table of the layer associated with the active Data Engineering view. The Tools group offers four tool galleries: Clean, Construct, Integrate, and Format. Each tool gallery contains a subset of geoprocessing tools for the respective data engineering task. By default, the layer associated with the active Data Engineering view populates the input features parameter of these tools. In the Spatial group, Display XY Data and Geocode Table convert nonspatial stand-alone tables to spatial data.

Data Engineering tools

The following tables describe all of the tools on the Data Engineering ribbon.

Note:

Some of the geoprocessing tools are not available for nonspatial data such as stand-alone tables.

Clean

The following tools are available in the Clean category:

| Tool | Description |
| --- | --- |
| Delete Field | Deletes one or more fields from a table, feature class, feature layer, or raster dataset. |
| Append | Appends multiple input datasets into an existing target dataset. Input datasets can be feature classes, tables, shapefiles, rasters, or annotation or dimension feature classes. |
| Alter Field | Renames fields and field aliases or alters field properties. |
| Project | Projects spatial data from one coordinate system to another. |
| Delete Rows | Deletes all or the selected subset of rows from the input. |
| Fill Missing Values | Replaces missing (null) values with estimated values based on spatial neighbors, space-time neighbors, time-series, or global statistic values. |
| Spatial Outlier Detection | Identifies global or local spatial outliers in point features. |
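
To make the global statistic option of Fill Missing Values concrete, the following plain-Python sketch replaces missing entries with the mean or median of the known values. It only illustrates the idea and is not the tool's implementation. The function name `fill_missing` and its `method` parameter are hypothetical, and the spatial, space-time, and time-series neighbor options are not covered.

```python
from statistics import mean, median


def fill_missing(values, method="mean"):
    """Return a copy of values with None replaced by a global statistic.

    Illustrative only: `fill_missing` and `method` are hypothetical names,
    not part of any geoprocessing tool.
    """
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("cannot estimate a fill value: every entry is missing")
    statistic = {"mean": mean, "median": median}[method]
    fill = statistic(known)
    # Keep known values untouched; substitute the statistic for nulls only.
    return [fill if v is None else v for v in values]


filled = fill_missing([10.0, None, 14.0, 12.0, None])
print(filled)  # [10.0, 12.0, 14.0, 12.0, 12.0]
```

The median variant is less sensitive to extreme values, which can matter when the field also contains outliers.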

Construct

The following tools are available in the Construct category:

| Tool | Description |
| --- | --- |
| Calculate Field | Calculates the values of a field for a feature class, feature layer, or raster. |
| Add Field | Adds a new field to a table or the table of a feature class or feature layer, as well as to rasters with attribute tables. |
| Calculate Geometry Attributes | Adds information to a feature's attribute fields representing the spatial or geometric characteristics and location of each feature, such as length or area and x-, y-, z-coordinates, and m-values. |
| Transform Field | Transforms continuous values in one or more fields by applying mathematical functions to each value and changing the shape of the distribution. The transformation methods in the tool include log, square root, Box-Cox, multiplicative inverse, square, exponential, and inverse Box-Cox. |
| Standardize Field | Standardizes values in fields by converting them to values that follow a specified scale. Standardization methods include z-score, minimum-maximum, absolute maximum, and robust standardization. |
| Dimension Reduction | Reduces the number of dimensions of a set of continuous variables by aggregating the highest possible amount of variance into fewer components using Principal Component Analysis (PCA) or Reduced-Rank Linear Discriminant Analysis (LDA). |
| Time Series Smoothing | Smooths time series data, which helps account for short-term fluctuations to expose long-term trends and cycles. The tool smooths the numeric variable of one or more time series using centered, forward, and backward moving averages, as well as an adaptive method based on local linear regression. |
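
As a rough illustration of three of the standardization methods named for Standardize Field, the sketch below applies z-score, minimum-maximum, and absolute maximum scaling to a plain list of numbers. It is not the tool's implementation. The function names are hypothetical, and the z-score version assumes the population standard deviation; the tool may define it differently.

```python
from statistics import mean, pstdev


def z_score(values):
    """Center on the mean and scale by the population standard deviation
    (an assumption made for this sketch)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; z-score is undefined")
    return [(v - mu) / sigma for v in values]


def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so they span [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; range is zero")
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def absolute_maximum(values):
    """Divide each value by the largest absolute value in the field."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        raise ValueError("all values are zero; nothing to scale by")
    return [v / peak for v in values]


print(min_max([2.0, 4.0, 6.0]))            # [0.0, 0.5, 1.0]
print(absolute_maximum([-4.0, 2.0, 1.0]))  # [-1.0, 0.5, 0.25]
```

Minimum-maximum scaling bounds the result to a chosen range, while z-scores are unbounded but comparable across fields with different units.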

Integrate

The following tools are available in the Integrate category:

| Tool | Description |
| --- | --- |
| Spatial Join | Joins attributes from one feature to another based on the spatial relationship. The target features and the joined attributes from the join features are written to the output feature class. |
| Join Field | Joins the contents of a table to another table based on a common attribute field. The input table is updated to contain the fields from the join table. You can select which fields from the join table will be added to the input table. |
| Near | Calculates distance and additional proximity information between the input features and the closest feature in another layer or feature class. |
| Summarize Within | Overlays a polygon layer with another layer to summarize the number of points, length of the lines, or area of the polygons within each polygon, and calculate attribute field statistics about the features within the polygons. |
| Summarize Nearby | Finds features that are within a specified distance of features in the input layer and calculates statistics for the nearby features. |
| Sample From Raster | Creates a table or a point feature class that shows the values of cells from a raster, or a set of rasters, for defined locations. The locations are defined by raster cells, points, polylines, or polygons. |
| Enrich | Enriches data by adding demographic and landscape facts about the people and places that surround or are inside data locations. The output is a duplicate of the input with additional attribute fields. This tool requires an ArcGIS Online organizational account or a locally installed Business Analyst dataset. |
| Apportion Polygon | Summarizes the attributes of an input polygon layer based on the spatial overlay of a target polygon layer and assigns the summarized attributes to the target polygons. The target polygons have summed numeric attributes that are derived from the input polygons that each target overlaps. |
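
The attribute join described for Join Field can be pictured with tables held as lists of dictionaries, as in the sketch below. It is a conceptual illustration rather than the tool's implementation. The function and parameter names are hypothetical, and two behaviors are assumptions made for the example: only the first matching join row is used, and unmatched input rows receive `None`.

```python
def join_field(input_rows, join_rows, in_key, join_key, fields):
    """Add the selected fields from join_rows to input_rows in place.

    Rows are matched where input_rows[in_key] == join_rows[join_key].
    """
    lookup = {}
    for row in join_rows:
        # Assumption: keep only the first join row seen for each key.
        lookup.setdefault(row[join_key], row)
    for row in input_rows:
        match = lookup.get(row[in_key])
        for field in fields:
            # Assumption: unmatched rows get a null (None) value.
            row[field] = match[field] if match is not None else None
    return input_rows


parcels = [
    {"parcel_id": 1, "area": 250.0},
    {"parcel_id": 2, "area": 90.5},
    {"parcel_id": 3, "area": 40.0},
]
owners = [
    {"pid": 1, "owner": "Ng", "year": 2019},
    {"pid": 2, "owner": "Diaz", "year": 2021},
]

# Only the "owner" field is transferred; "year" is left out on purpose.
join_field(parcels, owners, "parcel_id", "pid", ["owner"])
print(parcels[0])  # {'parcel_id': 1, 'area': 250.0, 'owner': 'Ng'}
```

As in the tool description, the input table itself is modified, so the sketch mutates `parcels` rather than returning a new table.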

Format

The following tools are available in the Format category:

| Tool | Description |
| --- | --- |
| Convert Time Field | Transfers date and time values stored in a field to another field. The tool can be used to convert between different field types (text, numeric, or date fields) or to convert the values to a different format such as dd/MM/yy HH:mm:ss to yyyy-MM-dd. |
| Convert Time Zone | Converts time values recorded in a date field from one time zone to another time zone. |
| Pivot Table | Creates a table from the input table by reducing redundancy in records and flattening one-to-many relationships. |
| Transpose Fields | Switches data stored in fields or columns to rows in a new table or feature class. |
| Reclassify Field | Reclassifies values in a numerical or text field into classes based on bounds defined manually or using a reclassification method. |
| Encode Field | Converts categorical values (string, integer, or date) into multiple numerical fields, each representing a category. The encoded numerical fields can be used in most data science and statistical workflows including regression models. |
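
Two of the Format operations can be sketched in plain Python to show what the output looks like: converting a text time value from dd/MM/yy HH:mm:ss to yyyy-MM-dd, and encoding a categorical field as one 0/1 field per category (one-hot encoding, which is one reading of the Encode Field description). Neither function reflects the tools' implementations. The function names and the `<field>_<category>` naming scheme are hypothetical.

```python
from datetime import datetime


def convert_time_text(value, in_format="%d/%m/%y %H:%M:%S", out_format="%Y-%m-%d"):
    """Parse a text time value in one format and write it in another.

    The defaults are the Python equivalents of dd/MM/yy HH:mm:ss and yyyy-MM-dd.
    """
    return datetime.strptime(value, in_format).strftime(out_format)


def one_hot_encode(rows, field):
    """Add one 0/1 field per distinct category of `field` to each row, in place.

    New fields are named "<field>_<category>" (a naming choice for this sketch).
    Returns the sorted list of categories found.
    """
    categories = sorted({row[field] for row in rows})
    for row in rows:
        for category in categories:
            row[f"{field}_{category}"] = 1 if row[field] == category else 0
    return categories


print(convert_time_text("05/03/21 14:30:00"))  # 2021-03-05

rows = [{"landuse": "res"}, {"landuse": "com"}, {"landuse": "res"}]
one_hot_encode(rows, "landuse")
print(rows[0])  # {'landuse': 'res', 'landuse_com': 0, 'landuse_res': 1}
```

Each encoded row has exactly one field set to 1, which is the form most regression and machine learning libraries expect for categorical inputs.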

Note:

Most geoprocessing operations that modify the input data cannot be undone.
