Data loading workspace concepts

The Create Data Loading Workspace tool creates a Data Loading workspace, a Data Reference workbook, and Mapping workbooks.

Data loading workspace

The Data Loading workspace contains folders, Microsoft Excel workbooks, and other items to assist in mapping data from a source to a target schema. The following describes the folder structure of a Data Loading workspace.

  • The Data Mapping folder contains the following folders, depending on the source data:
    • Points—Contains workbooks for each of the source point datasets that were mapped.
    • Polygons—Contains workbooks for each of the source polygon datasets that were mapped.
    • Polylines—Contains workbooks for each of the source polyline datasets that were mapped.
    • Table—Contains workbooks for each of the source table datasets that were mapped.
    • GlobalLookup—Contains the GlobalLookup.xlsx workbook that can be used for creating lookup tables to be used across all Mapping workbooks.
  • The Scripts folder contains the following script files you can use to process data using custom functions. These functions can be used in the Expression column of Mapping workbooks.
    • base.py—Use the basic functions from this file in the Mapping workbooks.
    • shape_operations.py—Use the shape manipulation functions from this file in the Mapping workbooks.
    • user.py—Use this file for custom functions.
  • The Domains folder contains workbooks with schema information for all domains in the source and target workspace.
  • The DataReference.xlsx file contains paths to the source, target, and Mapping workbooks, as well as the data loading settings.

Data Reference workbook

The Data Reference workbook includes paths to the source, target, and Mapping workbooks. If the location of any of these items changes, you can manually update the paths. Additional columns control how data can be loaded from source to target, including certain geoprocessing environment settings.

The following are the column definitions:

  • Source—The source dataset location.
  • SourceDefinitionQuery—Write an SQL query to select a subset of a source dataset to be mapped to the target dataset. For example, use Type = 0 to only select records where the attribute type equals 0 to load into the target dataset.
  • Target—The target dataset location.
  • TargetDeleteQuery—Write an SQL query to selectively delete data from the target dataset before loading. For example, 1=1 removes all existing data in the target dataset.
  • MappingWorkbook—The link to the location of the Data Mapping workbook in the Data Loading workspace.
  • Enabled—A True or False value specifying whether the Mapping workbook will be processed or skipped during the data loading process. The default is True.
  • MaintainAttachments—A True or False value specifying whether attachments will be maintained during the data loading process. The default is True.
  • PreserveGlobalIds— A True or False value specifying whether global IDs will be preserved during the data loading process. The default is False. Global IDs cannot be preserved when loading to a service.
  • GeographicTransformations—Define valid geographic transformations to be used when the data is loaded.

Data mapping workbooks

The Data Mapping workbooks are in the Data Mapping folder and are organized by geometry type. Each Data Mapping workbook corresponds to one of the source and target pairs defined in the Data Reference workbook. Use these workbooks to define how data will be mapped from the source to the target using the options described below.

Mapping worksheet

Use this worksheet to define field mapping. The worksheet is highlighted in green in the workbook.

The following are the column definitions:

  • TargetField—The fields from the target dataset.
  • FieldType—The field type of each field in the target dataset.
  • Expression—Directly map fields by selecting a field from the drop-down menu, or write a Python expression. For example, this can reference an external Python function or a constant value such as a number or string.
  • LookupSheet—Specify a Lookup worksheet. This must be used with the LookupKeys and LookupValue columns.
  • LookupKeys—Specify one or more columns from the Lookup worksheet to use as keys. To use multiple columns, separate them with a comma.
  • LookupValue—Specify a column from the Lookup worksheet to be used as a target value.
  • LookupDefault—When using a Lookup worksheet, this field can be used to define a default value. Null values and values that are not matched in the Lookup worksheet will be assigned this value. This column is optional and can be left blank if no default value is used.

Lookup worksheet

Use Lookup worksheets to define the source-to-target value mapping. They are used in the mapping worksheet as lookup keys and lookup values. A sheet is automatically created for all coded value domains in the source data, with columns containing domain codes and descriptions. Additional columns that represent fields in the source data can be added manually to support a multiple value lookup. These worksheets are highlighted in blue in the workbook.

If the Create Data Loading Workspace tool's Predictive Field Matching Options parameter is set to Domain Coded Value Description Similarity, a Lookup worksheet will be generated automatically between source and target domains. The source data columns will be added first, followed by any matching target data columns.

Schema worksheet

Use these worksheets to view schema information about the source and target datasets. These are used in building a Mapping workspace and are not used in processing. These worksheets are highlighted in yellow in the workbook.

Info worksheet

Use this worksheet to optionally define the source subtype value. This worksheet is highlighted in orange in the workbook.

Data mapping options

There are a number of options to map data in the Data Mapping workbooks. The Lookup or Expression columns can be used, but only one method can be used per target field. Review all options to determine what will work best for the data loading process. The options described below are configured in the Mapping worksheet.

Expression

The Expression column can be used to directly map fields with compatible data types. Some source and target fields are automatically matched if they have the same field names. Review any automatically matched fields to verify that the source and target data types match, and if coded domains are present, that the values are equivalent.

The Expression column can also be customized to use a Python expression written directly in Excel or to call a function in the Scripts folder. The following are scenarios in which this is the best option:

  • Converting a field value to a different data type
  • Mapping all source values to one target value
  • Creating a globally unique identifier (GUID)
  • Performing shape manipulation to convert between geometry types