Create Data Loading Workspace (Data Management)

Summary

Creates a Data Loading Workspace, a folder that contains a collection of Microsoft Excel workbooks. Use these workbooks to configure the mapping between the source and target schema before loading data.

Usage

  • The geometry of the source data will determine the types of Data Mapping folders that are generated. For example, if you only include point features, only a points folder will be generated. This tool supports tables and feature classes as inputs.

  • Every time the tool is run, a new workspace will be generated.

  • Create or specify a Mapping Table to match datasets, fields, and attribute domain coded value descriptions between a source and a target schema. The table matches substrings bidirectionally, so order is not important. You can use the table to create matches or to block them.

  • Predictive field matching uses a distance algorithm to match datasets, fields, and coded value descriptions between the source and the target schema.

  • This tool supports the following source and target data types:

    • Workspaces
      • File geodatabases
      • Mobile geodatabases
      • Enterprise geodatabases
      • Feature datasets
      • Feature services
      • CAD datasets (.dgn, .dwg, .dxf)
    • Tabular datasets
      • Feature classes
      • Tables
      • Feature service layers and tables
      • Shapefiles
      • .csv and delimited text files
      • Excel worksheets
      • CAD layers
      • dBase files
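
The bidirectional substring matching described for the Mapping Table can be illustrated with a small pure-Python sketch. It is not the tool's implementation: the (substring, substring, action) row layout, the substring_match function name, and the rule that a block entry overrides a match entry are all assumptions made for illustration.

```python
def substring_match(source_name, target_name, mapping_rows):
    """Return True (match), False (blocked), or None (no rule applies).

    mapping_rows is an iterable of (substring_a, substring_b, action)
    tuples, where action is "match" or "block". This row layout is
    hypothetical and does not reflect the real Mapping Table schema.
    """
    src = source_name.lower()
    tgt = target_name.lower()
    decision = None
    for a, b, action in mapping_rows:
        a, b = a.lower(), b.lower()
        # Bidirectional: a row applies whichever side each substring is on.
        forward = a in src and b in tgt
        reverse = b in src and a in tgt
        if forward or reverse:
            if action == "block":
                return False  # assumption: a block row overrides match rows
            decision = True
    return decision


rows = [
    ("valve", "device", "match"),
    ("hydrant", "line", "block"),
]

print(substring_match("wControlValve", "WaterDevice", rows))  # True
print(substring_match("WaterDevice", "wControlValve", rows))  # True (reversed order)
print(substring_match("wHydrant", "WaterLine", rows))         # False (blocked)
print(substring_match("wMain", "WaterLine", rows))            # None (no rule applies)
```

Because each row is tested in both directions, swapping the source and target names gives the same decision, which is the sense in which order is not important.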

Parameters

Label | Explanation | Data Type
Source to Target Mapping

Defines how source data will be mapped to the target schema. Both workspaces and individual classes are supported as source or target inputs. When workspaces are used, name similarity is used to match objects in the source and target schema.

Value Table
Output Folder

The output folder where the Data Loading Workspace will be created.

Folder
Predictive Field Matching Options
(Optional)

Specifies whether field names or domain value descriptions will be matched.

  • Field Name Similarity: Field names will be matched based on similarity between the source and target fields.
  • Domain Coded Value Description Similarity: Attribute domain value descriptions will be matched based on similarity between the source and target fields. When this option is specified, fields will not be matched by name if the source or target field has a domain.
String
Mapping Table
(Optional)

A table that will be used to perform substring matching for datasets, fields, and attribute domain coded value descriptions. Use the table to create matches or to block them.

Record Set
Calculate Row Count Statistics
(Optional)

Specifies whether the count and percentage of filled-in values will be calculated for fields in the source schema.

  • Checked—The count and percentage of filled-in values will be calculated.
  • Unchecked—No calculations will be performed on the field values. This is the default.
Boolean
Create Matches by Subtype
(Optional)

Specifies whether separate Data Mapping workbooks will be created for each subtype when the classes contain subtypes.

  • Checked—Separate Data Mapping workbooks will be created for each match if they exist. The class name will not be used to match candidates if subtypes exist. This is the default.
  • Unchecked—Dataset matching will only be attempted at the class level. If classes contain subtypes, a subtype sheet will be created in the Data Mapping workbook.

Boolean
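
The idea behind Field Name Similarity can be sketched with Python's standard library. The tool's distance algorithm is not named in this topic, so difflib.SequenceMatcher is used here only as a stand-in; the best_field_matches function, the 0.6 threshold, and the sample field names are assumptions for illustration.

```python
from difflib import SequenceMatcher


def best_field_matches(source_fields, target_fields, threshold=0.6):
    """Pair each source field with its most similar target field name.

    Similarity is a case-insensitive SequenceMatcher ratio between 0 and 1.
    A source field whose best score is below the threshold maps to None.
    """
    matches = {}
    for src in source_fields:
        scored = [
            (SequenceMatcher(None, src.lower(), tgt.lower()).ratio(), tgt)
            for tgt in target_fields
        ]
        # default covers the case where there are no target fields at all
        score, best = max(scored, default=(0.0, None))
        matches[src] = best if score >= threshold else None
    return matches


# Hypothetical field names; inspect the result rather than relying on it.
print(best_field_matches(["INSTALL_DT", "DIAM"], ["installdate", "diameter", "material"]))
```

Whatever measure is used, automatic matches remain suggestions; the generated workbooks are where the mapping is reviewed and corrected.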

Derived Output

Label | Explanation | Data Type
Data Loading Workspace

The path to the Data Loading Workspace folder.

Workspace

arcpy.management.CreateDataLoadingWorkspace(source_target_mapping, out_folder, {match_options}, {mapping_table}, {calc_stats}, {match_subtypes})
Name | Explanation | Data Type
source_target_mapping
[source_target_mapping,...]

Defines how source data will be mapped to the target schema. Both workspaces and individual classes are supported as source or target inputs. When workspaces are used, name similarity is used to match objects in the source and target schema.

Value Table
out_folder

The output folder where the Data Loading Workspace will be created.

Folder
match_options
[match_options,...]
(Optional)

Specifies whether field names or domain value descriptions will be matched.

  • MATCH_FIELDS: Field names will be matched based on similarity between the source and target fields.
  • MATCH_VALUES: Attribute domain value descriptions will be matched based on similarity between the source and target fields. When this option is specified, fields will not be matched by name if the source or target field has a domain.
String
mapping_table
(Optional)

A table that will be used to perform substring matching for datasets, fields, and attribute domain coded value descriptions. Use the table to create matches or to block them.

Record Set
calc_stats
(Optional)

Specifies whether the count and percentage of filled-in values will be calculated for fields in the source schema.

  • CALC_STATS: The count and percentage of filled-in values will be calculated.
  • NO_STATS: No calculations will be performed on the field values. This is the default.
Boolean
match_subtypes
(Optional)

Specifies whether separate Data Mapping workbooks will be created for each subtype when the classes contain subtypes.

  • MATCH_SUBTYPES: Separate Data Mapping workbooks will be created for each match if they exist. The class name will not be used to match candidates if subtypes exist. This is the default.
  • NO_MATCH_SUBTYPES: Dataset matching will only be attempted at the class level. If classes contain subtypes, a subtype sheet will be created in the Data Mapping workbook.
Boolean

Derived Output

Name | Explanation | Data Type
out_loading_workspace

The path to the Data Loading Workspace folder.

Workspace
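
Because the derived output is a folder path, the generated workbooks can be inventoried with ordinary file-system code. The following is a minimal sketch, assuming the workbooks use the .xlsx extension; the list_workbooks helper and the example folder name are hypothetical.

```python
from pathlib import Path


def list_workbooks(workspace_folder):
    """Return the .xlsx files found anywhere under the folder, sorted by path."""
    root = Path(workspace_folder)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*.xlsx") if p.is_file())


# Hypothetical location; substitute the path returned by the tool.
for workbook in list_workbooks("C:/data/DataLoadingWorkspace"):
    print(workbook)
```

A listing like this is a quick way to confirm which geometry folders and Data Mapping workbooks a given run produced.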

Code sample

CreateDataLoadingWorkspace example (Python window)

The following Python window script demonstrates how to use the CreateDataLoadingWorkspace function in immediate mode.

import arcpy

arcpy.management.CreateDataLoadingWorkspace(
    [["C:/data/WaterUtilities.gdb/wControlValve", "C:/data/Water_AssetPackage.gdb/WaterDevice"]],
    "C:/data",
    "MATCH_FIELDS;MATCH_VALUES",
    None,
    "CALC_STATS",
    "MATCH_SUBTYPES",
)
CreateDataLoadingWorkspace example (stand-alone script)

The following stand-alone script demonstrates how to use the CreateDataLoadingWorkspace function.

# Name: CreateDataLoadingWorkspace.py
# Description: Create a new Data Loading Workspace

# Import required modules
import os
import arcpy

# Source and target workspaces with the mapping of table name to table name.
source_workspace = "C:/data/WaterUtilities.gdb/WaterDistribution"
target_workspace = "C:/data/Water_AssetPackage.gdb/UtilityNetwork"
mapping = [
    ("wControlValve", "WaterDevice"),
    ("wHydrant", "WaterJunction"),
    ("wFitting", "WaterJunction"),
    ("wMain", "WaterLine"),
]

# Fully qualify the table names.
source_target = [(os.path.join(source_workspace, a), os.path.join(target_workspace, b)) for a, b in mapping]

# Set local variables.
output_folder = "C:/data"
mapping_table = "C:/temp/Default.gdb/DataReference_GenerateMappingTable"

arcpy.management.CreateDataLoadingWorkspace(
    source_target_mapping=source_target,
    out_folder=output_folder,
    match_options="MATCH_FIELDS;MATCH_VALUES",
    mapping_table=mapping_table,
    calc_stats=True,
    match_subtypes=True,
)

Environments

This tool does not use any geoprocessing environments.

Licensing information

  • Basic: Yes
  • Standard: Yes
  • Advanced: Yes
