Prepare data for replication

Available with Standard or Advanced license.

Distributing data across multiple geodatabases can improve data availability and performance. It helps alleviate server contention and allows organizations to balance the load on their geodatabases between users performing edits and users accessing the data for read-only operations.

When creating a replica you now have two geoprocessing tools to select from:

Geodatabase replication is one of the data distribution workflows available in ArcGIS Pro. Geodatabase replication is built on top of the traditional versioning environment and distributes all or parts of your data in a manner that allows any data changes to be synchronized across two or more geodatabases. When a dataset is replicated, a replica pair is created; one replica resides in the original geodatabase, and a related replica is distributed to a different geodatabase. Any changes made to these replicas in their respective geodatabases can be synchronized so that the data in one replica matches that in the related replica.

Prior to implementation, there are several aspects to consider to prepare your data for replication.

Geodatabase replication requirements

To be replicated, datasets must meet the following requirements:

  • The database user must have write access to the data.
  • All datasets must be from the same enterprise geodatabase.
  • All data must be registered with traditional versioning; it cannot be versioned with the option to move edits to base.

Note:

Branch versioned data and data registered with traditional versioning that uses the option to move edits to base are not supported with geodatabase replication in ArcGIS Pro.

Additional replication requirements apply based on the replication type:

  • Checkout/Check-in replication
    • You have the option of checking out nonversioned data or data that has been registered with traditional versioning.
  • One-way and two-way replicas
    • Each dataset must have a GlobalID column. This column is used to maintain row uniqueness across geodatabases.
  • One-way replication
    • One-way, parent-to-child replication—The child replica can be an enterprise or file geodatabase.
    • One-way, child-to-parent replication—Both child and parent replicas must be hosted in an enterprise geodatabase.

  • One-way replication with the option to use archiving to track replica changes
    • The parent replica version must be the DEFAULT version.
    • The data must be enabled for archiving before creating the replica.

Any dataset that does not meet these requirements is not included in the replica; for additional details, see the Create Replica geoprocessing tool. If none of the datasets meet the requirements, replica creation fails.
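The requirements above can be sketched as a pre-flight check. This is a minimal illustration, not part of ArcGIS: the DatasetInfo class, the replica type labels, and the replication_problems function are hypothetical names, and the properties would have to be gathered separately (for example, from the dataset properties in ArcGIS Pro). The replica-level rule that the parent version must be DEFAULT when archiving is used is not modeled here.

```python
from dataclasses import dataclass

# Replica type labels used only in this sketch (not ArcGIS keywords).
CHECKOUT = "checkout"
ONE_WAY = "one_way"
ONE_WAY_CHILD_TO_PARENT = "one_way_child_to_parent"
TWO_WAY = "two_way"


@dataclass
class DatasetInfo:
    """Hypothetical summary of one dataset; not an ArcGIS type."""
    name: str
    writable: bool
    traditional_versioned: bool
    move_edits_to_base: bool = False
    has_global_id: bool = False
    archiving_enabled: bool = False


def replication_problems(ds, replica_type, use_archiving=False):
    """Return the unmet requirements for ds (an empty list means eligible)."""
    problems = []
    if not ds.writable:
        problems.append("no write access")
    if ds.move_edits_to_base:
        problems.append("versioned with move edits to base")
    if replica_type == CHECKOUT:
        # Checkout accepts nonversioned or traditionally versioned data.
        return problems
    if not ds.traditional_versioned:
        problems.append("not registered with traditional versioning")
    if not ds.has_global_id:
        # One-way and two-way replicas need a GlobalID column.
        problems.append("missing GlobalID column")
    if replica_type == ONE_WAY and use_archiving and not ds.archiving_enabled:
        problems.append("archiving not enabled")
    return problems
```

A dataset for which the function returns an empty list would be expected to pass these checks; any other result names the requirement to address before creating the replica.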

The list of data to replicate is automatically expanded to include dependent datasets. For example, all feature classes in a topology or feature dataset are included if any feature class in the topology or feature dataset is selected for replication. See Replication with advanced geodatabase datasets and Replication and geodatabase compatibility for more details.

The following is a list of the types of data for which additional rules and behaviors are applied when creating replicas. Review the topics that are appropriate for your data:

Determine datasets to replicate

One of the most important aspects of replica creation is the determination of what data should be replicated. When creating a replica, you can choose to replicate all the data in your datasets or only a subset of data. Plan on replicating an appropriate amount of data for your needs. Consider the lifetime of the replica and make sure your requirements are covered.

Metadata for the data you choose to replicate is copied during the replica creation process. However, changes to the metadata are not applied during replica synchronization.

Replicate all data

The Create Replica geoprocessing tool allows you to replicate all data by browsing to the layers to be replicated.

Note:

For tables, the default behavior is to replicate only the schema of the table. If you want to replicate all records, follow the steps below in Create a subset of data to be replicated to specify the SQL expression 1=1 as the definition query on the table. To replicate a subset of records, set the appropriate SQL expression.

Create a subset of data to be replicated

Sometimes, you may only want to replicate a subset of the features in the dataset. There are multiple ways to specify the subsets of data to be replicated:

  • Use definition queries.
  • Use a selection set.
  • Specify an extent.
  • Use geometry features.

Once data has been determined based on any filters used, relationship class logic is applied if relationship classes exist. For each dataset involved in a relationship class, additional rows are added if they are related to the data already in the replica. See Replication and related data for more information.

Use definition queries

Definition queries are written in SQL syntax and allow you to define a subset of features to work with in a layer by filtering which features are retrieved from the dataset and appear in the layer’s attribute table. To replicate a subset of features, follow these steps to first create a definition query for a layer in ArcGIS Pro.

Once you have created definition queries on your datasets, add them to the Create Replica geoprocessing tool using the Replica Datasets drop-down menu.

Create Replica geoprocessing tool displaying the drop-down option to select features in the map with definition queries applied

Note:

The Replica Datasets drop-down menu in the Create Replica geoprocessing tool must be used to ensure the definition queries are applied in the replica. Definition queries from data in the map are not honored when the browse button is used.

Use a selection set

Selecting features allows you to highlight a subset of features on your map to use in subsequent exploration or analysis of your data. After you have selected features, the selection sets of individual feature classes and tables can be replicated. From the Create Replica geoprocessing tool, use the Replica Datasets drop-down menu to ensure the selected datasets in the map are used within the replica.

Specify an extent

The Extent environment setting can be used to define the spatial extent of data to be replicated. This setting will only process and include features that pass through the extent specified.

The extent entered is assumed to be in the coordinate system in which the input data is stored, even if the Output Coordinate System environment is set. If the tool takes multiple input datasets, the first dataset defines the coordinate system of the extent.
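As a rough illustration of how an extent acts as a spatial filter, the following sketch keeps only features whose bounding box touches or overlaps the extent. It is a simplification under stated assumptions: the function names and the feature dictionary are hypothetical, bounding boxes stand in for real geometries, and all coordinates are assumed to be in the coordinate system in which the data is stored.

```python
def bbox_intersects(extent, bbox):
    """Both arguments are (xmin, ymin, xmax, ymax) in the same coordinate system."""
    exmin, eymin, exmax, eymax = extent
    bxmin, bymin, bxmax, bymax = bbox
    # Two boxes intersect when they overlap (or touch) on both axes.
    return bxmin <= exmax and bxmax >= exmin and bymin <= eymax and bymax >= eymin


def filter_by_extent(features, extent):
    """Return the names of features whose bounding box intersects the extent.

    features maps a feature name to its bounding box (hypothetical structure).
    """
    return [name for name, bbox in features.items() if bbox_intersects(extent, bbox)]


# Example: only "a" and "c" reach into the extent (0, 0, 2, 2).
features = {"a": (0, 0, 1, 1), "b": (5, 5, 6, 6), "c": (0.5, 0.5, 3, 3)}
inside = filter_by_extent(features, (0, 0, 2, 2))
```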

Use geometry features

Review the Replica Geometry Features parameter in the Create Replica geoprocessing tool help for more information on how this can be used to define the replica geometry.

Create Replica geoprocessing tool parameters

The following describes the input parameters for the Create Replica geoprocessing tool.

Create Replica geoprocessing tool

  • Replica Datasets—To replicate a subset of data, apply definition queries and use the drop-down menu to add them to the replica. For other datasets, browse to and select the datasets or use the drop-down menu if the layers exist in your map.
  • Replica Type—Choose a checkout, one-way, one-way child-to-parent, or two-way replica.
  • Output Type—The output type for the data that will be replicated.
    • Geodatabase—Replicates the data to a geodatabase. This is the default.
    • Xml file—Replicates the data to an XML workspace document.
  • Geodatabase to replicate data to—This parameter is required if Output Type is Geodatabase and replicates to a local or remote destination geodatabase. Remote geodatabases can be accessed using geodata services running on ArcGIS Server. Browse to and select the geodatabase or geodata service to receive the data. If your replica type is checkout or one way, the destination can be a file geodatabase; otherwise, an enterprise geodatabase is required. Using this option to replicate data to a geodatabase allows you to create a replica in a connected environment. See how to create a checkout replica, one-way replica, or two-way replica for detailed workflow steps.
  • XML file to replicate data to—This parameter is required if Output Type is Xml file and outputs an XML workspace document as an XML file. The XML file option supports disconnected environments where you can send the XML workspace document to the destination and import it to complete replica creation. See how to create a replica in a disconnected environment for detailed workflow steps.

    When using the Create Replica geoprocessing tool, Output Type can be set to either Geodatabase or Xml file; the Xml file option works well in disconnected environments.

  • Replica Name—Enter the name for the replica to be created.
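The parameter rules above can be sketched as a small helper that validates the output settings before the tool is run. The helper names are hypothetical. The keyword names and values passed to arcpy.management.CreateReplica (for example, in_type="ONE_WAY_REPLICA" and out_type="XML_FILE") are assumptions recalled from the tool reference and should be verified there before use. Treating any path that ends in .gdb as a file geodatabase is also a simplification.

```python
# Replica types whose destination may be a file geodatabase (per the text above).
FILE_GDB_TYPES = {"CHECK_OUT", "ONE_WAY_REPLICA"}


def build_create_replica_args(in_data, in_type, out_name,
                              out_type="GEODATABASE",
                              out_geodatabase=None, out_xml=None):
    """Validate the output settings and return keyword arguments for the tool."""
    args = {"in_data": in_data, "in_type": in_type,
            "out_name": out_name, "out_type": out_type}
    if out_type == "GEODATABASE":
        if not out_geodatabase:
            raise ValueError("a destination geodatabase is required")
        is_file_gdb = out_geodatabase.lower().endswith(".gdb")
        if is_file_gdb and in_type not in FILE_GDB_TYPES:
            raise ValueError("this replica type needs an enterprise geodatabase")
        args["out_geodatabase"] = out_geodatabase
    elif out_type == "XML_FILE":
        if not out_xml:
            raise ValueError("an output XML file is required")
        args["out_xml"] = out_xml
    else:
        raise ValueError(f"unknown output type: {out_type}")
    return args


def create_replica(**kwargs):
    """Run the tool; arcpy is imported lazily so the validation runs anywhere."""
    import arcpy  # requires an ArcGIS Pro Python environment
    return arcpy.management.CreateReplica(**kwargs)
```

Keeping validation separate from the tool call means the settings can be checked in a script or test without an ArcGIS Pro installation; create_replica(**args) would then be called only in an environment where arcpy is available.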

Advanced settings

Replica Access Type

When creating a replica, there are two options to choose from for a feature information model:

  • Full model—With the full model, all simple and complex data types, such as topologies, annotation, and dimension feature classes, are replicated to the child geodatabase and versioned. This is the default.

    The full model assumes features are stored with the same feature type in both the parent and child relative replica geodatabases. This option is for use with editing applications based on ArcGIS client software.

  • Simple model—In the simple model, the child geodatabase contains only simple features. The simple model does not replicate topologies, network datasets, annotation, and dimension feature classes, nor does it version data on the child geodatabase. If needed, you can version data on the child geodatabase after the replica is created.

    Nonsimple features in the parent geodatabase (for example, topologies and network datasets) are converted to simple features in the child geodatabase, and vice versa, during replication. Additional processing is applied during synchronization to account for the difference in feature types and characteristics in the two geodatabases.

    This option must be used when the child geodatabase is designed for editing by simple feature editors, including third-party editors that are not based on ArcGIS client software. It can also be used to simplify the data model on the child geodatabase for use during one-way replication or with editors based on ArcGIS or ArcGIS client software.

    The simple model provides the following benefits:

    • Allows you to edit the child geodatabase with simple feature editors that are not based on Esri software.
    • For one-way replicas, data on the child geodatabase is not versioned, which enables easy integration with non-Esri applications.

Note:

In the case of topology, when using the simple model, the topology object is excluded from the child geodatabase; however, all participating feature classes are included. Once you've created the replica, sending changes from the child geodatabase to the parent geodatabase automatically maintains features on the parent. Additional processing occurs on the parent to update topology dirty areas.

Expand Feature Classes and Tables

This specifies whether you will include expanded feature classes and tables—such as those found in topologies or relationship classes—that were not listed in the replica datasets.

  • Use defaults—Adds the expanded feature classes and tables related to the feature classes and tables in the replica. The default for feature classes is to replicate all features intersecting the spatial filter. If no spatial filter has been provided, all features are included. The default for tables is to replicate the schema only.
  • Add with schema only—Adds only the schema for the expanded feature classes and tables.
  • All rows—Adds all rows for expanded feature classes and tables.
  • Do not add—Doesn't add expanded feature classes and tables.

Replicate Related Data

This specifies whether to replicate rows related to rows already in the replica. For example, consider a feature (f1) inside the replication filter and a related feature (f2) from another class outside the filter. Feature f2 is included in the replica if you choose to get related data.

  • Do not get related—Do not replicate related rows.
  • Get related—Replicate related data. This is the default.
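The f1 and f2 example can be modeled in a few lines. This is a simplified sketch, not the tool's actual algorithm: the function name is hypothetical, relationships are given as plain pairs of row identifiers, only a single hop from the filtered rows is followed, and relationship direction is ignored. The precise rules are described in Replication and related data.

```python
def rows_to_replicate(in_filter, relationships, get_related=True):
    """Return the set of row ids that would end up in the replica.

    in_filter: ids of rows that satisfy the replica filters.
    relationships: iterable of (row_id, related_row_id) pairs.
    """
    selected = set(in_filter)
    if get_related:
        for a, b in relationships:
            # Add the partner of any filtered row (one hop only).
            if a in in_filter:
                selected.add(b)
            if b in in_filter:
                selected.add(a)
    return selected


# f1 is inside the filter; f2 is related to f1 but lies outside the filter.
with_related = rows_to_replicate({"f1"}, [("f1", "f2"), ("f3", "f4")])
without_related = rows_to_replicate({"f1"}, [("f1", "f2")], get_related=False)
```

With Get related, f2 is included alongside f1; with Do not get related, only f1 is replicated.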

Replica Geometry Features

Replica Geometry Features can be used to define the replica geometry.

  • The replica geometry features can be points, lines, or polygons.
  • A feature layer used for the replica geometry features can contain one or more features. If there are more than one, the geometries are merged, and only data that intersects the merged geometries will be replicated.
  • If filters (such as definition query) have been defined on the replica geometry features, only features that satisfy these filters will be used to define the replica geometry.
  • You can also use the Extent environment setting to define the replica geometry.
    • If Replica Geometry Features is set, it will be used as the replica geometry.
    • If Replica Geometry Features is not set, the Extent environment is used as the replica geometry.
    • If both Replica Geometry Features and the Extent environment are set, the Replica Geometry Features setting will be used.
    • If neither Replica Geometry Features nor the Extent environment is specified, the full extent of the data is used.
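The precedence rules in the list above reduce to a short decision function. The function name and the returned labels are hypothetical and only mirror the rules as stated.

```python
def resolve_replica_geometry(geometry_features=None, extent=None):
    """Return (source, value) describing what defines the replica geometry."""
    if geometry_features is not None:
        # Replica Geometry Features wins, even when an extent is also set.
        return ("geometry_features", geometry_features)
    if extent is not None:
        return ("extent", extent)
    # Neither is set: the full extent of the data is used.
    return ("full_extent", None)
```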

Register existing data only

Creating a replica involves copying data from the source geodatabase to a target geodatabase and registering a replica in each geodatabase to describe the data that has been replicated. Copying data to another geodatabase and registering it can be time consuming on large datasets.

The option to Register existing data only when creating a replica is available for users with large datasets or those that have identical data in two different geodatabases. This creates the replica versions needed to synchronize changes between the geodatabases but does not go through the lengthy process of copying data since it already exists in both locations.

The option to Register existing data only can be found on the Create Replica geoprocessing tool under Advanced Setting.

Register existing data only option located on the
  • Checked—If Register existing data only is checked, it is assumed that the data already exists in the child geodatabase and will be used to register the replica.
  • Unchecked—If Register existing data only is left unchecked, which is the default, data in the parent geodatabase will be copied to the child geodatabase.

Requirements

Prior to using the Create Replica geoprocessing tool with the Register existing data only option, datasets in the child (target) geodatabase must meet the following requirements:

Note:

All of the following requirements must be met before using the Create Replica geoprocessing tool with the Register existing data only option. During replica creation, the only verifications made are the geodatabase replication requirements, that the dataset names match, and that the datasets are owned by the user connected to the child geodatabase. If any other requirement is not met, the replica is still created, but errors will be encountered when you attempt to synchronize it.

  • Meet the geodatabase replication requirements including the additional replication requirements that apply based on the replication type selected.
  • For one-way, child-to-parent replicas and two-way replicas, the data on the child replica must be registered as versioned.
  • Be owned by the user that is connected to the child geodatabase.
  • Have the same names as the datasets in the parent database.
  • Have the same schema, rules, relationships, and properties as the datasets in the parent database.
  • Have the same geometry types as the datasets in the parent database.
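Because only some of these requirements are verified during replica creation, it can help to compare the parent and child datasets beforehand. The following is a minimal sketch under stated assumptions: the function name and the dictionary layout (owner, geometry_type, fields) are hypothetical, the descriptions would need to be collected from each geodatabase separately, and rules and relationships are not compared.

```python
def register_existing_mismatches(parent, child, connected_user):
    """Compare dataset descriptions and return a list of mismatch messages.

    parent, child: dict mapping dataset name to a dict with the keys
    "owner", "geometry_type", and "fields" (field name -> field type).
    """
    issues = []
    for name, p in parent.items():
        c = child.get(name)
        if c is None:
            issues.append(f"{name}: missing in child geodatabase")
            continue
        if c["owner"] != connected_user:
            issues.append(f"{name}: not owned by {connected_user}")
        if c["geometry_type"] != p["geometry_type"]:
            issues.append(f"{name}: geometry type differs")
        if c["fields"] != p["fields"]:
            issues.append(f"{name}: schema differs")
    return issues


parent = {"Parcels": {"owner": "gis", "geometry_type": "Polygon",
                      "fields": {"GlobalID": "GlobalID", "PIN": "String"}}}
child = {"Parcels": {"owner": "gis", "geometry_type": "Polygon",
                     "fields": {"GlobalID": "GlobalID"}}}
issues = register_existing_mismatches(parent, child, "gis")
```

An empty list suggests the compared properties match; it does not confirm that the row data is identical in both geodatabases.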

Tips

Here are some tips on using the Register existing data only option:

  • If Global IDs are a requirement for the replication type, you must make sure to add Global IDs to the data before you copy it to the other geodatabase. If you use functionality within ArcGIS to copy the data, make sure to use either copy and paste, or XML workspace export and import.
  • You must be connected as the owner of the data on the target geodatabase when creating the replica.
  • Whatever filters are applied during the replica creation process also are applied to the data in the relative geodatabase.

Limitations

It's important to be aware of the following limitations when using the Register existing data only option:

  • If the Register existing data only option is checked within the Create Replica geoprocessing tool, there is no option to select the matching dataset in the child geodatabase. Before checking this option, you must manually ensure that the datasets in the child geodatabase have been configured properly and meet all the geodatabase replication requirements.
  • When using the Register existing data only option, it is assumed that the data is identical in both geodatabases, so any differences that exist between the datasets in the parent and child replicas at the time the replica is created will not be synchronized. If any layers are missing in the target geodatabase, the Create Replica geoprocessing tool fails and returns an error message.

Example of data replication

The following example, based on maintenance work orders, illustrates some of the default behavior for data replication.

A maintenance crew is preparing to do some inspections in a residential area. To edit in the field, the crew needs to replicate the part of the infrastructure that covers this residential area. To start the replication process, the spatial extent of the inspection area is identified using a spatial filter (in this case, the extent is set via the environment setting).

Work area extent

The crew is to concentrate on cables that have been insulated with a particular material. To identify these cables, a query is applied to the relevant dataset.

Definition query applied to features in the work area extent

Finally, as each maintenance crew can expect to visit only a certain number of properties in a day, the homes in one residential block are identified by a definition query based on property numbers. This is shown as a selection below.

Selection set of homes impacted by the definition query in the work area

The selected features, features identified by a definition query, and features that intersect the chosen spatial extent will be replicated. Some additional features have also been included.

Data within the extent and definition query to be replicated
