Prepare data for replication

Distributing data across multiple geodatabases can improve data availability and performance. It helps alleviate server contention and allows organizations to balance the load on their geodatabases between users who perform edits and users who access the data for read-only operations.

Geodatabase replication is one of the data distribution workflows available in ArcGIS Pro. Geodatabase replication is built on top of the traditional versioning environment and distributes all or parts of your data in a manner that allows any data changes to be synchronized across two or more geodatabases. When a dataset is replicated, a replica pair is created; one replica resides in the original geodatabase, and a related replica is distributed to a different geodatabase. Any changes made to these replicas in their respective geodatabases can be synchronized so that the data in one replica matches that in the related replica.

Prior to implementation, there are several aspects to consider to prepare your data for replication.

Geodatabase replication requirements

To be replicated, datasets must meet the following requirements:

  • The database user must have write access to the data.
  • The database user who will be creating the replica must have sufficient privileges to own data within the enterprise geodatabase.
  • All data must be registered with traditional versioning.

Note:

Branch versioned data and data registered with traditional versioning that uses the option to move edits to base are not supported with geodatabase replication in ArcGIS Pro.

Additional replication requirements apply based on the replication type:

  • Checkout/Check-in replication
    • You have the option of checking out nonversioned data or data that has been registered with traditional versioning.
  • One-way and two-way replicas
    • Each dataset must have a GlobalID column. This column is used to maintain row uniqueness across geodatabases.
  • One-way replication with the option to use archiving to track changes
    • One-way, parent-to-child replication—The child replica can be an enterprise or file geodatabase.
    • One-way, child-to-parent replication—Both child and parent replicas must be hosted in an enterprise geodatabase.

Any dataset that does not meet these requirements is not included in the replica. If none of the datasets meet the requirements, replica creation fails.
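The requirements above can be sketched as a small pre-check. This is a minimal illustration in plain Python, not part of any ArcGIS API: the `DatasetInfo` record, its field names, and the replica type strings are hypothetical, and the sketch assumes that one-way child-to-parent replicas need a GlobalID column like other one-way replicas.

```python
from dataclasses import dataclass

# Hypothetical record describing one dataset; these field names are
# illustrative only and do not come from an ArcGIS API.
@dataclass
class DatasetInfo:
    name: str
    versioning: str       # "traditional", "traditional_move_to_base", "branch", or "none"
    has_global_id: bool
    user_can_write: bool

# Assumption: child-to-parent replicas are treated like other one-way replicas.
NEEDS_GLOBAL_ID = {"one_way", "one_way_child_to_parent", "two_way"}

def is_replicable(ds, replica_type):
    """Apply the listed requirements to a single dataset."""
    if not ds.user_can_write:
        return False
    # Branch versioning and the move-edits-to-base option are not supported.
    if ds.versioning in ("branch", "traditional_move_to_base"):
        return False
    # Only checkout replicas accept nonversioned data.
    if ds.versioning == "none" and replica_type != "checkout":
        return False
    # One-way and two-way replicas need a GlobalID column.
    if replica_type in NEEDS_GLOBAL_ID and not ds.has_global_id:
        return False
    return True

def datasets_for_replica(datasets, replica_type):
    """Return the datasets that would be included; fail if none qualify."""
    included = [ds for ds in datasets if is_replicable(ds, replica_type)]
    if not included:
        raise ValueError("No dataset meets the replication requirements")
    return included
```

In a real workflow, the values in each record would have to come from the geodatabase itself (for example, from dataset properties); how they are gathered is outside the scope of this sketch.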

The list of data to replicate is automatically expanded to include dependent datasets. For example, all feature classes in a topology or feature dataset are included if any feature class in the topology or feature dataset is selected for replication. See Replication with advanced geodatabase datasets and Replication and geodatabase compatibility for more details.


Determine datasets to replicate

One of the most important aspects of replica creation is the determination of what data should be replicated. When creating a replica, you can choose to replicate all the data in your datasets or only a subset of data. Plan on replicating an appropriate amount of data for your needs. Consider the lifetime of the replica and make sure your requirements are covered.

Metadata for the data you choose to replicate is copied during the replica creation process. However, changes to the metadata are not applied during replica synchronization.

Replicate all data

The Create Replica geoprocessing tool allows you to replicate all data by browsing to the layers to be replicated.

Note:

For tables, the default behavior is to replicate only the schema of the table. If you want to replicate all records, follow the steps below in Create a subset of data to be replicated to specify the SQL expression 1=1 as the definition query on the table. To replicate a subset of records, set the appropriate SQL expression.
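For scripted workflows, the table behavior described in this note might be approximated as follows. This is a sketch under assumptions: the mode names and both helper functions are hypothetical, and it assumes that a table view created with `arcpy.management.MakeTableView` carries its where clause into the replica in the same way a definition query on a map table does. `arcpy` is imported lazily because it is only available in an ArcGIS Python environment.

```python
def table_where_clause(mode, expression=None):
    """Pick the SQL expression for a table, following the note above.

    mode is a hypothetical label: "schema_only", "all_records", or "subset".
    """
    if mode == "schema_only":
        return None          # default behavior: no query, schema only
    if mode == "all_records":
        return "1=1"         # matches every row
    if mode == "subset":
        if not expression:
            raise ValueError("A subset needs an SQL expression")
        return expression
    raise ValueError(f"Unknown mode: {mode}")

def make_replica_table_view(table_path, view_name, mode, expression=None):
    """Return the table path, or a table view filtered by the chosen query."""
    where = table_where_clause(mode, expression)
    if where is None:
        return table_path
    import arcpy  # requires an ArcGIS Python environment
    return arcpy.management.MakeTableView(table_path, view_name, where)
```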

Create a subset of data to be replicated

Sometimes, you may only want to replicate a subset of the features in the dataset. There are multiple ways to specify the subsets of data to be replicated:

  • Use definition queries.
  • Use a selection set.
  • Specify an extent.
  • Use geometry features.

Once data has been determined based on any filters used, relationship class logic is applied if relationship classes exist. For each dataset involved in a relationship class, additional rows are added if they are related to the data already in the replica. See Replication and related data for more information.

Use definition queries

Definition queries are written in SQL syntax and allow you to define a subset of features to work with in a layer by filtering which features are retrieved from the dataset and appear in the layer's attribute table. To replicate a subset of features, first create a definition query for the layer in ArcGIS Pro.

Once you have created definition queries on your datasets, add them to the Create Replica geoprocessing tool using the Replica Datasets drop-down menu.

Create Replica geoprocessing tool displaying the drop-down option to select features in the map with definition queries applied

Note:

The Replica Datasets drop-down menu in the Create Replica geoprocessing tool must be used to ensure the definition queries are applied in the replica. Definition queries from data in the map are not honored when the browse button is used.

Use a selection set

Selecting features allows you to highlight a subset of features on your map to use in subsequent exploration or analysis of your data. After you have selected features, the selection sets of individual feature classes and tables can be replicated. From the Create Replica geoprocessing tool, use the Replica Datasets drop-down menu to ensure the selected datasets in the map are used within the replica.

Specify an extent

The Extent environment setting can be used to define the spatial extent of data to be replicated. When it is set, only features that intersect the specified extent are processed and included in the replica.

The extent entered is assumed to be in the coordinate system in which the input data is stored, even if the Output Coordinate System environment is set. If the tool takes multiple input datasets, the first dataset defines the coordinate system of the extent.

Use geometry features

Review the Replica Geometry Features parameter in the Create Replica geoprocessing tool help for more information on how this can be used to define the replica geometry.

Create Replica geoprocessing tool parameters

The following describes the input parameters for the Create Replica geoprocessing tool.

Create Replica geoprocessing tool

  • Replica Datasets—To replicate a subset of data, apply definition queries and use the drop-down menu to add them to the replica. For other datasets, browse to and select the datasets or use the drop-down menu if the layers exist in your map.
  • Replica Type—Choose a checkout, one-way, one-way child-to-parent, or two-way replica.
  • Geodatabase to replicate data to—You can replicate to a local or remote destination geodatabase. Remote geodatabases can be accessed using geodata services running on ArcGIS Server.

    Browse to and select the geodatabase or geodata service to receive the data. If your replica type is checkout or one-way, the destination can be a file geodatabase; otherwise, an enterprise geodatabase is required.

  • Replica Name—Enter the name for the replica to be created.
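The four required parameters above correspond to the first four arguments of `arcpy.management.CreateReplica`. The following is a minimal sketch, not a definitive implementation: `check_destination` and `create_replica` are hypothetical helpers, the paths used with them are placeholders, and the check assumes that a file geodatabase can be recognized by a path ending in `.gdb`.

```python
# Replica type keywords accepted by arcpy.management.CreateReplica.
FILE_GDB_ALLOWED = {"CHECK_OUT", "ONE_WAY_REPLICA"}
REPLICA_TYPES = FILE_GDB_ALLOWED | {
    "ONE_WAY_CHILD_TO_PARENT_REPLICA",
    "TWO_WAY_REPLICA",
}

def check_destination(replica_type, out_geodatabase):
    """Reject a file geodatabase destination for types that need enterprise."""
    if replica_type not in REPLICA_TYPES:
        raise ValueError(f"Unknown replica type: {replica_type}")
    # Assumption: a path ending in .gdb is a file geodatabase.
    is_file_gdb = out_geodatabase.lower().endswith(".gdb")
    if is_file_gdb and replica_type not in FILE_GDB_ALLOWED:
        raise ValueError(f"{replica_type} requires an enterprise geodatabase")

def create_replica(in_data, replica_type, out_geodatabase, replica_name):
    """Validate the destination, then run the Create Replica tool."""
    check_destination(replica_type, out_geodatabase)
    import arcpy  # requires an ArcGIS Python environment
    return arcpy.management.CreateReplica(
        in_data, replica_type, out_geodatabase, replica_name
    )
```

Running the check before the tool call surfaces an unsupported combination early, before any connection to the destination is attempted.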

Advanced settings

Replica Access Type

When creating a replica, there are two options to choose from for a feature information model:

  • Full model—With the full model, all simple and complex data types, such as topologies, annotation, and dimension feature classes, are replicated to the child geodatabase and versioned. This is the default.

    The full model assumes features are stored with the same feature type in both the parent replica geodatabase and the child (relative) replica geodatabase. This option is for use with editing applications based on ArcGIS client software.

  • Simple model—In the simple model, the child geodatabase contains only simple features. The simple model does not replicate topologies, network datasets, annotation, or dimension feature classes, and it does not version data on the child geodatabase. If needed, you can version data on the child geodatabase after the replica is created.

    Nonsimple features in the parent geodatabase (for example, topologies and network datasets) are converted to simple features in the child geodatabase, and vice versa, during replication. Additional processing is applied during synchronization to account for the difference in feature types and characteristics in the two geodatabases.

    This option must be used when the child geodatabase is designed for editing by simple feature editors, including third-party editors that are not based on ArcGIS client software. It can also be used to simplify the data model on the child geodatabase for use during one-way replication or with editors based on ArcGIS or ArcGIS client software.

    The simple model provides the following benefits:

    • Allows you to edit the child geodatabase with simple feature editors that are not based on Esri software.
    • For one-way replicas, data on the child geodatabase is not versioned, which enables easy integration with non-Esri applications.

Note:

In the case of topology, when using the simple model, the topology object is excluded from the child geodatabase; however, all participating feature classes are included. Once you've created the replica, sending changes from the child to the parent geodatabase automatically maintains features on the parent. Additional processing occurs on the parent to update topology dirty areas.

Expand Feature Classes and Tables

This specifies whether you will include expanded feature classes and tables—such as those found in topologies or relationship classes—that were not listed in the replica datasets.

  • Use defaults—Adds the expanded feature classes and tables related to the feature classes and tables in the replica. The default for feature classes is to replicate all features intersecting the spatial filter. If no spatial filter has been provided, all features are included. The default for tables is to replicate the schema only.
  • Add with schema only—Adds only the schema for the expanded feature classes and tables.
  • All rows—Adds all rows for expanded feature classes and tables.
  • Do not add—Doesn't add expanded feature classes and tables.

Replicate Related Data

This specifies whether to replicate rows related to rows already in the replica. For example, consider a feature (f1) inside the replication filter and a related feature (f2) from another class outside the filter. Feature f2 is included in the replica if you choose to get related data.

  • Do not get related—Do not replicate related rows.
  • Get related—Replicate related data. This is the default.
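When scripting, the advanced options described above are passed as keywords to `arcpy.management.CreateReplica`. The following sketch maps the labels used on this page to those keywords. The helper `advanced_options` and the dictionary names are hypothetical, and treating Use defaults as the default for the expand option is an assumption, since this page does not state it.

```python
# Labels from this page mapped to the tool's keyword values.
ACCESS_TYPE = {"Full model": "FULL", "Simple model": "SIMPLE"}
EXPAND = {
    "Use defaults": "USE_DEFAULTS",
    "Add with schema only": "ADD_WITH_SCHEMA_ONLY",
    "All rows": "ALL_ROWS",
    "Do not add": "DO_NOT_ADD",
}
RELATED = {
    "Do not get related": "DO_NOT_GET_RELATED",
    "Get related": "GET_RELATED",
}

def advanced_options(access="Full model", expand="Use defaults",
                     related="Get related"):
    """Build keyword arguments for the optional CreateReplica parameters."""
    return {
        "access_type": ACCESS_TYPE[access],
        "expand_feature_classes_and_tables": EXPAND[expand],
        "get_related_data": RELATED[related],
    }
```

The resulting dictionary can be unpacked into the tool call, for example `arcpy.management.CreateReplica(in_data, in_type, out_geodatabase, out_name, **advanced_options(access="Simple model"))`.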

Replica Geometry Features

Replica Geometry Features can be used to define the replica geometry.

  • The replica geometry features can be points, lines, or polygons.
  • A feature layer used for the replica geometry features can contain one or more features. If there is more than one, the geometries are merged, and only data that intersects the merged geometries will be replicated.
  • If filters (such as a definition query) have been defined on the replica geometry features, only features that satisfy these filters will be used to define the replica geometry.
  • You can also use the Extent environment setting to define the replica geometry.
    • If Replica Geometry Features is set, it will be used as the replica geometry.
    • If Replica Geometry Features is not set, the Extent environment is used as the replica geometry.
    • If both Replica Geometry Features and the Extent environment are set, the Replica Geometry Features setting will be used.
    • If neither Replica Geometry Features nor the Extent environment is specified, the full extent of the data is used.
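The precedence rules above can be summarized in a few lines of Python. This is an illustrative sketch of the decision logic only; `resolve_replica_geometry` is a hypothetical function, not something the tool exposes.

```python
def resolve_replica_geometry(geometry_features=None, extent=None):
    """Return (source, value) for the replica geometry.

    Replica Geometry Features wins over the Extent environment;
    with neither set, the full extent of the data is used.
    """
    if geometry_features is not None:
        return ("geometry_features", geometry_features)
    if extent is not None:
        return ("extent", extent)
    return ("full_extent", None)
```

For example, passing both a layer name and an extent returns the layer, which mirrors the rule that Replica Geometry Features takes priority when both are set.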

Example of data replication

The following maintenance work order example illustrates some of the default behavior for data replication.

A maintenance crew is preparing to do some inspections in a residential area. To edit in the field, the crew needs to replicate the part of the infrastructure that covers this residential area. To start the replication process, the spatial extent of the inspection area is identified using a spatial filter (in this case, the extent is set via the environment setting).

Work area extent

The crew is to concentrate on cables that have been insulated with a particular material. To identify these cables, a query is applied to the relevant dataset.

Definition query applied to features in the work area extent

Finally, as each maintenance crew can expect to visit only a certain number of properties in a day, the homes in one residential block are identified by a definition query based on property numbers. This is shown as a selection below.

Selection set of homes impacted by the definition query in the work area

The selected features, features identified by a definition query, and features that intersect the chosen spatial extent will be replicated. Some additional features have also been included.

Data within the extent and definition query to be replicated
