Prepare data for replication

Available with Standard or Advanced license.

Distributing data across multiple geodatabases can improve data availability and performance. It helps alleviate server contention and allows organizations to balance the load on their geodatabases between users who edit the data and users who access it for read-only operations.

Geodatabase replication is one of the data distribution workflows available in ArcGIS Pro. Two geoprocessing tools are available for creating a geodatabase replica.

Prior to implementation, there are several aspects to consider to prepare your data for replication.

Geodatabase replication requirements

To be replicated, datasets must meet the following requirements:

  • The source (parent) geodatabase must be an enterprise geodatabase.
  • The database user connecting to the parent geodatabase must have write access to the data.
  • All datasets in the replica must be from the same enterprise geodatabase.
  • The enterprise geodatabase connection must be configured for traditional versioning; the connection cannot be a branch version connection.
  • If the data is registered for traditional versioning, it cannot be versioned with the option to move edits to base.

Additional replication requirements apply based on the replication type:

  • Checkout/Check-in replication
    • You have the option of checking out nonversioned data or data that has been registered with traditional versioning.
  • One-way and two-way replicas
    • Each dataset must have a GlobalID column. This column is used to maintain row uniqueness across geodatabases.
  • One-way replication
    • One-way, parent-to-child replication—The child replica can be an enterprise or file geodatabase.
    • One-way, child-to-parent replication—Both child and parent replicas must be hosted in an enterprise geodatabase.
    • One-way replication with the option to use archiving to track replica changes—The parent replica version must be the default geodatabase version. The data must be enabled for archiving before creating the replica.

Any dataset that does not meet these requirements is excluded from the replica. If none of the datasets meet the requirements, replica creation fails. For additional details, see the Create Replica geoprocessing tool.
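The requirements above can be read as a checklist applied to each dataset before replica creation. The following is a minimal sketch of that checklist in plain Python, not part of ArcGIS: the DatasetInfo class, its fields, and the replica type labels are hypothetical names chosen for illustration, and only the rules stated in this topic are encoded.

```python
from dataclasses import dataclass


@dataclass
class DatasetInfo:
    """Hypothetical summary of one dataset (not an ArcGIS type)."""
    name: str
    can_write: bool                    # connected user has write access
    has_global_id: bool                # dataset has a GlobalID column
    versioned: bool = True             # registered for traditional versioning
    moves_edits_to_base: bool = False  # versioned with move edits to base


def replication_problems(ds, replica_type):
    """Return the reasons ds fails the listed requirements (empty = eligible)."""
    problems = []
    if not ds.can_write:
        problems.append("no write access")
    if ds.versioned and ds.moves_edits_to_base:
        problems.append("versioned with move edits to base")
    # GlobalID is required for one-way and two-way replicas only.
    if replica_type in ("one-way", "two-way") and not ds.has_global_id:
        problems.append("no GlobalID column")
    return problems


def eligible_datasets(datasets, replica_type):
    """Keep the names of eligible datasets; raise if nothing qualifies."""
    names = [d.name for d in datasets
             if not replication_problems(d, replica_type)]
    if not names:
        raise ValueError("no dataset meets the requirements")
    return names
```

A pre-check like this mirrors the tool's behavior of dropping ineligible datasets and failing when none remain, which makes it easier to find out why a dataset is missing from a replica.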

The list of data to replicate is automatically expanded to include dependent datasets. For example, if any feature class in a topology or feature dataset is selected for replication, all feature classes in that topology or feature dataset are included. Additional rules and behaviors apply to certain types of data and geodatabase functionality when replicas are created.

Determine datasets to replicate

One of the most important aspects of replica creation is determining which data to replicate. When creating a replica, you can choose to replicate all the data in your datasets or only a subset of the data. Plan to replicate enough data to cover your requirements for the lifetime of the replica.

Metadata for the data you choose to replicate is copied during the replica creation process. However, changes to the metadata are not applied during replica synchronization.

Replicate all data

The Create Replica geoprocessing tool allows you to replicate all the data in the layers that you include in the replica.

Note:

For nonspatial tables, the default behavior is to replicate only the schema of the table. To replicate all records for a specific table, specify the SQL expression 1=1 as the definition query on the table, following the steps in Create a subset of data to be replicated below. To replicate all the records for all tables to the child geodatabase replica, use the All records for tables option in the Advanced Setting section of the Create Replica geoprocessing tool. To replicate a subset of records, set the appropriate SQL expression as the definition query.
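When the tool is run from a script, the table behavior described in this note might be set as sketched below. This is a hedged sketch: the keyword names and values (in_data, in_type, out_geodatabase, out_name, all_records_for_tables, ALL_RECORDS_FOR_TABLES, SCHEMA_ONLY_FOR_TABLES) are written as recalled from the Create Replica tool reference and should be verified there, and the helper function names are hypothetical.

```python
def create_replica_kwargs(datasets, child_gdb, replica_name,
                          all_table_records=True):
    """Build keyword arguments for arcpy.management.CreateReplica.

    Keyword names and values are assumptions to check against the
    Create Replica tool reference before use.
    """
    return {
        "in_data": list(datasets),
        "in_type": "ONE_WAY_REPLICA",
        "out_geodatabase": child_gdb,
        "out_name": replica_name,
        # Schema only is the default for tables; ask for all records here.
        "all_records_for_tables": ("ALL_RECORDS_FOR_TABLES"
                                   if all_table_records
                                   else "SCHEMA_ONLY_FOR_TABLES"),
    }


def run_create_replica(**kwargs):
    """Run the tool; needs the Python environment installed with ArcGIS Pro."""
    import arcpy  # imported lazily so the builder works without ArcGIS
    return arcpy.management.CreateReplica(**kwargs)
```

Keeping the argument builder separate from the tool call lets the settings be inspected or logged before any replica is created.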

Create a subset of data to be replicated

Sometimes, you may only want to replicate a subset of the features in the dataset. There are multiple ways to specify the subsets of data to be replicated:

  • Use definition queries.
  • Use a selection set.
  • Specify an extent.
  • Use geometry features.

Once the data to replicate has been determined from any filters applied, relationship class logic is applied if relationship classes exist. For each dataset involved in a relationship class, additional rows are added if they are related to the data already in the replica. See Replication and related data for more information.

Use definition queries

Definition queries are written in SQL syntax. They define a subset of features to work with in a layer by filtering which features are retrieved from the dataset and appear in the layer’s attribute table. To replicate a subset of features, first create a definition query for a layer in ArcGIS Pro.

Because definition queries are applied to the layers in the map and are not saved with the dataset in the geodatabase, you must either drag the layer from the map Contents pane to the Replica Datasets field in the Create Replica geoprocessing tool, or choose the layers from the Replica Datasets drop-down menu on the Create Replica geoprocessing tool.

Note:

Do not use the browse button to add the replica datasets. Definition queries from data in the map are not honored when the browse button is used.
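In a script there is no map, so a comparable filter could be sketched by creating a layer with a where clause and passing that layer to the Create Replica tool. The sketch below assumes that a layer created this way is treated like a map layer with a definition query. The field name INSULATION, its values, and the helper function names are hypothetical.

```python
def in_list_where(field, values):
    """Return a SQL expression such as: INSULATION IN ('PVC', 'XLPE')."""
    # Double any single quote inside a value so the SQL literal stays valid.
    quoted = ", ".join("'" + str(v).replace("'", "''") + "'" for v in values)
    return f"{field} IN ({quoted})"


def make_filtered_layer(feature_class, layer_name, where_clause):
    """Create a layer restricted by where_clause (requires ArcGIS Pro)."""
    import arcpy  # imported lazily so the SQL helper works without ArcGIS
    return arcpy.management.MakeFeatureLayer(
        feature_class, layer_name, where_clause)


# Hypothetical field and values for the cable example.
where = in_list_where("INSULATION", ["PVC", "XLPE"])
```

The resulting layer name, rather than the path to the feature class, would then be supplied as the replica dataset so that the filter travels with it.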

Create Replica geoprocessing tool displaying the drop-down option to select features in the map with definition queries applied

Use a selection set

Selecting features allows you to highlight a subset of features on your map to use in subsequent exploration or analysis of your data. After you have selected features, the selection sets of individual feature classes and tables can be replicated. From the Create Replica geoprocessing tool, use the Replica Datasets drop-down menu to ensure the selected datasets in the map are used within the replica.

Specify an extent

The Extent environment setting can be used to define the spatial extent of the data to be replicated. Only features that pass through the specified extent are processed and included in the replica.

The extent entered is assumed to be in the coordinate system in which the input data is stored, even if the Output Coordinate System environment is set. If the tool takes multiple input datasets, the first dataset defines the coordinate system of the extent.
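The effect of the extent can be approximated with bounding boxes: a feature is kept when its envelope overlaps the extent, and it is kept whole rather than clipped. The sketch below models that idea in plain Python and shows one way the environment might be set from a script. The bounding-box test and its treatment of touching edges are simplifying assumptions, not the tool's documented algorithm, and the function names are hypothetical.

```python
def envelopes_intersect(a, b):
    """True if envelopes a and b, given as (xmin, ymin, xmax, ymax), overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def features_in_extent(features, extent):
    """Return the names of features whose envelope overlaps the extent."""
    return [name for name, env in features.items()
            if envelopes_intersect(env, extent)]


def set_extent_environment(extent):
    """Apply the extent to later tool runs (requires ArcGIS Pro)."""
    import arcpy  # imported lazily so the model above works without ArcGIS
    # Coordinates must be in the coordinate system of the stored input data.
    arcpy.env.extent = arcpy.Extent(*extent)
```

For example, with an extent of (0, 0, 2, 2), a feature whose envelope is (0.5, 0.5, 9, 9) is included in full even though most of it lies outside the extent.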

Use geometry features

You can specify a layer that contains one or more features, and any data that intersects the geometry or aggregate geometries in the layer will be included in the replica. See the explanation of the Replica Geometry Features parameter in the next section for more information about how this can be used to define the replica geometry.

Example of data replication

The following example for maintenance work orders illustrates some of the default behavior for data replication.

A maintenance crew is preparing to inspect a residential area. To edit data in the field, the crew needs to replicate the part of the infrastructure that covers this residential area. To start the replication process, the spatial extent of the inspection area is identified using a spatial filter (in this case, the extent is set through the Extent environment setting).

Work area extent

The crew is to concentrate on cables that have been insulated with a particular material. To identify these cables, a query is applied to the relevant dataset.

Definition query applied to features in the work area extent

Finally, as each maintenance crew can expect to visit only a certain number of properties in a day, the homes in one residential block are identified by a definition query based on property numbers. This is shown as a selection below.

Selection set of homes impacted by the definition query in the work area

The selected features, features identified by a definition query, and features that intersect the chosen spatial extent will be replicated. Some additional features have also been included.

Data within the extent and definition query to be replicated
