Prepare data for replication

Available with Standard or Advanced license.

Distributing data across multiple geodatabases can improve data availability and performance. It helps alleviate server contention and allows organizations to balance the load on their geodatabases between users performing edits and users accessing the data for read-only operations.

Geodatabase replication is one of the data distribution workflows available in ArcGIS Pro. You have two geoprocessing tools to select from to create a geodatabase replica:

Prior to implementation, there are several aspects to consider to prepare your data for replication.

Geodatabase replication requirements

To be replicated, datasets must meet the following requirements:

  • The source (parent) geodatabase must be an enterprise geodatabase.
  • The database user connecting to the parent geodatabase must have write access to the data.
  • All datasets in the replica must be from the same enterprise geodatabase.
  • The enterprise geodatabase connection must be configured for traditional versioning; the connection cannot be a branch version connection.
  • If the data is registered for traditional versioning, it cannot be versioned with the option to move edits to base.

Additional replication requirements apply based on the replication type:

  • Checkout/Check-in replication
    • You have the option of checking out nonversioned data or data that has been registered with traditional versioning.
  • One-way and two-way replicas
    • Each dataset must have a GlobalID column. This column is used to maintain row uniqueness across geodatabases.
  • One-way replication
    • One-way, parent-to-child replication—The child replica can be an enterprise or file geodatabase.
    • One-way, child-to-parent replication—Both child and parent replicas must be hosted in an enterprise geodatabase.
    • One-way replication with the option to use archiving to track replica changes—The parent replica version must be the default geodatabase version. The data must be enabled for archiving before creating the replica.

Any dataset that does not meet these requirements is not included in the replica. If none of the datasets meet the requirements, replica creation fails. For additional details, see the Create Replica geoprocessing tool.
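The requirements above can be sketched as a pre-flight check. This is a minimal, pure-Python illustration and not part of the tool: the DatasetInfo class, its fields, and the replica type labels are hypothetical names chosen for this sketch, and the branch versioning rule (which the text states for the connection) is modeled here as a per-dataset value for simplicity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetInfo:
    """Hypothetical summary of one dataset (not an arcpy type)."""
    name: str
    user_can_write: bool
    versioning: str  # "none", "traditional", "traditional_move_to_base", or "branch"
    has_global_id: bool


def replication_problems(ds: DatasetInfo, replica_type: str) -> List[str]:
    """List the requirements from this section that the dataset fails.

    replica_type is "checkout", "one_way", or "two_way" (labels for this sketch).
    """
    problems = []
    if not ds.user_can_write:
        problems.append("connected user lacks write access")
    if ds.versioning == "branch":
        problems.append("branch versioning is not supported")
    if ds.versioning == "traditional_move_to_base":
        problems.append("versioned with the option to move edits to base")
    if replica_type in ("one_way", "two_way") and not ds.has_global_id:
        problems.append("missing GlobalID column")
    return problems


def split_eligible(datasets: List[DatasetInfo], replica_type: str):
    """Return (eligible names, {excluded name: reasons})."""
    eligible, excluded = [], {}
    for ds in datasets:
        problems = replication_problems(ds, replica_type)
        if problems:
            excluded[ds.name] = problems
        else:
            eligible.append(ds.name)
    if not eligible:
        # Mirrors the rule that replica creation fails when nothing qualifies.
        raise ValueError("no dataset meets the replication requirements")
    return eligible, excluded
```

A check like this only mirrors the rules listed in this section; the Create Replica tool performs its own validation.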

The list of data to replicate is automatically expanded to include dependent datasets. For example, all feature classes in a topology or feature dataset are included if any feature class in the topology or feature dataset is selected for replication. See the following for information about the types of data and geodatabase functionality for which additional rules and behaviors are applied when creating replicas:

Determine datasets to replicate

One of the most important aspects of replica creation is the determination of which data should be replicated. When creating a replica, you can choose to replicate all the data in your datasets or only a subset of data. Plan on replicating an appropriate amount of data for your needs. Consider the lifetime of the replica and make sure your requirements are covered.

Metadata for the data you choose to replicate is copied during the replica creation process. However, changes to the metadata are not applied during replica synchronization.

Replicate all data

The Create Replica geoprocessing tool allows you to replicate all the data in the layers that you include in the replica.

Note:

For nonspatial tables, the default behavior is to replicate only the schema of the table. To replicate all records for a specific table, specify the SQL expression 1=1 as the definition query on the table, following the steps in Create a subset of data to be replicated below. To replicate a subset of records, set the appropriate SQL expression instead. To replicate all records for all tables to the child geodatabase replica, use the All records for tables option, which can be found on the Create Replica geoprocessing tool under the Advanced Setting section.

Create a subset of data to be replicated

Sometimes, you may only want to replicate a subset of the features in the dataset. There are multiple ways to specify the subsets of data to be replicated:

  • Use definition queries.
  • Use a selection set.
  • Specify an extent.
  • Use geometry features.

Once data has been determined based on any filters used, relationship class logic is applied if relationship classes exist. For each dataset involved in a relationship class, additional rows are added if they are related to the data already in the replica. See Replication and related data for more information.

Use definition queries

Definition queries are written in SQL syntax and allow you to define a subset of features to work with in a layer by filtering which features are retrieved from the dataset and appear in the layer’s attribute table. To replicate a subset of features, follow these steps to first create a definition query for a layer in ArcGIS Pro.

Because definition queries are applied to the layers in the map and are not saved with the dataset in the geodatabase, you must either drag the layer from the map Contents pane to the Replica Datasets field in the Create Replica geoprocessing tool, or choose the layers from the Replica Datasets drop-down menu on the Create Replica geoprocessing tool.

Note:

Do not use the browse button to add the replica datasets. Definition queries from data in the map are not honored when the browse button is used.

Create Replica geoprocessing tool displaying the drop-down option to select features in the map with definition queries applied

Use a selection set

Selecting features allows you to highlight a subset of features on your map to use in subsequent exploration or analysis of your data. After you have selected features, the selection sets of individual feature classes and tables can be replicated. From the Create Replica geoprocessing tool, use the Replica Datasets drop-down menu to ensure the selected datasets in the map are used within the replica.

Specify an extent

The Extent environment setting can be used to define the spatial extent of data to be replicated. This setting will only process and include features that pass through the extent specified.

The extent entered is assumed to be in the coordinate system in which the input data is stored, even if the Output Coordinate System environment is set. If the tool takes multiple input datasets, the first dataset defines the coordinate system of the extent.
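As a rough illustration of extent filtering, the following sketch keeps only the features whose bounding box touches a given extent. It is a simplification under stated assumptions: the Box type and function names are hypothetical, and the comparison uses bounding boxes only, whereas the tool evaluates the actual feature geometry. In a script, the Extent environment itself would typically be set through arcpy.env.extent before running the tool.

```python
from typing import Dict, List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in the coordinate system of the input data."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Box, b: Box) -> bool:
    """True when the two rectangles overlap or touch."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def features_in_extent(features: Dict[str, Box], extent: Box) -> List[str]:
    """Names of features whose bounding box passes through the extent."""
    return [name for name, box in features.items() if intersects(box, extent)]
```

Note that the extent coordinates must be expressed in the coordinate system in which the input data is stored, as described above.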

Use geometry features

You can specify a layer that contains one or more features, and any data that intersects the geometry or aggregate geometries in the layer will be included in the replica. See the explanation of the Replica Geometry Features parameter in the next section for more information about how this can be used to define the replica geometry.

Create Replica geoprocessing tool parameters

The following describes the input parameters for the Create Replica geoprocessing tool.

Create Replica geoprocessing tool

  • Replica Datasets—To replicate a subset of data, apply definition queries and use the drop-down menu to add them to the replica. For other datasets, browse to and select the datasets or use the drop-down menu if the layers exist in your map.
  • Replica Type—Choose a checkout, one-way, one-way child-to-parent, or two-way replica.
  • Output Type—The output type for the data that will be replicated.
    • Geodatabase—Replicates the data to a geodatabase. This is the default.
    • Xml file—Replicates the data to an XML workspace document.
  • Geodatabase to replicate data to—This parameter is required if Output Type is Geodatabase and replicates to a local or remote destination geodatabase. Remote geodatabases can be accessed using geodata services running on an ArcGIS Server site. Browse to and select the geodatabase or geodata service to receive the data. If your replica type is checkout or one way, the destination can be a file geodatabase; otherwise, an enterprise geodatabase is required. Using this option to replicate data to a geodatabase allows you to create a replica in a connected environment. See how to create a checkout replica, one-way replica, or two-way replica for detailed workflow steps.
  • XML file to replicate data to—This parameter is required if Output Type is Xml file and outputs an XML workspace document as an XML file. The XML file option supports disconnected environments where you can send the XML workspace document to the destination and import it to complete replica creation. See how to create a replica in a disconnected environment for detailed workflow steps.

    When using the Create Replica geoprocessing tool, Output Type can now be set to either Geodatabase or Xml file. The Xml file option works well in disconnected environments.

  • Replica Name—Enter the name for the replica to be created.
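The dependencies between these parameters can be sketched as a small validation helper that assembles the arguments before the tool is run. This is an illustrative sketch only: the function name is hypothetical, the keyword strings and argument names are believed to match the arcpy.management.CreateReplica syntax but should be verified against the tool reference, and detecting a file geodatabase by its .gdb extension is a heuristic assumed for this example.

```python
def build_create_replica_args(datasets, replica_type, replica_name,
                              out_type="GEODATABASE",
                              out_geodatabase=None, out_xml=None):
    """Validate the output parameters and return them as a dict."""
    if out_type == "GEODATABASE":
        if not out_geodatabase:
            raise ValueError("a destination geodatabase is required")
        # Heuristic (assumption): treat paths ending in .gdb as file geodatabases.
        is_file_gdb = str(out_geodatabase).lower().endswith(".gdb")
        if is_file_gdb and replica_type not in ("CHECK_OUT", "ONE_WAY_REPLICA"):
            raise ValueError("this replica type needs an enterprise geodatabase")
    elif out_type == "XML_FILE":
        if not out_xml:
            raise ValueError("an output XML file is required")
    else:
        raise ValueError("out_type must be GEODATABASE or XML_FILE")

    return {
        "in_data": list(datasets),
        "in_type": replica_type,
        "out_geodatabase": out_geodatabase if out_type == "GEODATABASE" else None,
        "out_name": replica_name,
        "out_type": out_type,
        "out_xml": out_xml if out_type == "XML_FILE" else None,
    }
```

In an ArcGIS Pro Python environment, the resulting dictionary could then be passed on as keyword arguments, for example arcpy.management.CreateReplica(**args), assuming the argument names above match the installed version.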

Advanced settings

The following sections describe the advanced settings for the Create Replica geoprocessing tool.

Replica Access Type

When creating a replica, there are two options to choose from for a feature information model:

  • Full model—With the full model, all simple and complex data types, such as topologies, annotation, and dimension feature classes, are replicated to the child geodatabase and versioned. This is the default.

    The full model assumes features are stored with the same feature type in both the parent and child replica geodatabases. For example, if a feature class in the parent replica is a junction feature class in a network, the corresponding feature class in the child geodatabase must also be a junction feature class.

    This option is for use with editing applications based on ArcGIS client software.

  • Simple model—In the simple model, the child geodatabase contains only simple features. The simple model does not replicate topologies, network datasets, annotation, and dimension feature classes, nor does it version data on the child geodatabase. If needed, you can version data on the child geodatabase after the replica is created.

    Nonsimple features in the parent geodatabase (for example, parcel fabrics) are converted to simple features in the child geodatabase during replication. Additional processing is applied during synchronization to account for the difference in feature types and characteristics in the two geodatabases.

    This option must be used when the child geodatabase is designed for editing by simple feature editors, including third-party editors that are not based on ArcGIS client software. It can also be used to simplify the data model on the child geodatabase for use during one-way replication or with editors based on ArcGIS or ArcGIS client software.

    The simple model provides the following benefits:

    • Allows you to edit the child geodatabase with simple feature editors that are not based on Esri software.
    • For one-way replicas, data on the child geodatabase is not versioned, which enables easy integration with non-Esri applications.

Note:

In the case of topology, when using the simple model, the topology object is excluded from the child geodatabase; however, all participating feature classes are included. Once you create the replica, sending changes from the child geodatabase to the parent geodatabase automatically maintains features on the parent version. Additional processing occurs on the parent version to update topology dirty areas.

Expand Feature Classes and Tables

The options for this setting specify whether the replica will include the tables that are part of any extended dataset type—such as a topology, relationship class, or network—in which the feature classes or tables in the replica participate.

  • Use defaults—Adds the feature classes and tables that are part of associated extended dataset types. The default for feature classes is to replicate all features intersecting the spatial filter. If no spatial filter has been provided, all features are included. The default for tables is to replicate the schema only.
  • Add with schema only—Adds the schemas of the feature classes and tables in the extended datasets but not the data for them.
  • All rows—Adds all rows for the feature classes and tables in the extended datasets.
  • Do not add—Doesn't add the feature classes and tables from associated extended datasets.

Replicate Related Data

The options for this setting specify whether to replicate rows related to rows already in the replica. For example, consider a feature (f1) inside the replication filter and a related feature (f2) from another class outside the filter. Feature f2 is included in the replica if you choose to get related data.

  • Do not get related—Do not replicate related rows.
  • Get related—Replicate related data. This is the default.

Replica Geometry Features

The Replica Geometry Features option can be used to define the replica geometry.

  • The replica geometry features can be points, lines, or polygons.
  • A feature layer used for the replica geometry features can contain one or more features. If there is more than one, the geometries are merged, and only data that intersects the merged geometries is replicated.
  • If filters (such as definition query) have been defined on the replica geometry features, only features that satisfy these filters will be used to define the replica geometry.
  • You can also use the Extent environment setting to define the replica geometry.
    • If Replica Geometry Features is set, it will be used as the replica geometry.
    • If Replica Geometry Features is not set, the Extent environment is used as the replica geometry.
    • If both Replica Geometry Features and the Extent environment are set, the Replica Geometry Features setting will be used.
    • If neither Replica Geometry Features nor the Extent environment is set, the full extent of the data is used.
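The precedence rules above can be summarized in a few lines. This is a sketch of the decision logic only; the function name and return labels are hypothetical and do not correspond to any arcpy call.

```python
def resolve_replica_geometry(geometry_features=None, extent=None):
    """Return (source, value) for the geometry that defines the replica area."""
    if geometry_features is not None:
        # Replica Geometry Features wins, whether or not an extent is also set.
        return ("geometry_features", geometry_features)
    if extent is not None:
        return ("extent", extent)
    # Neither is set: the full extent of the data is used.
    return ("full_extent", None)
```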

Register existing data only

Creating a replica involves copying data from the source geodatabase to a target geodatabase and registering a replica in each geodatabase to describe the data that has been replicated. Copying the data and registering it can be time consuming for large datasets.

The Register existing data only option when creating a replica is available for users with large datasets or those that have identical data in two different geodatabases. This creates the replica versions needed to synchronize changes between the geodatabases but does not go through the lengthy process of copying data since it already exists in both locations.

The Register existing data only option can be found on the Create Replica geoprocessing tool under Advanced Setting.

  • Checked—If Register existing data only is checked, it is assumed that the data already exists in the child geodatabase and will be used to register the replica.
    Note:

    If the Register existing data only option is checked, then the All records for tables option will not be available.

  • Unchecked—If Register existing data only is left unchecked, which is the default, data in the parent geodatabase will be copied to the child geodatabase.

Note:

All of the following requirements must be met before using the Create Replica geoprocessing tool with the Register existing data only option. During replica creation, the only verifications made are the geodatabase replication requirements, that dataset names match, and that the datasets are owned by the user connected to the child geodatabase. If any other requirement is not met, the replica is still created, but errors will be encountered when you attempt to synchronize it.

Prior to using the Create Replica geoprocessing tool with the Register existing data only option, datasets in the child (target) geodatabase must meet the following requirements:
  • Meet the geodatabase replication requirements including the additional replication requirements that apply based on the replication type selected.
  • For one-way, child-to-parent replicas and two-way replicas, the data on the child replica must be registered as versioned.
  • Be owned by the user that is connected to the child geodatabase.
  • Have the same names as the datasets in the parent database.
  • Have the same schema, rules, relationships, and properties as the datasets in the parent database.
  • Have the same geometry types as the datasets in the parent database.
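Because the tool verifies only some of these requirements, it can help to compare the two geodatabases beforehand. The sketch below is a partial illustration using plain dictionaries: the function name and the dictionary layout (owner, geometry_type, fields) are assumptions made for this example, and rules, relationships, and other properties are not compared. In practice the inputs would have to be gathered from each geodatabase by some other means.

```python
from typing import Dict, List


def registration_mismatches(parent: Dict[str, dict], child: Dict[str, dict],
                            child_user: str) -> List[str]:
    """Compare child datasets against parent datasets by name.

    Each value is a dict with "owner", "geometry_type", and "fields" keys
    (a hypothetical layout for this sketch).
    """
    issues = []
    for name, p in parent.items():
        c = child.get(name)
        if c is None:
            issues.append(f"{name}: missing in child geodatabase")
            continue
        if c["owner"] != child_user:
            issues.append(f"{name}: not owned by {child_user}")
        if c["geometry_type"] != p["geometry_type"]:
            issues.append(f"{name}: geometry type differs")
        # Field names stand in for the schema here; order is ignored.
        if sorted(c["fields"]) != sorted(p["fields"]):
            issues.append(f"{name}: field list differs")
    return issues
```

An empty result means only that the checks in this sketch passed, not that every requirement in the list above is satisfied.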

Tips

Keep the following in mind when using the Register existing data only option:

  • If Global IDs are a requirement for the replication type, you must make sure to add Global IDs to the data before you copy it to the other geodatabase. If you use functionality within ArcGIS to copy the data, make sure to use either copy and paste, or XML workspace export and import.
  • You must be connected as the owner of the data on the target geodatabase when creating the replica.
  • Whatever filters are applied during the replica creation process also are applied to the data in the relative geodatabase.

Limitations

It's important to be aware of the following limitations when using the Register existing data only option:

  • If the Register existing data only option is checked in the Create Replica geoprocessing tool, there is no option to select the matching dataset in the child geodatabase. Before checking this option, you must manually ensure that the datasets in the child geodatabase have been configured properly and meet all the geodatabase replication requirements.
  • When using the Register existing data only option, it is assumed that the data is identical in both geodatabases. Any differences that exist between the datasets in the parent and child replica at the time the replica is created will not be synchronized. If any layers are missing in the target geodatabase, the Create Replica geoprocessing tool fails and returns an error message.

All records for tables

During the replica creation process, the data and schema of the datasets being replicated are copied from the source geodatabase to a target geodatabase, and a replica is created in each geodatabase. The data is defined as the rows and columns in the table, and the schema consists of the fields, domains, subtypes, and other properties that describe the replicated data.

For datasets, the default behavior is to replicate both the data and schema. For tables, the default behavior is to replicate only the schema of the table.

All records for tables can be used to specify whether all records or only the schema will be copied to the child geodatabase for tables that do not have filters applied (such as selections or definition queries).

The All records for tables option can be found on the Create Replica geoprocessing tool under Advanced Setting.

  • Checked—If All records for tables is checked, all records will be copied to the child geodatabase replica for tables with no applied filters. This option will override the Expand Feature Classes and Tables parameter value.
    Note:

    The All records for tables option will not be available if the Register existing data only option is checked.

  • Unchecked—If All records for tables is left unchecked, only the schema will be copied to the child geodatabase for tables with no applied filters. Filters applied to tables will be honored. This is the default.
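The behavior for a single table can be sketched as the decision function below. It is an illustration of the rules described in this section, not of the tool's implementation: the function name and return labels are hypothetical, and the interaction with the Expand Feature Classes and Tables parameter is not modeled.

```python
def table_copy_mode(has_filter: bool,
                    all_records_for_tables: bool = False,
                    register_existing_data: bool = False) -> str:
    """Describe what is copied to the child geodatabase for one table."""
    if register_existing_data and all_records_for_tables:
        # The two options are not available together.
        raise ValueError("All records for tables is unavailable when "
                         "Register existing data only is checked")
    if register_existing_data:
        # Data is assumed to exist in the child already, so nothing is copied.
        return "not_copied"
    if has_filter:
        # Selections and definition queries on the table are honored.
        return "filtered_rows"
    return "all_rows" if all_records_for_tables else "schema_only"
```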

Example of data replication

The following example, based on maintenance work orders, illustrates some of the default behavior for data replication.

A maintenance crew is preparing to do some inspections in a residential area. To do some field editing, the crew needs to replicate that part of the infrastructure that covers this residential area. To start the replication process, the spatial extent of the inspection area is identified using a spatial filter (in this case, the extent is set via the environment setting).

Work area extent

The crew is to concentrate on cables that have been insulated with a particular material. To identify these cables, a query is applied to the relevant dataset.

Definition query applied to features in the work area extent

Finally, as each maintenance crew can expect to visit only a certain number of properties in a day, the homes in one residential block are identified by a definition query based on property numbers. This is shown as a selection below.

Selection set of homes impacted by the definition query in the work area

The selected features, features identified by a definition query, and features that intersect the chosen spatial extent will be replicated. Some additional features have also been included.

Data within the extent and definition query to be replicated
