Updating and fixing data sources

There are numerous reasons why data sources need to be repaired or redirected to different locations. The Catalog View in ArcGIS Pro has capabilities for updating data sources. However, making these changes manually in every affected map or project can be overwhelming. Methods are available with the arcpy.mp scripting environment that make it possible to automate these changes without having to open a project.

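Workflows that change data sources rely on the updateConnectionProperties function, the connectionProperties dictionary, and Python CIM access, each of which is described below.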

Using the updateConnectionProperties function

The two required parameters for the updateConnectionProperties function are as follows:

  • current_connection_info—A string that represents the workspace path or a Python dictionary that contains connection properties to the source you want to update.
  • new_connection_info—A string that represents the workspace path or a Python dictionary that contains connection properties with the new source information.

The updateConnectionProperties function can be thought of as a find-and-replace function with which you replace the current_connection_info parameter with the new_connection_info parameter. Each of these parameters can be a full path to a workspace, a partial string, a dictionary that contains connection properties, a partial dictionary that defines specific keys, or a path to a database connection (.sde) file. If an empty string or None is used for current_connection_info, all connection properties are replaced with new_connection_info, subject to the value of the validate parameter.

Tip:

Do not include the names of geodatabase feature datasets in the current_connection_info or the new_connection_info parameters. Feature datasets are part of the workspace.

The auto_update_joins_and_relates parameter allows you to control whether joins and relates associated with a layer or table should be updated. The default is set to True. There may be times, especially when updating all data sources at the project level, that you do not want these associated sources to be updated. If that is the case, set this parameter to False.

By default, the updateConnectionProperties method only updates a data source if the new_connection_info is a valid data source. If the validate parameter is set to False, the data source is set to that location regardless of whether it exists. This can be useful for scenarios that require data sources to be updated ahead of the data being created. In these cases, the data appears broken in the associated maps.
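
As a minimal sketch of how these two optional parameters might be passed, the hypothetical helper below (repoint_ahead_of_data is not part of arcpy) forwards them by keyword. It assumes that target is any arcpy.mp object that exposes updateConnectionProperties, such as a project or a map; the paths in the usage comment are placeholders.

```python
def repoint_ahead_of_data(target, current_info, new_info):
    """Point `target` at a source that may not exist yet.

    `target` is assumed to be an arcpy.mp object that exposes
    updateConnectionProperties (for example a project or a map).
    """
    target.updateConnectionProperties(
        current_info,
        new_info,
        auto_update_joins_and_relates=False,  # leave joins and relates untouched
        validate=False,                       # accept a source that is not there yet
    )

# Hypothetical usage inside ArcGIS Pro's Python environment:
# import arcpy
# aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
# repoint_ahead_of_data(aprx, 'Historic_Data', 'New_Data')
# aprx.saveACopy(r'C:\Projects\YosemiteNP\YosemiteNew.aprx')
```

Layers updated this way are expected to appear broken until the data is created at the new location.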

If no matches are found when you replace the current_connection_info parameter with the new_connection_info parameter in the updateConnectionProperties function, your script may complete, but nothing will be updated.
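
Because a run that matches nothing can finish without any sign of a problem, one option is to compare each layer's data source before and after the call. The sketch below is one way to do that under stated assumptions: snapshot_sources and changed_sources are hypothetical helper names, and aprx is assumed to be an open project. Layers are keyed by map name and long name, so layers that share both would overwrite each other in the snapshot.

```python
def snapshot_sources(aprx):
    """Return {(map name, layer long name): dataSource} for layers that have one."""
    sources = {}
    for m in aprx.listMaps():
        for lyr in m.listLayers():
            if lyr.supports("DATASOURCE"):
                sources[(m.name, lyr.longName)] = lyr.dataSource
    return sources


def changed_sources(before, after):
    """Return the keys whose data source differs between two snapshots."""
    return sorted(key for key in after if before.get(key) != after[key])

# Hypothetical usage:
# before = snapshot_sources(aprx)
# aprx.updateConnectionProperties('Historic_Data', 'New_Data')
# changed = changed_sources(before, snapshot_sources(aprx))
# if not changed:
#     print('No data sources matched; nothing was updated.')
```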

To change a layer's dataset to a feature class with a different name, see the Changing a layer's dataset section below.

Common updateConnectionProperties examples

  1. The following script changes the full path to a file geodatabase data source for all layers and tables in a project. In this example, a folder was renamed and all vector data was moved to this new location:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(r'C:\Projects\YosemiteNP\Historic_Data\Yosemite.gdb',
                                    r'C:\Projects\YosemiteNP\New_Data\Yosemite.gdb')
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  2. The following example is similar to the one above but uses partial path strings to update the folder that the data resides in. All occurrences of Historic_Data in the workspace path are replaced with New_Data. When you use a partial string, make sure it occurs only once in the path; otherwise, every occurrence is replaced and the results may not be what you expect. If in doubt, use full path strings instead.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(current_connection_info='Historic_Data',
                                    new_connection_info='New_Data')
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  3. The following example replaces a personal geodatabase connection with a file geodatabase connection using a partial path for all layers and tables in a map:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    m = aprx.listMaps("Yose*")[0]
    m.updateConnectionProperties(current_connection_info='Background.mdb',
                                 new_connection_info='Background_fGDB.gdb')
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  4. The following example references a layer in a map and uses those connection properties to update the connection properties for the same layer in a layer file that has not been updated with the new data source:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    m = aprx.listMaps('Yose*')[0]
    lyr = m.listLayers('Ranger Stations')[0]
    lyrFile = arcpy.mp.LayerFile(r'C:\Projects\YosemiteNP\LYRXs\Yosemite\OperationalLayers.lyrx')
    
    for l in lyrFile.listLayers():
      if l.name == 'Ranger Stations':
        l.updateConnectionProperties(l.connectionProperties, lyr.connectionProperties)
    
    lyrFile.save()
  5. In this example, a feature class is moved to a new file geodatabase. The script searches all the broken layers in a project that referenced the feature class. It then updates the layer's data source to the new file geodatabase.

    import arcpy, os
    
    featureClass = 'Boundary'
    oldFileGDB = r'C:\Projects\YosemiteNP\Yosemite.gdb'
    newFileGDB = r'C:\Projects\YosemiteNP\YosemiteNew.gdb'
    
    # Reference project
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    
    # Search for specific broken layers
    for m in aprx.listMaps():
        for lyr in m.listLayers():
            if lyr.isBroken:
                if lyr.supports("DATASOURCE"):
                    if lyr.dataSource == os.path.join(oldFileGDB, featureClass):
                        lyr.updateConnectionProperties(oldFileGDB, newFileGDB)
    
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  6. The following example directs all data sources to a single file geodatabase by specifying current_connection_info to be None. This can be useful in situations in which multiple data sources have been consolidated into a single workspace.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(None, r'C:\Projects\YosemiteNP\New_Data\Yosemite.gdb')
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  7. In this example, a layer's URL needs to be updated. The script searches all the layers in a project that referenced the old URL. It then updates the layer's data source to the new URL.

    import arcpy
    
    oldURL = 'https://services2.arcgis.com/k4wsDILUIGeQ5HvW/arcgis/rest/services/USAFederalLands/FeatureServer'
    newURL = 'https://services2.arcgis.com/k4wsDILUIGeQ5HvW/arcgis/rest/services/USAFederalLandsUpdated/FeatureServer'
    
    # Reference project
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    
    # Search all layers for the old URL
    for m in aprx.listMaps():
        for lyr in m.listLayers():
            if lyr.supports("DATASOURCE"):
                if oldURL in lyr.dataSource:
                    lyr.updateConnectionProperties(oldURL, newURL)
    
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
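
The caution about partial strings in example 2 can be checked before an update is run. The sketch below is plain Python and models the replacement as a simple substring substitution, which is an assumption based on the description above rather than a statement about how arcpy is implemented. The helper names occurrences and preview_replace, and the sample path, are hypothetical.

```python
def occurrences(path, fragment):
    """Count how many times `fragment` appears in `path`."""
    return path.count(fragment)


def preview_replace(path, fragment, replacement):
    """Show what a plain substring substitution would produce."""
    return path.replace(fragment, replacement)


path = r'C:\Projects\Data\YosemiteNP\Historic_Data\Yosemite.gdb'

# 'Data' appears twice, so it is too loose to be a safe search string.
print(occurrences(path, 'Data'))
print(preview_replace(path, 'Data', 'New_Data'))

# 'Historic_Data' appears once, so only the intended folder changes.
print(occurrences(path, 'Historic_Data'))
print(preview_replace(path, 'Historic_Data', 'New_Data'))
```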

Examples of updating the data sources for enterprise geodatabase layers

  1. The following example replaces a file geodatabase connection with a path to an enterprise geodatabase connection (.sde) file for all layers and tables in a project:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(r'C:\Projects\YosemiteNP\Vector_Data\Yosemite.gdb',
                                    r'C:\Projects\YosemiteNP\DBConnections\Server.sde')
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  2. The following example replaces the connection properties from an enterprise geodatabase connection file in the current_connection_info parameter with a new enterprise geodatabase connection file in the new_connection_info parameter:

    Note:

    The enterprise geodatabase connection file specified in the current_connection_info parameter does not need to be the actual connection file used to create the layer. Rather, the connection properties contained within the connection file will be used in the updateConnectionProperties find-and-replace functionality.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(r'C:\Projects\YosemiteNP\DBConnections\TestGDB.sde',
                                    r'C:\Projects\YosemiteNP\DBConnections\ProductionGDB.sde')
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  3. The following example replaces an enterprise geodatabase connection file with a path to a file geodatabase for all layers and tables in a project:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(r'C:\Projects\YosemiteNP\DBConnections\Server.sde',
                                    r'C:\Projects\YosemiteNP\Local_Data\YosemiteLocal.gdb')
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  4. The following example directs all data sources to a single enterprise geodatabase by specifying current_connection_info to be None. This can be useful in situations in which the credentials of the enterprise geodatabase layers in the project are unknown or unavailable.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(None, r'C:\Projects\YosemiteNP\DBConnections\Server.sde')
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
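
A connection change such as the move from a test to a production geodatabase often applies to more than one project, so the project-level calls above can be wrapped in a loop. The following is only a sketch: project_files and repoint_projects are hypothetical helpers, the '_updated' suffix is an arbitrary naming choice, and the project opener is passed in as a parameter so that the function itself does not import arcpy.

```python
from pathlib import Path


def project_files(root):
    """Return the .aprx files under `root`, skipping copies made by this sketch."""
    return sorted(p for p in Path(root).rglob('*.aprx')
                  if not p.stem.endswith('_updated'))


def repoint_projects(paths, current_info, new_info, open_project):
    """Apply one find-and-replace to each project and save an '_updated' copy.

    `open_project` turns a path string into a project object; inside
    ArcGIS Pro that would be arcpy.mp.ArcGISProject.
    """
    saved = []
    for path in map(Path, paths):
        aprx = open_project(str(path))
        aprx.updateConnectionProperties(current_info, new_info)
        copy_path = path.with_name(path.stem + '_updated.aprx')
        aprx.saveACopy(str(copy_path))
        saved.append(copy_path)
    return saved

# Hypothetical usage inside ArcGIS Pro's Python environment:
# import arcpy
# repoint_projects(project_files(r'C:\Projects'),
#                  r'C:\Projects\YosemiteNP\DBConnections\TestGDB.sde',
#                  r'C:\Projects\YosemiteNP\DBConnections\ProductionGDB.sde',
#                  arcpy.mp.ArcGISProject)
```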

Using the connectionProperties dictionary

Using connectionProperties for updating data sources requires that you work with a dictionary of connection properties. The dictionary that is returned varies depending on whether it is a file-based workspace or a database connection, or whether the layer or table has associated joins or relates. It is because of this variability that it is important to understand the different types of connection properties and how to navigate the dictionaries to make the appropriate changes. For example, a layer with a join returns a different result than the same layer without a join. The approach to updating connection property dictionaries is to reference and retrieve the dictionary from a layer or table, make the necessary changes to it, then set the modified dictionary back to the layer or table you want to update using the updateConnectionProperties method.

A good way to display the dictionary structure is to use the Python pprint function.

import arcpy, pprint
p = arcpy.mp.ArcGISProject('current')
m = p.listMaps()[0]
l = m.listLayers()[0]
pprint.pprint(l.connectionProperties)

For example, a file-based data source with no joins or relates will look like the following:

{'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\Yosemite.gdb'}, 
 'dataset': 'RangerStations', 
 'workspace_factory': 'File Geodatabase'}

The above example is the most basic structure. A dictionary with three keys is returned. The value for the connection_info key is another dictionary that contains a path to the database.

Here are several examples of using the connectionProperties dictionary:

  1. The following example updates the data source's dataset name from RangerStations to RangerStationsNew. It also updates the geodatabase from Yosemite.gdb to YosemiteNew.gdb.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    lyr = aprx.listMaps("Main*")[0].listLayers("Ranger Stations")[0]
    find_dict = {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\Yosemite.gdb'}, 
                 'dataset': 'RangerStations', 
                 'workspace_factory': 'File Geodatabase'}
    replace_dict = {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\YosemiteNew.gdb'}, 
                    'dataset': 'RangerStationsNew', 
                    'workspace_factory': 'File Geodatabase'}
    lyr.updateConnectionProperties(find_dict, replace_dict)
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  2. The script above can also be rewritten as follows:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    lyr = aprx.listMaps("Main*")[0].listLayers("Ranger Stations")[0]
    cp = lyr.connectionProperties
    cp['connection_info']['database'] = 'C:\\Projects\\YosemiteNP\\Data\\YosemiteNew.gdb'
    cp['dataset'] = 'RangerStationsNew'
    lyr.updateConnectionProperties(lyr.connectionProperties, cp)
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  3. The connectionProperties dictionary can also be used to update file-based data sources, such as shapefiles, raster files, and so on. The following example changes the data source of a layer to point to a new shapefile in a different folder.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    lyr = aprx.listMaps('Main*')[0].listLayers('RoadsShp')[0]
    cp = lyr.connectionProperties
    cp['connection_info']['database'] = 'C:\\Projects\\YosemiteNP\\Data_New'
    cp['dataset'] = 'NewRoads.shp'
    lyr.updateConnectionProperties(lyr.connectionProperties, cp)
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
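
Examples 2 and 3 read lyr.connectionProperties twice, once to edit and once as the search value. An alternative that keeps the find and replace dictionaries explicitly separate is to build the replacement as a deep copy. The sketch below is plain Python; with_new_source is a hypothetical helper and only handles the simple three-key structure with no joins.

```python
import copy


def with_new_source(cp, database=None, dataset=None):
    """Return a modified copy of a simple (join-free) connectionProperties dict."""
    new_cp = copy.deepcopy(cp)  # deep copy so the nested connection_info is not shared
    if database is not None:
        new_cp['connection_info']['database'] = database
    if dataset is not None:
        new_cp['dataset'] = dataset
    return new_cp


current = {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\Yosemite.gdb'},
           'dataset': 'RangerStations',
           'workspace_factory': 'File Geodatabase'}
replacement = with_new_source(current, dataset='RangerStationsNew')

print(current['dataset'])      # the original dictionary is left unchanged
print(replacement['dataset'])

# Hypothetical usage with a layer:
# lyr.updateConnectionProperties(current, replacement)
```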

Using the connectionProperties dictionary with enterprise geodatabase data

Below is an example of an enterprise geodatabase data source connectionProperties dictionary for a layer:

{'connection_info': {'authentication_mode': 'OSA',
                     'database': 'TestDB',
                     'db_connection_properties': 'TestServer',
                     'dbclient': 'sqlserver',
                     'instance': 'sde:sqlserver:TestServer',
                     'password': '*********',
                     'server': 'TestServer',
                     'user': 'User',
                     'version': 'sde.DEFAULT'},
 'dataset': 'TestDB.USER.RangerStations',
 'workspace_factory': 'SDE'}

The same three keys are returned as a file geodatabase layer, but this time the connection_info value is a dictionary with a larger set of database connection properties. Any of these properties can be modified.

The following example changes the enterprise geodatabase instance and server for all layers in a project. In this example, the enterprise geodatabase uses operating system authentication, and the database name is the same. If the usernames and passwords are the same, the instance and server can be changed without knowing the credentials of layers in the project and without having to create new enterprise geodatabase connection files.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
find_dict = {'connection_info': {'db_connection_properties': 'TestServer',
                                 'instance': 'sde:sqlserver:TestServer',
                                 'server': 'TestServer'}}
replace_dict = {'connection_info': {'db_connection_properties': 'ProdServer',
                                    'instance': 'sde:sqlserver:ProdServer',
                                    'server': 'ProdServer'}}
aprx.updateConnectionProperties(find_dict, replace_dict)
aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
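
If the same server swap has to be repeated for several server names, the two partial dictionaries can be generated rather than typed by hand. This is a sketch only: server_swap_dicts is a hypothetical helper, and the key names and the instance pattern are copied from the SQL Server example above, so other database platforms may need different values.

```python
def server_swap_dicts(old_server, new_server, dbclient='sqlserver'):
    """Build partial find/replace dictionaries for a server change.

    The key names and the 'sde:<dbclient>:<server>' instance pattern are
    taken from the SQL Server example; other platforms may differ.
    """
    def info(server):
        return {'connection_info': {'db_connection_properties': server,
                                    'instance': f'sde:{dbclient}:{server}',
                                    'server': server}}
    return info(old_server), info(new_server)


find_dict, replace_dict = server_swap_dicts('TestServer', 'ProdServer')
print(find_dict)
print(replace_dict)

# Hypothetical usage:
# aprx.updateConnectionProperties(find_dict, replace_dict)
```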

Using the connectionProperties dictionary with joins

The connectionProperties dictionary will also show the properties of any joins that are present on the layer. Any of these properties can be modified.

The example below shows a file-based data source connectionProperties dictionary with one join:

{'cardinality': 'one_to_many',
 'destination': {'connection_info': {'database': 'C:\\Projects\\FGDB.gdb'},
                 'dataset': 'tabular_eco',
                 'workspace_factory': 'File Geodatabase'},
 'foreign_key': 'ECO_CODE',
 'join_forward': False,
 'join_type': 'left_outer_join',
 'primary_key': 'CODE',
 'source': {'connection_info': {'database': 'C:\\Projects\\FGDB.gdb'},
            'dataset': 'mex_eco',
            'workspace_factory': 'File Geodatabase'}}

This example shows a file-based data source connectionProperties dictionary with two joins:

{'cardinality': 'one_to_many', 
 'destination': {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'},                 
                 'dataset': 'census2000',                 
                 'workspace_factory': 'File Geodatabase'}, 
 'foreign_key': 'State_Polygons.State_Name', 
 'join_forward': False, 
 'join_type': 'left_outer_join',
 'primary_key': 'STATE_NAME', 
 'source': {'cardinality': 'one_to_many',
            'destination': {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'},
                            'dataset': 'census2010',                            
                            'workspace_factory': 'File Geodatabase'},
            'foreign_key': 'State_Name',            
            'join_forward': False,            
            'join_type': 'left_outer_join',            
            'primary_key': 'STATE_NAME',            
            'source': {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'},
                       'dataset': 'State_Polygons',                       
                       'workspace_factory': 'File Geodatabase'}}}

When joins are associated with a layer or table, the connectionProperties dictionary structure changes. You no longer have the same three root level keys, as you saw in previous examples. To understand why this is different, you need to understand how joins are persisted. Joins are nested. For example, if table one and table two are joined to a layer, table one is joined to the layer and table two is joined to the combination of the layer and table one. The root level dictionary describes the second join first. From the second join's source, you can trace the connection to the original layer and table one.
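
The nesting described above can be unwound with a short loop. The sketch below is plain Python and uses a trimmed version of the two-join dictionary (only the dataset keys are kept); unnest_joins is a hypothetical helper that returns the base layer's properties and the join destinations in the order the joins were applied.

```python
def unnest_joins(cp):
    """Return (base_source, joins) with joins in the order they were applied."""
    joins = []
    node = cp
    # Each join level has both 'source' and 'destination'; the base layer has neither.
    while 'source' in node and 'destination' in node:
        joins.append(node['destination'])
        node = node['source']
    joins.reverse()  # the root describes the last join, so reverse to get first-to-last
    return node, joins


# Trimmed two-join structure: census2010 was joined first, then census2000.
two_joins = {
    'destination': {'dataset': 'census2000'},
    'source': {
        'destination': {'dataset': 'census2010'},
        'source': {'dataset': 'State_Polygons'},
    },
}
base, joins = unnest_joins(two_joins)
print(base['dataset'])
print([j['dataset'] for j in joins])
```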

Here are several examples of using the connectionProperties dictionary with joins:

  1. The following example modifies the foreign key of a join for a specific layer:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\Mexico\MexicoEcology.aprx')
    mexLyr = aprx.listMaps('Layers')[0].listLayers('mex_eco')[0]
    conProps = mexLyr.connectionProperties
    conProps['foreign_key'] = 'ECO_CODE_NEW'
    mexLyr.updateConnectionProperties(mexLyr.connectionProperties, conProps)
    aprx.saveACopy(r"C:\Projects\Mexico\MexicoEcologyNew.aprx")
  2. A partial dictionary can also be used in the updateConnectionProperties method. The following example modifies the join properties for all layers in the project that use the specified foreign key:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\Mexico\MexicoEcology.aprx')
    aprx.updateConnectionProperties({'foreign_key': 'ECO_CODE'}, {'foreign_key': 'ECO_CODE_NEW'})
    aprx.saveACopy(r"C:\Projects\Mexico\MexicoEcologyNew.aprx")
  3. The following example modifies the source database and dataset for the primary layer both tables are joined to without changing the connection information for the joins:

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    lyr = aprx.listMaps("Main*")[0].listLayers("State_Polygons")[0]
    conProp = lyr.connectionProperties
    conProp['source']['source']['connection_info']['database'] = 'C:\\Projects\\YosemiteNP\\Vector_Data\\Census.gdb'
    conProp['source']['source']['dataset'] = 'States'
    lyr.updateConnectionProperties(lyr.connectionProperties, conProp)
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  4. The following example will create a join inventory of all the joins on all the layers in a map. This example employs Python recursive function logic to handle layers that have no joins or any number of joins:

    import arcpy
    
    def ListJoinsConProp(cp, join_count=0):
        if 'source' in cp:
            if 'destination' in cp:
                print(' '*6, 'Join Properties:')
                print(' '*9, cp['destination']['connection_info'])
                print(' '*9, cp['destination']['dataset'])
                join_count += 1
                return ListJoinsConProp(cp['source'], join_count)
        else:
            if join_count == 0:
                print(' '*6, '- no join')
    
    aprx = arcpy.mp.ArcGISProject(r"C:\Projects\Mexico\MexicoEcology.aprx")
    m = aprx.listMaps()[0]
    for lyr in m.listLayers():
        print(f"LAYER: {lyr.name}")
        if lyr.supports("dataSource"):
            cp = lyr.connectionProperties
            if cp is not None:
                ListJoinsConProp(cp)
  5. The following example will display the connection properties of layers in the map. Similar to the example above, this example also employs Python recursive function logic to handle layers that have no joins or any number of joins:

    import arcpy
    
    def ConPropsWithJoins(cp):
        if 'source' in cp:
            return ConPropsWithJoins(cp['source'])
        else:
            print(' '*6, 'database:', cp['connection_info']['database'])
            print(' '*6, 'dataset:', cp['dataset'])
            print(' '*6, 'workspace_factory:', cp['workspace_factory'])
    
    aprx = arcpy.mp.ArcGISProject(r"C:\Projects\Mexico\MexicoEcology.aprx")
    m = aprx.listMaps()[0]
    for lyr in m.listLayers():
        print(f"LAYER: {lyr.name}")
        if lyr.supports("dataSource"):
            cp = lyr.connectionProperties
            if cp is not None:
                ConPropsWithJoins(cp)

Updating data sources using the CIM

Starting with ArcGIS Pro 2.4, Python developers have fine-grained access to the Cartographic Information Model (CIM) and can access many more settings, properties, and capabilities that are persisted in a project or document. This can be useful in workflows that update data sources.

If a specific data source workflow is difficult to accomplish using the updateConnectionProperties function, modifying a layer's CIM structure is an option. The Python CIM Access topic describes the JSON structure of the CIM object model. Understanding this structure will allow you to update a layer's CIM.

For example, the following is a JSON representation of a file geodatabase layer's data source. The JSON below is not the full CIM structure of the layer. Rather, it is a snippet showing only the dataConnection node.

"dataConnection" : {
  "type" : "CIMFeatureDatasetDataConnection",
  "workspaceConnectionString" : "DATABASE=C:\\Projects\\YosemiteNP\\Yosemite.gdb",
  "workspaceFactory" : "FileGDB",
  "dataset" : "Parcels",
  "datasetType" : "esriDTFeatureClass"
}
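
When editing workspaceConnectionString, it can help to split the string into its parts first. The sketch below is plain Python; parse_connection_string is a hypothetical helper, and it assumes that strings with several properties are written as semicolon-separated KEY=value pairs, which the single-property file geodatabase snippet above does not by itself confirm. Values that contain a semicolon are not handled.

```python
def parse_connection_string(text):
    """Split 'KEY=value;KEY=value' into a dictionary with upper-case keys."""
    pairs = {}
    for part in text.split(';'):
        if not part.strip():
            continue  # ignore empty segments such as a trailing ';'
        key, _, value = part.partition('=')  # split on the first '=' only
        pairs[key.strip().upper()] = value
    return pairs


conn = parse_connection_string('DATABASE=C:\\Projects\\YosemiteNP\\Yosemite.gdb')
print(conn['DATABASE'])
```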

Below are some examples of using the CIM to update data sources:

  1. This script changes the data source of a relate on a layer.

    import arcpy
    
    # Specify the new relate properties
    newGDB = "FGDB2.gdb"
    newFeatureClass = "Cities2"
    newRelateName = "New Relate"
    
    # Reference project, map and layer 
    p = arcpy.mp.ArcGISProject(r'C:\Projects\USA.aprx')
    m = p.listMaps('Relate Map')[0]
    l = m.listLayers('States')[0]
    
    # Get the layer's CIM definition
    lyrCIM = l.getDefinition('V3')         
    
    # Get the first relate on the layer
    relate = lyrCIM.featureTable.relates[0]
    
    # Get the data connection properties for the relate
    dc = relate.dataConnection
    
    # Change the connection string to point to the new File Geodatabase
    dc.workspaceConnectionString = dc.workspaceConnectionString.replace("FGDB.gdb", newGDB)
    
    # Change the dataset name
    dc.dataset = newFeatureClass
        
    # Change the relate's name
    relate.name = newRelateName
    
    # Set the layer's CIM definition
    l.setDefinition(lyrCIM)
    
    p.saveACopy(r"C:\Projects\New\USA.aprx")
  2. The following example references a CAD layer in a map. It will then update the layer to point to a new CAD file. The script assumes that the new CAD file is in the same folder as the previous CAD file.

    Note:

    Updating the data source for CAD layers to point to a new CAD file requires modifying the CIM. However, just changing the folder that the CAD file resides in can be accomplished using the updateConnectionProperties function.

    import arcpy
    
    aprx = arcpy.mp.ArcGISProject(r"C:\Projects\YosemiteNP\Yosemite.aprx")
    m = aprx.listMaps('CAD')[0]
    # Select the CAD sub layer to update
    lyr = m.listLayers('Parcels')[0]
    
    # Access layer CIM
    lyrCIM = lyr.getDefinition("V3")
    dc = lyrCIM.featureTable.dataConnection
    
    # Update the feature dataset with the new CAD file name 
    dc.featureDataset = "NewParcels.dwg"
    
    # Update layer CIM
    lyr.setDefinition(lyrCIM)
    
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  3. The following example updates the SQL query for a query layer by replacing the existing SQL where clause with a new where clause.

    Note:

    This sample only replaces a value in the SQL where clause. If you add or remove fields in the SQL query, you will also need to update the featureTable.dataConnection.queryFields list accordingly.

    import arcpy
    
    # Reference project, map and layer 
    p = arcpy.mp.ArcGISProject(r'C:\Projects\USA.aprx')
    m = p.listMaps('USA')[0]
    l = m.listLayers('States')[0]
    
    # Get the layer's CIM definition
    lyrCIM = l.getDefinition('V3')         
    
    # Update the sql query where clause for the layer
    sql = lyrCIM.featureTable.dataConnection.sqlQuery
    newsql = sql.replace("WHERE SUB_REGION = 'Mtn'", "WHERE SUB_REGION = 'Pacific'")
    lyrCIM.featureTable.dataConnection.sqlQuery = newsql
        
    # Set the layer's CIM definition
    l.setDefinition(lyrCIM)
    
    p.saveACopy(r'C:\Projects\USA_New.aprx')
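
The CIM examples above rely on str.replace, which returns the original string unchanged when the search text is not present, so a typo in the old value goes unnoticed. One possible guard is sketched below; replace_required is a hypothetical helper, and the SQL text is an invented stand-in for a layer's sqlQuery value.

```python
def replace_required(text, old, new):
    """Like str.replace, but raise ValueError if `old` is not present."""
    if old not in text:
        raise ValueError(f'{old!r} not found in {text!r}')
    return text.replace(old, new)


sql = "SELECT * FROM States WHERE SUB_REGION = 'Mtn'"
print(replace_required(sql, "'Mtn'", "'Pacific'"))

try:
    replace_required(sql, "'Mountain'", "'Pacific'")
except ValueError as err:
    print('Nothing replaced:', err)
```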

Changing a layer's dataset

Some scenarios require you to change a layer's data source to a feature class with a different name. For example, the database schema may have changed, or a feature class may have been renamed. There are two ways to accomplish this:

  • Use the connectionProperties dictionary that is available on the Layer and Table classes.
  • Use Python CIM access.

Using the connectionProperties dictionary to change a layer's dataset

  1. The following example updates the data source's dataset name from RangerStations to RangerStationsNew. It also updates the geodatabase from Yosemite.gdb to YosemiteNew.gdb.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    lyr = aprx.listMaps("Main*")[0].listLayers("Ranger Stations")[0]
    find_dict = {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\Yosemite.gdb'}, 
                 'dataset': 'RangerStations', 
                 'workspace_factory': 'File Geodatabase'}
    replace_dict = {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\YosemiteNew.gdb'}, 
                    'dataset': 'RangerStationsNew', 
                    'workspace_factory': 'File Geodatabase'}
    lyr.updateConnectionProperties(find_dict, replace_dict)
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  2. The following example updates the data source's dataset name from PtsInterest to PointsOfInterest for layers in a project. This example doesn't change the geodatabase in which the feature class resides. Rather, it updates the layers to point to a different feature class in the same geodatabase by using a partial dictionary.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties({'dataset': 'PtsInterest'}, {'dataset': 'PointsOfInterest'})
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  3. The following example searches for layers in a project that reference a particular feature class. It then updates the data source's dataset name from PtsInterest to PointsOfInterest.

    import arcpy, os
    
    oldDataset = 'PtsInterest'
    newDataset = 'PointsOfInterest'
    fGDB = r'C:\Projects\YosemiteNP\Yosemite.gdb'
    
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    
    for m in aprx.listMaps():
        for lyr in m.listLayers():
            if lyr.supports("DATASOURCE"):
                if lyr.dataSource == os.path.join(fGDB, oldDataset):
                    lyr.updateConnectionProperties({'dataset': oldDataset}, {'dataset': newDataset})
                    
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")

Using the CIM to change a layer's dataset

  1. This script updates the layer's data source to a new feature class in the same geodatabase.

    import arcpy
    
    newFeatureClass = "UpdatedBoundary"
    
    # Reference project, map and layer 
    p = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    m = p.listMaps('Park Map')[0]
    l = m.listLayers('Boundary')[0]
    
    # Get the layer's CIM definition
    lyrCIM = l.getDefinition('V3')         
    
    # Update the dataset
    lyrCIM.featureTable.dataConnection.dataset = newFeatureClass
        
    # Set the layer's CIM definition
    l.setDefinition(lyrCIM)
    
    p.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  2. This script searches all the broken layers in a project for a specific feature class that has been renamed. It then updates the layer's data source to the new feature class.

    import arcpy
    
    newFeatureClass = "UpdatedBoundary"
    
    # Reference project
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    
    # Search for specific broken layers
    for m in aprx.listMaps():
        for lyr in m.listBrokenDataSources():
            if lyr.supports("DATASOURCE"):
                if lyr.dataSource == r'C:\Projects\YosemiteNP\Yosemite.gdb\Boundary':
                    # Get the layer's CIM definition
                    lyrCIM = lyr.getDefinition('V3')         
                    # Update the dataset
                    lyrCIM.featureTable.dataConnection.dataset = newFeatureClass
                    # Set the layer's CIM definition
                    lyr.setDefinition(lyrCIM)
    
    aprx.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  3. This script changes the data source of a layer where the feature class and feature dataset names are different between the existing and new geodatabase.

    import arcpy
    
    # Specify the new geodatabase properties
    newGDB = "UpdatedParcels.gdb"
    newFeatureClass = "UpdatedParcelsFC"
    newFeatureDataSet = "UpdatedParcelsFDS"
    
    # Reference project, map and layer 
    p = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    m = p.listMaps('Parcels Map')[0]
    l = m.listLayers('Parcels')[0]
    
    # Get the layer's CIM definition
    lyrCIM = l.getDefinition('V3')         
    
    # Get the data connection properties for the layer
    dc = lyrCIM.featureTable.dataConnection
    
    # Change the connection string to point to the new File Geodatabase
    dc.workspaceConnectionString = dc.workspaceConnectionString.replace("Parcels.gdb", newGDB)
    
    # Change the dataset name
    dc.dataset = newFeatureClass
    
    # If the data is in a Feature Dataset, then update it 
    if hasattr(dc, "featureDataset"):
        dc.featureDataset = newFeatureDataSet
        
    # Set the layer's CIM definition
    l.setDefinition(lyrCIM)
    
    p.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
  4. In some workflows, the CIM structure of the new data source is different than the existing structure, requiring you to create a new CIM data connection object. In this example, a layer that did not reside in a feature dataset is being updated to reference a feature class that is in a feature dataset. This requires creating a new CIM data connection object, as the feature dataset attribute is not in the existing layer's CIM structure. Note that you do not have to explicitly set the feature dataset attribute in the code. You only have to specify the feature class, and the feature dataset will be populated automatically. The same code can be used to update the data source of a layer that resides in a feature dataset to a feature class that does not reside in a feature dataset.

    import arcpy
    
    # Specify the new geodatabase properties
    newGDB = r'C:\Projects\Data\NewParcels.gdb'
    newFeatureClass = "UpdatedParcelsFC"
    
    # Reference project, map and layer 
    p = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    m = p.listMaps('Parcels Map')[0]
    l = m.listLayers('Parcels')[0]
    
    # Get the layer's CIM definition
    lyrCIM = l.getDefinition('V3')         
    
    # Create a new CIM data connection
    dc = arcpy.cim.CreateCIMObjectFromClassName('CIMStandardDataConnection', 'V3')
    
    # Specify the geodatabase
    dc.workspaceConnectionString = f"DATABASE={newGDB}"
    
    # Specify the workspace type
    dc.workspaceFactory = "FileGDB"
    
    # Specify the dataset name
    dc.dataset = newFeatureClass
    
    # Set the new data connection to the layer's CIM featureTable
    lyrCIM.featureTable.dataConnection = dc
        
    # Set the layer's CIM definition
    l.setDefinition(lyrCIM)
    
    p.saveACopy(r"C:\Projects\YosemiteNP\YosemiteNew.aprx")
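
The get, modify, and set pattern used in the scripts above can be collected into one function. The following is a sketch under stated assumptions, not part of arcpy: set_cim_dataset is a hypothetical helper, and it assumes the layer's CIM definition has a featureTable.dataConnection.dataset attribute, as in the feature layer examples. Layers without that structure raise an error instead of being silently skipped.

```python
def set_cim_dataset(layer, new_dataset, cim_version='V3'):
    """Point a layer at `new_dataset` through its CIM definition.

    Returns the previous dataset name. `layer` is assumed to be an
    arcpy.mp Layer whose CIM has featureTable.dataConnection.dataset.
    """
    cim = layer.getDefinition(cim_version)
    feature_table = getattr(cim, 'featureTable', None)
    data_connection = getattr(feature_table, 'dataConnection', None)
    if data_connection is None or not hasattr(data_connection, 'dataset'):
        raise TypeError('layer has no featureTable.dataConnection.dataset to update')
    previous = data_connection.dataset
    data_connection.dataset = new_dataset
    layer.setDefinition(cim)  # write the modified definition back to the layer
    return previous

# Hypothetical usage:
# old_name = set_cim_dataset(lyr, 'UpdatedBoundary')
```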