Skip To Content

Updating and fixing data sources

There are numerous reasons why data sources need to be repaired or redirected to different locations. The idea of making these changes manually in every affected map or project can be overwhelming. Methods are available with the arcpy.mp scripting environment that make it possible to automate these changes without having to open a project. You have control of updating data sources for individual layers or tables, or you can update all the layers or tables in a common workspace at once.

The following members are used with changing data source workflows:

Using the updateConnectionProperties function

The updateConnectionProperties function can be thought of as a find-and-replace function where you replace the current_connection_info parameter with the new_connection_info parameter. These parameters can be either a full path to a workspace, a partial string, a dictionary that contains connection properties, a partial dictionary that defines specific keys, or a path to a database connection (.SDE) file. An example of each of these is provided below.

When working with enterprise geodatabase layers, a path to a database connection (.SDE) file can only be used for the new_connection_info parameter. The information in the .SDE file can't be used for the current_connection_info parameter because the connection information is coming from the layer in the project, not the file. A more complete discussion and examples of connection property dictionaries are also provided in the connectionProperties section below.

When converting from a personal to a file geodatabase, the underlying workspace_factory connection property is automatically switched. This allows you to replace a path pointing to a .mdb file with a path pointing to a .gdb file. This is not the case for switching out other workspace types. You won't be able to only use a path—you'll need to work with a dictionary and change the database and workspace_factory dictionary keys at a minimum.

The auto_update_joins_and_relates property allows you to control whether joins, relates, or events associated with a layer or table should be updated. The default is set to True. There may be times, especially when updating all data sources at the project level, that you do not want these associated sources to be updated. If that is the case, be sure to set this parameter value to False.

The updateConnectionProperties method by default only updates a data source if the new_connection_info is a valid data source. If the validate parameter is set to False, the data source is set to that location regardless of whether it exists. This can be useful for scenarios that require data sources to be updated ahead of the data being created. In these cases, the data appears broken in the associated maps.

Here are several examples:

  1. Full file path:

    The following script changes the full path to a file geodatabase data source for all layers and tables in a project. In this example, a folder was renamed and all vector data was moved to this new location.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(r'C:\Projects\YosemiteNP\Data\Yosemite.gdb',
                                    r'C:\Projects\YosemiteNP\Vector_Data\Yosemite.gdb')
    aprx.save()
    del aprx
  2. Partial path:

    The following example is very similar to the one above but uses partial path strings to replace the data sources. Be sure when using a partial string that it doesn't occur multiple times in a path. You may not get the results you would expect.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties('Data','Vector_Data')
    aprx.save()
    del aprx
  3. Personal geodatabase to file geodatabase:

    The following example replaces a personal geodatabase connection with a file geodatabase connection using a partial path for all layers and tables in a map.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    m = aprx.ListMaps("Yose*")[0]
    m.updateConnectionProperties(r'Background.mdb', 'Background_fGDB.gdb')
    aprx.save()
    del aprx
  4. Using a database connection (.SDE) file:

    The following example replaces a file geodatabase connection with a path to a database connection (.SDE) file for all layers and tables in a project.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    aprx.updateConnectionProperties(r'C:\Projects\YosemiteNP\Vector_Data\Yosemite.gdb',
                                    r'C:\Projects\YosemiteNP\DBConnections\Server.sde')
    aprx.save()
    del aprx
  5. Using the connection properties from another layer:

    The following example references a layer in a map and uses those connection properties to update the connection properties for the same layer in a layer file that has not been updated with the new data source.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    m = aprx.listMaps('Yose*')[0]
    lyr = m.listLayers('Ranger Stations')[0]
    lyrFile = arcpy.mp.LayerFile(r'C:\Projects\YosemiteNP\LYRXs\Yosemite\OperationalLayers.lyrx')
    
    for l in lyrFile.listLayers():
      if l.name == 'Ranger Stations':
        l.updateConnectionProperties(l.connectionProperties, lyr.connectionProperties)
    
    lyrFile.save()
    del aprx, lyrFile
  6. Using a partial dictionary:

    The following example updates the data source's dataset name from PtsInterest to PointsOfInterest for layers in a specific map.

    import arcpy
    aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
    m = aprx.listMaps('Yose*')[0]
    m.updateConnectionProperties({'dataset': 'PtsInterest'}, {'dataset': 'PointsOfInterest'})
    aprx.save()
    del aprx

Using the connectionProperties dictionary

Using connectionProperties for updating data sources requires that you work with a dictionary of connection properties. The dictionary that is returned varies depending on whether it is a file-based workspace or a database connection, or if the layer or table has associated joins or relates. It is because of this variability that it is important to understand the different types of connection properties and how to navigate the dictionaries to make the appropriate changes. For example, a layer with a join or a relate returns a very different result than the same layer without a join or a relate. The easiest approach to updating connection property dictionaries is to reference and retrieve the dictionary from a layer or table with similar connection properties, make the necessary changes to it, and then set the modified dictionary back to the layer or table you want to update using the updateConnectionProperties method.

A nice way to display the dictionary structure is to use the Python pprint function. The following code was used in the Python window to generate each of the example dictionaries.

import arcpy, pprint
p = arcpy.mp.ArcGISProject('current')
m = p.listMaps()[0]
l = m.listLayers()[0]
pprint.pprint(l.connectionProperties)

Examples for some of the different connectionProperties dictionaries are below.

A file-based data source with no joins or relates:

{'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\Yosemite.gdb'}, 
 'dataset': 'RangerStations', 
 'workspace_factory': 'File Geodatabase'}

The above example is the most basic structure. A dictionary with three keys is returned. The value for the connection_info key is another dictionary that contains a path to the database.

The following example modifies the connection properties of an existing layer and redirects the workspace to a new location. This example assumes the workspace_factory and dataset are the same.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
lyr = aprx.listMaps("Main*").listLayers("Ranger Stations")[0]
conProp = lyr.connectionProperties
conProp['connection_info']['database'] = r'C:\Projects\YosemiteNP\Data\newfGDB.gdb'
lyr.connectionProperties = conProp
aprx.save()

A partial dictionary can also be used in the updateConnectionProperties method, as can been seen in the example below. The following example updates the data source's dataset name from PtsInterest to PointsOfInterest for layers in a specific map.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
m = aprx.listMaps('Yose*')[0]
m.updateConnectionProperties({'dataset': 'PtsInterest'}, {'dataset': 'PointsOfInterest'})
aprx.save()
del aprx

The following example updates the data source's dataset name from RangerStations to RangerStationsNew. It also updates the geodatabase from Yosemite.gdb to YosemiteNew.gdb.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
lyr = aprx.listMaps("Main*").listLayers("Ranger Stations")[0]
find_dict = {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\Yosemite.gdb'}, 
             'dataset': 'RangerStations', 
             'workspace_factory': 'File Geodatabase'}
replace_dict = {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\YosemiteNew.gdb'}, 
                'dataset': 'RangerStationsNew', 
                'workspace_factory': 'File Geodatabase'}
lyr.updateConnectionProperties(find_dict, replace_dict)

Using the connectionProperties dictionary with Enterprise Geodatabase data

An Enterprise Geodatabase data source connectionProperties dictionary with no joins or relates:

{'connection_info': {'authentication_mode': 'OSA',                     
                     'database': 'Uhuru',                     
                     'db_connection_properties': 'Dunbar',                     
                     'dbclient': 'sqlserver',                     
                     'instance': 'sde:sqlserver:Dunbar',                     
                     'password': '*********',                     
                     'server': 'Dunbar',                     
                     'user': 'Sly',                     
                     'version': 'sde.DEFAULT'}, 
'dataset': 'Uhuru.SDE.RangerStations', 
'workspace_factory': 'SDE'}

The same three keys are returned, but this time the connection_info value is a dictionary with a larger set of database connection properties.

The following example changes the Enterprise Geodatabase version for all layers and tables in a map.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
m = aprx.listMaps("Yose*")[0]
for l in m.listLayers():
  conProp = l.connectionProperties
  conProp['connection_info']['version'] = 'sde.NewVersion'
  l.updateConnectionProperties(l.connectionProperties, conProp)
aprx.save()

The following example changes the Enterprise Geodatabase instance and server for all layers in a project. In this example, the geodatabase uses operating system authentication. If the user names and passwords are the same, the instance and server can be changed without knowing the credentials of layers in the project and without having to create new Enterprise Geodatabase connection files.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
find_dict = {'connection_info': {'db_connection_properties': 'Dunbar',
                                 'instance': 'sde:sqlserver:Dunbar',
                                 'server': 'Dunbar'}}
replace_dict = {'connection_info': {'db_connection_properties': 'Shakespeare',
                                    'instance': 'sde:sqlserver:Shakespeare',
                                    'server': 'Shakespeare'}}
aprx.updateConnectionProperties(find_dict, replace_dict)

Using the connectionProperties dictionary with relates

A file-based data source connectionProperties dictionary with two relates:

{'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'}, 
 'dataset': 'State_Polygons', 
 'relates': [{'cardinality': 'one_to_many',              
              'connection': {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'},                             
                             'dataset': 'census2000',                             
                             'workspace_factory': 'File Geodatabase'},
              'foreign_key': 'STATE_NAME',              
              'name': 'Relate1',              
              'primary_key': 'State_Name'},             
             {'cardinality': 'one_to_many',              
              'connection': {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'}, 
                             'dataset': 'census2010',                             
                             'workspace_factory': 'File Geodatabase'},              
              'foreign_key': 'State_Name',              
              'name': 'Relate2',              
              'primary_key': 'State_Name'}], 
 'workspace_factory': 'File Geodatabase'}

Here you can see the dictionary contains more key value pairs. An additional key named relates is present and its value is a Python list of relates. Each item in the list is a dictionary containing the connection information concerning the table that is related. Because it is a list, you can iterate through each relate and manipulate them accordingly.

The following example modifies the source dataset for the relate named Relate2 and references a different table in the same workspace.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
lyr = aprx.listMaps("Main*").listLayers("State_Polygons")[0]
conProp = lyr.connectionProperties
for rel in conProp['relates']:
    if rel['name'] == 'Relate2':
        rel['connection']['dataset'] = 'census2012'
lyr.updateConnectionProperties(lyr.connectionProperties, conProp)
aprx.save()
del aprx

Using the connectionProperties dictionary with joins

A file-based data source connectionProperties dictionary with one join:

{'cardinality': 'one_to_many',
 'destination': {'connection_info': {'database': 'C:\\Projects\\FGDB.gdb'},
                 'dataset': 'tabular_eco',
                 'workspace_factory': 'File Geodatabase'},
 'foreign_key': 'ECO_CODE',
 'join_forward': False,
 'join_type': 'left_outer_join',
 'primary_key': 'CODE',
 'source': {'connection_info': {'database': 'C:\\Projects\\FGDB.gdb'},
            'dataset': 'mex_eco',
            'workspace_factory': 'File Geodatabase'}}

A file-based data source connectionProperties dictionary with two joins:

{'cardinality': 'one_to_many', 
 'destination': {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'},                 
                 'dataset': 'census2000',                 
                 'workspace_factory': 'File Geodatabase'}, 
 'foreign_key': 'State_Polygons.State_Name', 
 'join_forward': False, 
 'join_type': 'left_outer_join',
 'primary_key': 'STATE_NAME', 
 'source': {'cardinality': 'one_to_many',
            'destination': {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'},
                            'dataset': 'census2010',                            
                            'workspace_factory': 'File Geodatabase'},
            'foreign_key': 'State_Name',            
            'join_forward': False,            
            'join_type': 'left_outer_join',            
            'primary_key': 'STATE_NAME',            
            'source': {'connection_info': {'database': 'C:\\Projects\\YosemiteNP\\Data\\BackgroundData.gdb'},
                       'dataset': 'State_Polygons',                       
                       'workspace_factory': 'File Geodatabase'}}}

When joins are associated with a layer or table, the connectionProperties dictionary structure changes. You no longer have the same three root level keys like you saw in previous examples. To understand why this is different, you need to understand how joins are persisted. Joins are nested. If table one and table two are joined to a layer, table one is joined to the layer and table two is joined to the combination of the layer and table one. The root-level dictionary describes the second join first. From the second join's source, you can trace the connection to the original layer and table one.

The following example modifies the foreign key of a join for a specific layer.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\Mexico\MexicoEcology.aprx')
mexLyr = aprx.listMaps('Layers')[0].listLayers('mex_eco')[0]
conProps = mexLyr.connectionProperties
conProps['foreign_key'] = 'ECO_CODE_NEW'
mexLyr.updateConnectionProperties(mexLyr.connectionProperties, conProps)
aprx.save()
del aprx

A partial dictionary can also be used in the updateConnectionProperties method. The following example modifies the join properties for all layers in the project that use the specified foreign key.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\Mexico\MexicoEcology.aprx')
mexLyr = aprx.listMaps('Layers')[0].listLayers('mex_eco')[0]
conProps = mexLyr.connectionProperties
conProps['foreign_key'] = 'ECO_CODE_NEW'
aprx.updateConnectionProperties({'foreign_key': 'ECO_CODE'}, {'foreign_key': 'ECO_CODE_NEW'})
aprx.save()
del aprx

The following example modifies the source database and dataset for the primary layer both tables are joined to without changing the connection information for the joins.

import arcpy
aprx = arcpy.mp.ArcGISProject(r'C:\Projects\YosemiteNP\Yosemite.aprx')
lyr = aprx.listMaps("Main*").listLayers("State_Polygons")[0]
conProp = lyr.connectionProperties
conProp['source']['source']['connection_info']['database'] = 'C:\\Projects\\YosemiteNP\\Vector_Data\\Census.gdb'
conProp['source']['source']['dataset'] = 'States'
lyr.updateConnectionProperties(lyr.connectionProperties, conProp)
aprx.save()
del aprx