Create Cloud Storage Connection File (Data Management)

Summary

Creates a connection file for ArcGIS-supported cloud storage. This tool allows existing raster geoprocessing tools to write cloud raster format (CRF) datasets into the cloud storage bucket or read raster datasets (not limited to CRF) stored in the cloud storage as input.

Usage

  • You must provide the information needed to make a cloud storage connection, such as an access key ID, secret access key string, and bucket name, to run this tool. Alternate authentication methods are available through the Provider Options parameter.

    Learn more about connecting to cloud stores

  • The tool outputs a binary cloud storage connection file (.acs) in ArcGIS Cloud Storage format.

  • The raster dataset stored in the cloud storage can be referenced through a file path such as c:/temp/amazons3.acs/someraster. If the dataset is stored in a folder in the bucket, the folder name must be included in the path, for example, c:/temp/amazons3.acs/foldername/someraster.

  • This tool supports connections to Amazon Simple Storage Service (S3) buckets, Microsoft Azure Blob Storage containers, Microsoft Azure Data Lake Storage Gen2, Alibaba Cloud Object Storage Service (OSS) buckets, Google Cloud Storage Service (GCS) buckets, WebHDFS, MinIO, and Ozone Object Storage Service buckets.

  • A custom endpoint URL can be used to support any S3 compliant object storage or CNAME record.

  • The tool validates the credentials provided at run time. If the connection cannot be made, the tool returns a warning.

  • OAuth 2.0 authentication with Azure Active Directory is supported with user-interactive login for Azure and Azure Data Lake Storage Gen2. See Connect to authentication providers from ArcGIS Pro for configuration. Once a connection is configured and the user is signed in, the Authentication parameter becomes active for Azure and Azure Data Lake Storage Gen2. The region and endpoint are retrieved from the configuration.

  • Support for user-assigned managed identities using Azure Active Directory on an Azure virtual machine is available for Azure and Azure Data Lake Storage Gen2.

  • Role-based access control (RBAC) is available for Amazon, Azure, and Google cloud providers. Leaving all authentication parameters empty while using an Amazon EC2 instance, an Azure virtual machine, or Google Compute Engine allows ArcGIS Pro to access Blob storage using IAM roles or Azure RBAC. For Amazon, IMDSv1 and IMDSv2 are supported.

    Microsoft Azure Data Lake Storage Gen2 follows the same options as Azure but provides true directory support and atomic operations using a DFS endpoint. Some network errors during cloud operations are retried following exponential backoff.

    For performance considerations and additional information, see the GDAL virtual file systems documentation.
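
The path convention described in the usage notes can be sketched with a small helper. This is only an illustration: the function name cloud_raster_path is hypothetical and not part of arcpy, and the paths are the ones used in the notes above.

```python
import posixpath


def cloud_raster_path(acs_file, raster_name, folder=""):
    """Build the path used to reference a raster through an .acs connection file."""
    # Normalize Windows separators so the result uses forward slashes throughout.
    acs_file = acs_file.replace("\\", "/")
    # Skip an empty folder so it does not leave a double slash in the path.
    parts = [p.strip("/") for p in (folder, raster_name) if p]
    return posixpath.join(acs_file, *parts)


# Raster at the root of the bucket
print(cloud_raster_path("c:/temp/amazons3.acs", "someraster"))
# Raster inside a folder in the bucket
print(cloud_raster_path("c:/temp/amazons3.acs", "someraster", folder="foldername"))
```

The resulting string can be passed to a raster geoprocessing tool wherever a raster dataset path is expected.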

Parameters

Label | Explanation | Data Type
Connection File Location

The folder path where the connection file will be created.

Folder
Connection File Name

The name of the cloud storage connection file.

String
Service Provider

Specifies the cloud storage service provider that will be used.

  • Azure—The service provider will be Microsoft Azure.
  • Amazon—The service provider will be Amazon S3.
  • Google—The service provider will be Google Cloud Storage.
  • Alibaba—The service provider will be Alibaba Cloud Storage.
  • WebHDFS—The service provider will be WebHDFS.
  • MinIO—The service provider will be MinIO.
  • Azure Data Lake—The service provider will be Microsoft Azure Data Lake Storage.
  • Ozone—The service provider will be Ozone.
String
Bucket (Container) Name

The name of the cloud storage container where the raster dataset will be stored. Many cloud providers also call it a bucket.

String
Access Key ID (Account Name)
(Optional)

The access key ID string for the specific cloud storage type. It can also be the account name, as is the case with Azure.

String
Secret Access Key (Account Key)
(Optional)

The secret access key string to authenticate the connection to cloud storage.

Encrypted String
Region (Environment)
(Optional)

The region string for the cloud storage. If provided, the value must use the format defined by the cloud storage choice. The default is the selected cloud provider's default account.

String
Service End Point
(Optional)

The service endpoint (URI) of the cloud storage, such as oss-us-west-1.aliyuncs.com. If no value is provided, the default endpoint for the selected cloud storage type will be used. The CNAME redirected endpoint can also be used if needed.

String
Provider Options
(Optional)

The configuration options pertaining to the specific type of cloud service. Some services offer options; others do not. Set this parameter only if you want to turn on one or more of the options.

  • Azure and Microsoft Azure Data Lake Storage
    • AZURE_STORAGE_SAS_TOKEN—Specify a shared access signature. Ensure that its value is URL encoded and does not contain leading '?' or '&' characters. When using this option, the Secret Access Key (Account Key) parameter must be empty.
    • AZURE_NO_SIGN_REQUEST—Anonymously connect to buckets (containers) that don't require authenticated access. When using this option, the Secret Access Key (Account Key) parameter must be empty. The default value is False.
    • AZURE_STORAGE_CONNECTION_STRING—Specify an Azure Storage connection string. This string embeds the account name, key, and endpoint. When using this option, the Access Key ID (Account Name) and Secret Access Key (Account Key) parameters must be empty.
    • CPL_AZURE_USE_HTTPS—Set to False to use HTTP requests. Some servers may be configured to only support HTTPS requests. The default value is True.
    • AZURE_IMDS_OBJECT_ID—Specify the Object ID of the managed identity authenticated using Azure Instance Metadata Service (IMDS) if your Azure VM has multiple user-assigned managed identities set.
    • AZURE_IMDS_CLIENT_ID—Specify the Client ID of the managed identity authenticated using Azure Instance Metadata Service (IMDS) if your Azure VM has multiple user-assigned managed identities set.
    • AZURE_IMDS_MSI_RES_ID—Specify the Resource ID of the managed identity authenticated using Azure Instance Metadata Service (IMDS) if your Azure VM has multiple user-assigned managed identities set.
  • Amazon and MinIO
    • AWS_NO_SIGN_REQUEST—Anonymously connect to buckets (containers) that don't require authenticated access. The default value is False.
    • AWS_SESSION_TOKEN—Specify temporary credentials.
    • AWS_DEFAULT_PROFILE—AWS credential profiles are automatically used when the access key or ID is missing. This option can be used to specify the profile to use.
    • AWS_REQUEST_PAYER—Requester Pays buckets can be accessed by setting this option to requester.
    • AWS_Virtual_Hosting—If you use Amazon S3 or S3-compatible cloud providers that support only path-style requests, set this option to False. It is recommended that you use virtual hosting if it's supported. The default value is True.
    • CPL_VSIS3_USE_BASE_RMDIR_RECURSIVE—Some older S3-compatible implementations do not support the Bulk Delete operation. Set this option to False for these providers. The default value is True.
    • AWS_HTTPS—Set to False to use HTTP requests. Some servers may be configured to only support HTTPS requests. The default value is True.
  • Google
    • GS_NO_SIGN_REQUEST—Anonymously connect to buckets (containers) that do not require authenticated access. The default value is True.
    • GS_USER_PROJECT—Requester Pays buckets can be accessed by setting OAuth2 keys and a project for billing. Set the project using this option and set OAuth2 keys using other options and not HMAC keys as a secret access key or ID.
    • GS_OAUTH2_REFRESH_TOKEN—Specify OAuth2 Refresh Access Token. Set OAuth2 client credentials using GS_OAUTH2_CLIENT_ID and GS_OAUTH2_CLIENT_SECRET.
    • GOOGLE_APPLICATION_CREDENTIALS—Specify Service Account OAuth2 credentials using a .json file containing a private key and client email address.
    • GS_OAUTH2_PRIVATE_KEY—Specify Service Account OAuth2 credentials using a private key string. GS_OAUTH2_CLIENT_EMAIL must be set.
    • GS_OAUTH2_PRIVATE_KEY_FILE—Specify Service Account OAuth2 credentials using a private key from a file. GS_OAUTH2_CLIENT_EMAIL must be set.
    • GS_OAUTH2_CLIENT_EMAIL—Specify Service Account OAuth2 credentials using a client email address.
    • GS_OAUTH2_SCOPE—Specify Service Account OAuth2 scope. Valid values are https://www.googleapis.com/auth/devstorage.read_write (the default) and https://www.googleapis.com/auth/devstorage.read_only.
    • GDAL_HTTP_HEADER_FILE—Specify bearer authentication credentials stored in an external file.
  • Alibaba
    • OSS_Virtual_Hosting—If you use Alibaba or S3-compatible cloud providers that support only path-style requests, set this option to False. It is recommended that you use virtual hosting if it's supported. The default value is True.
    • OSS_HTTPS—Set to False to use HTTP requests. Some servers may be configured to only support HTTPS requests. The default value is True.
  • WebHDFS
    • WEBHDFS_REPLICATION (integer)—The replication value is used when creating a file.
    • WEBHDFS_PERMISSION (decimal)—A permission mask is used when creating a file.
  • Ozone
    • AWS_DEFAULT_PROFILE—AWS credential profiles are automatically used when the access key or ID is missing. This option can be used to specify the profile to use.
    • AWS_Virtual_Hosting—If you use Amazon S3 or S3-compatible cloud providers that support only path-style requests, set this option to False. It is recommended that you use virtual hosting if it's supported. The default value is True.
    • AWS_HTTPS—Set to False to use HTTP requests. Some servers may be configured to only support HTTPS requests. The default value is True.
    • CPL_VSIS3_USE_BASE_RMDIR_RECURSIVE—Some older S3-compatible implementations do not support the Bulk Delete operation. Set this option to False for these providers. The default value is True.
    • x-amz-storage-class—Specify REDUCED_REDUNDANCY when writing to a single-container Ozone deployment, as it has a single data node.

If multiple authentication parameters are provided, precedence is as follows:

  • Azure—AZURE_STORAGE_CONNECTION_STRING, account name or key, AZURE_STORAGE_SAS_TOKEN, AZURE_NO_SIGN_REQUEST, or RBAC.
  • Amazon—AWS_NO_SIGN_REQUEST, access ID or key or AWS_SESSION_TOKEN, AWS Credential Profile, or IAM Role.
  • Google—GS_NO_SIGN_REQUEST, access ID or key, GDAL_HTTP_HEADER_FILE, (GS_OAUTH2_REFRESH_TOKEN or GS_OAUTH2_CLIENT_ID and GS_OAUTH2_CLIENT_SECRET), GOOGLE_APPLICATION_CREDENTIALS, (GS_OAUTH2_PRIVATE_KEY or GS_OAUTH2_CLIENT_EMAIL), (GS_OAUTH2_PRIVATE_KEY_FILE or GS_OAUTH2_CLIENT_EMAIL), or IAM Role.

In addition to the provider options listed above, the ARC_DEEP_CRAWL option can be used with all the service providers. If True, it is used to identify CRFs with no extension and cloud-enabled raster products in the cloud. This is operation intensive and it is recommended that you set this option to False for faster catalog browsing and crawling. The default value is True.

Custom token vending services, such as Planetary Computer's data collection, can be authenticated using the ARC_TOKEN_SERVICE_API (URL of the token vendor) and ARC_TOKEN_OPTION_NAME (type of token from the service provider) provider options.

Note:

The GDAL_DISABLE_READDIR_ON_OPEN option is available with all the service providers. To improve the performance of loading cloud-based rasters, this option is set to NO by default. If the raster resides in a folder that contains more than 30,000 items, set this option to YES.

Value Table
Folder
(Optional)

The folder in the Bucket (Container) Name parameter value where the raster dataset will be stored.

String
Authentication
(Optional)

The connection name of OAuth 2.0 authentication.

A valid connection needs to be configured in the Options dialog box on the Authentication tab.

String
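
The precedence order listed under Provider Options can be read as a simple first-match lookup. The sketch below covers only the Azure and Amazon lists; the function name, the tier labels, and the placeholder names ACCOUNT_KEY and ACCESS_KEY (standing in for the Access Key ID and Secret Access Key parameters) are assumptions made for illustration and are not values the tool itself uses or returns.

```python
# Documented precedence, highest first. Each tier is a set of inputs that
# count as "supplied" plus an illustrative label for the resulting method.
AUTH_TIERS = {
    "AZURE": [
        ({"AZURE_STORAGE_CONNECTION_STRING"}, "connection string"),
        ({"ACCOUNT_KEY"}, "account name or key"),
        ({"AZURE_STORAGE_SAS_TOKEN"}, "SAS token"),
        ({"AZURE_NO_SIGN_REQUEST"}, "anonymous"),
    ],
    "AMAZON": [
        ({"AWS_NO_SIGN_REQUEST"}, "anonymous"),
        ({"ACCESS_KEY", "AWS_SESSION_TOKEN"}, "access key or session token"),
        ({"AWS_DEFAULT_PROFILE"}, "credential profile"),
    ],
}

# Used when none of the tiers above is supplied.
ROLE_FALLBACK = {"AZURE": "RBAC", "AMAZON": "IAM role"}


def effective_auth(provider, supplied):
    """Return the label of the first documented tier present in `supplied`."""
    supplied = set(supplied)
    for names, label in AUTH_TIERS[provider]:
        if names & supplied:
            return label
    return ROLE_FALLBACK[provider]


# An account key outranks a SAS token when both are given.
print(effective_auth("AZURE", ["AZURE_STORAGE_SAS_TOKEN", "ACCOUNT_KEY"]))
# With nothing supplied, the connection falls back to role-based access.
print(effective_auth("AMAZON", []))
```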

Derived Output

Label | Explanation | Data Type
Output Connection File

The output cloud storage connection file path.

File

arcpy.management.CreateCloudStorageConnectionFile(out_folder_path, out_name, service_provider, bucket_name, {access_key_id}, {secret_access_key}, {region}, {end_point}, {config_options}, {folder}, {authentication})
Name | Explanation | Data Type
out_folder_path

The folder path where the connection file will be created.

Folder
out_name

The name of the cloud storage connection file.

String
service_provider

Specifies the cloud storage service provider that will be used.

  • AZURE—The service provider will be Microsoft Azure.
  • AMAZON—The service provider will be Amazon S3.
  • GOOGLE—The service provider will be Google Cloud Storage.
  • ALIBABA—The service provider will be Alibaba Cloud Storage.
  • WEBHDFS—The service provider will be WebHDFS.
  • MINIO—The service provider will be MinIO.
  • AZUREDATALAKE—The service provider will be Microsoft Azure Data Lake Storage.
  • OZONE—The service provider will be Ozone.
String
bucket_name

The name of the cloud storage container where the raster dataset will be stored. Many cloud providers also call it a bucket.

String
access_key_id
(Optional)

The access key ID string for the specific cloud storage type. It can also be the account name, as is the case with Azure.

String
secret_access_key
(Optional)

The secret access key string to authenticate the connection to cloud storage.

Encrypted String
region
(Optional)

The region string for the cloud storage. If provided, the value must use the format defined by the cloud storage choice. The default is the selected cloud provider's default account.

String
end_point
(Optional)

The service endpoint (URI) of the cloud storage, such as oss-us-west-1.aliyuncs.com. If no value is provided, the default endpoint for the selected cloud storage type will be used. The CNAME redirected endpoint can also be used if needed.

String
config_options
[config_options,...]
(Optional)

The configuration options pertaining to the specific type of cloud service. Some services offer options; others do not. Set this parameter only if you want to turn on one or more of the options.

  • Azure and Microsoft Azure Data Lake Storage
    • AZURE_STORAGE_SAS_TOKEN—Specify a shared access signature. Ensure that its value is URL encoded and does not contain leading '?' or '&' characters. When using this option, the Secret Access Key (Account Key) parameter must be empty.
    • AZURE_NO_SIGN_REQUEST—Anonymously connect to buckets (containers) that don't require authenticated access. When using this option, the Secret Access Key (Account Key) parameter must be empty. The default value is False.
    • AZURE_STORAGE_CONNECTION_STRING—Specify an Azure Storage connection string. This string embeds the account name, key, and endpoint. When using this option, the Access Key ID (Account Name) and Secret Access Key (Account Key) parameters must be empty.
    • CPL_AZURE_USE_HTTPS—Set to False to use HTTP requests. Some servers may be configured to only support HTTPS requests. The default value is True.
    • AZURE_IMDS_OBJECT_ID—Specify the Object ID of the managed identity authenticated using Azure Instance Metadata Service (IMDS) if your Azure VM has multiple user-assigned managed identities set.
    • AZURE_IMDS_CLIENT_ID—Specify the Client ID of the managed identity authenticated using Azure Instance Metadata Service (IMDS) if your Azure VM has multiple user-assigned managed identities set.
    • AZURE_IMDS_MSI_RES_ID—Specify the Resource ID of the managed identity authenticated using Azure Instance Metadata Service (IMDS) if your Azure VM has multiple user-assigned managed identities set.
  • Amazon and MinIO
    • AWS_NO_SIGN_REQUEST—Anonymously connect to buckets (containers) that don't require authenticated access. The default value is False.
    • AWS_SESSION_TOKEN—Specify temporary credentials.
    • AWS_DEFAULT_PROFILE—AWS credential profiles are automatically used when the access key or ID is missing. This option can be used to specify the profile to use.
    • AWS_REQUEST_PAYER—Requester Pays buckets can be accessed by setting this option to requester.
    • AWS_Virtual_Hosting—If you use Amazon S3 or S3-compatible cloud providers that support only path-style requests, set this option to False. It is recommended that you use virtual hosting if it's supported. The default value is True.
    • CPL_VSIS3_USE_BASE_RMDIR_RECURSIVE—Some older S3-compatible implementations do not support the Bulk Delete operation. Set this option to False for these providers. The default value is True.
    • AWS_HTTPS—Set to False to use HTTP requests. Some servers may be configured to only support HTTPS requests. The default value is True.
  • Google
    • GS_NO_SIGN_REQUEST—Anonymously connect to buckets (containers) that do not require authenticated access. The default value is True.
    • GS_USER_PROJECT—Requester Pays buckets can be accessed by setting OAuth2 keys and a project for billing. Set the project using this option and set OAuth2 keys using other options and not HMAC keys as a secret access key or ID.
    • GS_OAUTH2_REFRESH_TOKEN—Specify OAuth2 Refresh Access Token. Set OAuth2 client credentials using GS_OAUTH2_CLIENT_ID and GS_OAUTH2_CLIENT_SECRET.
    • GOOGLE_APPLICATION_CREDENTIALS—Specify Service Account OAuth2 credentials using a .json file containing a private key and client email address.
    • GS_OAUTH2_PRIVATE_KEY—Specify Service Account OAuth2 credentials using a private key string. GS_OAUTH2_CLIENT_EMAIL must be set.
    • GS_OAUTH2_PRIVATE_KEY_FILE—Specify Service Account OAuth2 credentials using a private key from a file. GS_OAUTH2_CLIENT_EMAIL must be set.
    • GS_OAUTH2_CLIENT_EMAIL—Specify Service Account OAuth2 credentials using a client email address.
    • GS_OAUTH2_SCOPE—Specify Service Account OAuth2 scope. Valid values are https://www.googleapis.com/auth/devstorage.read_write (the default) and https://www.googleapis.com/auth/devstorage.read_only.
    • GDAL_HTTP_HEADER_FILE—Specify bearer authentication credentials stored in an external file.
  • Alibaba
    • OSS_Virtual_Hosting—If you use Alibaba or S3-compatible cloud providers that support only path-style requests, set this option to False. It is recommended that you use virtual hosting if it's supported. The default value is True.
    • OSS_HTTPS—Set to False to use HTTP requests. Some servers may be configured to only support HTTPS requests. The default value is True.
  • WebHDFS
    • WEBHDFS_REPLICATION (integer)—The replication value is used when creating a file.
    • WEBHDFS_PERMISSION (decimal)—A permission mask is used when creating a file.
  • Ozone
    • AWS_DEFAULT_PROFILE—AWS credential profiles are automatically used when the access key or ID is missing. This option can be used to specify the profile to use.
    • AWS_Virtual_Hosting—If you use Amazon S3 or S3-compatible cloud providers that support only path-style requests, set this option to False. It is recommended that you use virtual hosting if it's supported. The default value is True.
    • AWS_HTTPS—Set to False to use HTTP requests. Some servers may be configured to only support HTTPS requests. The default value is True.
    • CPL_VSIS3_USE_BASE_RMDIR_RECURSIVE—Some older S3-compatible implementations do not support the Bulk Delete operation. Set this option to False for these providers. The default value is True.
    • x-amz-storage-class—Specify REDUCED_REDUNDANCY when writing to a single-container Ozone deployment, as it has a single data node.

If multiple authentication parameters are provided, precedence is as follows:

  • Azure—AZURE_STORAGE_CONNECTION_STRING, account name or key, AZURE_STORAGE_SAS_TOKEN, AZURE_NO_SIGN_REQUEST, or RBAC.
  • Amazon—AWS_NO_SIGN_REQUEST, access ID or key or AWS_SESSION_TOKEN, AWS Credential Profile, or IAM Role.
  • Google—GS_NO_SIGN_REQUEST, access ID or key, GDAL_HTTP_HEADER_FILE, (GS_OAUTH2_REFRESH_TOKEN or GS_OAUTH2_CLIENT_ID and GS_OAUTH2_CLIENT_SECRET), GOOGLE_APPLICATION_CREDENTIALS, (GS_OAUTH2_PRIVATE_KEY or GS_OAUTH2_CLIENT_EMAIL), (GS_OAUTH2_PRIVATE_KEY_FILE or GS_OAUTH2_CLIENT_EMAIL), or IAM Role.

In addition to the provider options listed above, the ARC_DEEP_CRAWL option can be used with all the service providers. If True, it is used to identify CRFs with no extension and cloud-enabled raster products in the cloud. This is operation intensive and it is recommended that you set this option to False for faster catalog browsing and crawling. The default value is True.

Custom token vending services, such as Planetary Computer's data collection, can be authenticated using the ARC_TOKEN_SERVICE_API (URL of the token vendor) and ARC_TOKEN_OPTION_NAME (type of token from the service provider) provider options.

Note:

The GDAL_DISABLE_READDIR_ON_OPEN option is available with all the service providers. To improve the performance of loading cloud-based rasters, this option is set to NO by default. If the raster resides in a folder that contains more than 30,000 items, set this option to YES.

Value Table
folder
(Optional)

The folder in the bucket_name parameter value where the raster dataset will be stored.

String
authentication
(Optional)

The connection name of OAuth 2.0 authentication.

String
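
The AZURE_STORAGE_SAS_TOKEN option expects a URL-encoded value with no leading '?' or '&'. A token copied from a URL often carries a leading '?', so a small cleanup step can help before the value is placed in config_options. The helper names and the token below are hypothetical, and the sketch assumes the token is already URL encoded.

```python
def clean_sas_token(token):
    """Return the token without surrounding whitespace or a leading '?' or '&'."""
    return token.strip().lstrip("?&")


def sas_config_option(token):
    """Format the cleaned token as a single config_options entry."""
    return f"AZURE_STORAGE_SAS_TOKEN {clean_sas_token(token)}"


# Hypothetical token, shown only to illustrate the cleanup.
print(sas_config_option("?sv=2022-11-02&sig=abc%3D"))
```

When this option is used, secret_access_key is left empty, as noted in the option description.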

Derived Output

Name | Explanation | Data Type
out_connection

The output cloud storage connection file path.

File
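
The code samples on this page pass config_options as a string of name and value pairs separated by semicolons. When several options are involved, assembling that string from a dictionary can reduce typing mistakes. The following is a minimal sketch: build_config_options is a hypothetical helper, not an arcpy function, and it rejects names or values containing spaces or semicolons because the quoting rules for such values are not covered here.

```python
def build_config_options(options):
    """Join option names and values into a 'Name Value; Name Value' string."""
    pairs = []
    for name, value in options.items():
        text = str(value)  # booleans become "True" or "False"
        # Assumption: spaces or semicolons would need quoting that this
        # sketch does not attempt, so such entries are rejected.
        if not text or any(ch in name + text for ch in " ;"):
            raise ValueError(f"unsupported option: {name!r}={value!r}")
        pairs.append(f"{name} {text}")
    return "; ".join(pairs)


options = build_config_options({
    "AWS_REQUEST_PAYER": "requester",
    "ARC_DEEP_CRAWL": False,
})
print(options)  # AWS_REQUEST_PAYER requester; ARC_DEEP_CRAWL False
```

The resulting string can then be supplied as the config_options argument of arcpy.management.CreateCloudStorageConnectionFile.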

Code sample

CreateCloudStorageConnectionFile example 1 (Python window)

This is a Python sample for the CreateCloudStorageConnectionFile function.

#====================================
# CreateCloudStorageConnectionFile
# Usage:
# arcpy.management.CreateCloudStorageConnectionFile(
#     out_folder_path, out_name, AZURE | AMAZON | GOOGLE | ALIBABA, bucket_name,
#     {access_key_id}, {secret_access_key}, {region}, {end_point},
#     { {Name} {Value}; {Name} {Value}...})
# arcpy.management.CreateCloudStorageConnectionFile(
#     out_folder_path, out_name, AZURE | AMAZON | GOOGLE | ALIBABA, bucket_name,
#     {access_key_id}, {secret_access_key}, {region}, {end_point},
#     {config_options})

import arcpy

# Create a connection to an open public bucket with the Requester Pays option
arcpy.management.CreateCloudStorageConnectionFile(
    "C:/Workspace/connections", "awss3storage.acs", "AMAZON", "publicrasterstore",
    config_options="AWS_REQUEST_PAYER requester")

# Create a connection to a secured Azure container
arcpy.management.CreateCloudStorageConnectionFile(
    "C:/Workspace/connections", "azurestorage.acs", "AZURE", "rasterstore", "imageaccount",
    "NOGEOU1238987OUOUNOQEWQWEIO")

# Create an Alibaba connection with a custom endpoint
arcpy.management.CreateCloudStorageConnectionFile(
    "C:/Workspace/connections", "aliyun.acs", "ALIBABA", "rasterstore", "AYOUER9273PJJNY",
    "NOGEOU1238987OUOUNOQEWQWEIO", end_point="rasterstore.oss-us-west-1.aliyuncs.com")

CreateCloudStorageConnectionFile example 2 (stand-alone script)

This is a Python sample for the CreateCloudStorageConnectionFile function.

#====================================
# CreateCloudStorageConnectionFile
# Usage:
# arcpy.management.CreateCloudStorageConnectionFile(
#     out_folder_path, out_name, AZURE | AMAZON | GOOGLE | ALIBABA, bucket_name,
#     {access_key_id}, {secret_access_key}, {region}, {end_point},
#     { {Name} {Value}; {Name} {Value}...})
# arcpy.management.CreateCloudStorageConnectionFile(
#     out_folder_path, out_name, AZURE | AMAZON | GOOGLE | ALIBABA, bucket_name,
#     {access_key_id}, {secret_access_key}, {region}, {end_point},
#     {config_options})

import arcpy

outfolder = "C:/Workspace/connections"
connectname = "googlecloudos.acs"
provider = "GOOGLE"
accesskey = "AYOUER9273PJJNY"
secretkey = "NOGEOU1238987OUOUNOQEWQWEIO"
bucketname = "rasterstore"

# Create a connection to Google Cloud Storage
arcpy.management.CreateCloudStorageConnectionFile(
    outfolder, connectname, provider, bucketname, accesskey, secretkey)

CreateCloudStorageConnectionFile example 3 (Python window)

This is a Python sample for the CreateCloudStorageConnectionFile function to connect with token vending services.

#====================================
# CreateCloudStorageConnectionFile
# Usage:
# arcpy.management.CreateCloudStorageConnectionFile(
#     out_folder_path, out_name, AZURE | AMAZON | GOOGLE | ALIBABA, bucket_name,
#     {access_key_id}, {secret_access_key}, {region}, {end_point},
#     { {Name} {Value}; {Name} {Value}...})
# arcpy.management.CreateCloudStorageConnectionFile(
#     out_folder_path, out_name, AZURE | AMAZON | GOOGLE | ALIBABA, bucket_name,
#     {access_key_id}, {secret_access_key}, {region}, {end_point},
#     {config_options})

import arcpy

outfolder = "C:/Workspace/connections"
connectname = "planetary_landsat.acs"
provider = "AZURE"
accesskey = "landsateuwest"
secretkey = ""
bucketname = "landsat-c2"
folder = ""
region = ""
endpoint = ""
config_options = "ARC_TOKEN_OPTION_NAME AZURE_STORAGE_SAS_TOKEN; ARC_TOKEN_SERVICE_API https://planetarycomputer.microsoft.com/api/sas/v1/token/landsateuwest/landsat-c2"

# Create a connection to the Planetary Computer Landsat data collection
print(arcpy.management.CreateCloudStorageConnectionFile(
    outfolder, connectname, provider, bucketname, accesskey, secretkey,
    region, endpoint, config_options, folder))
print(arcpy.GetMessages())
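
The config_options string in example 3 combines the token type with a token service URL that ends in the storage account and container names. The sketch below assembles the same string from its parts. The helper name is hypothetical, and the URL pattern (base URL, then account, then container) is inferred from that single example, so it may not hold for other collections or token services.

```python
# Base URL taken from example 3 above.
PC_TOKEN_BASE = "https://planetarycomputer.microsoft.com/api/sas/v1/token"


def token_service_options(account, container,
                          token_option="AZURE_STORAGE_SAS_TOKEN",
                          base_url=PC_TOKEN_BASE):
    """Build the two token vending options as one config_options string."""
    # Assumed pattern: <base_url>/<account>/<container>
    url = f"{base_url.rstrip('/')}/{account}/{container}"
    return f"ARC_TOKEN_OPTION_NAME {token_option}; ARC_TOKEN_SERVICE_API {url}"


# Reproduces the config_options value used in example 3.
print(token_service_options("landsateuwest", "landsat-c2"))
```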

Environments

This tool does not use any geoprocessing environments.

Licensing information

  • Basic: Yes
  • Standard: Yes
  • Advanced: Yes
