Summary
Updates the properties of a big data connection (BDC) dataset. This tool modifies field, geometry, time, and file settings for a specified BDC dataset.
Usage
This tool requires a BDC. To create a BDC, use the Create Big Data Connection tool.
Use this tool to modify BDC dataset schema, geometry, or time for use in analysis or visualization in scenarios such as the following:
- Your CSV dataset was registered with all string type fields and you want to set the fields as numeric for use in analysis.
- Your BDC dataset has attribute values for two separate locations, such as taxi pick-up and taxi drop-off spots, and you want to change the geometry you use for analysis.
- Your workflow requires that time is set on the input layer.
- You want to share a BDC dataset with a colleague who is only interested in a subset of features, so you add a definition query expression and hide some unused fields.
You can modify the following properties:
- Definition query—An expression used to limit the features used in analysis.
- Fields—The field name, field type, and visibility.
- Geometry—How geometry is represented. Geometry is not editable for shapefiles.
- Time—How time is represented.
- File—The file properties used to read the dataset.
Use the Big Data Connection Dataset parameter to specify the BDC dataset with the properties you want to modify. You can browse to the dataset or specify its path in the form c:\<path>\MyBDC.bdc\<dataset_name>, for example, c:\MyBDCFolder\MyBDC.bdc\earthquakes_dataset.
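As a small illustration, the dataset path can be assembled from its three parts. This is only a sketch using Python's standard pathlib module; the helper name bdc_dataset_path is hypothetical, and the folder, file, and dataset names are the example values given above.

```python
from pathlib import PureWindowsPath


def bdc_dataset_path(folder, bdc_file, dataset_name):
    """Join a folder, a .bdc file name, and a dataset name into one path."""
    if not bdc_file.lower().endswith(".bdc"):
        raise ValueError("expected a .bdc file name")
    # PureWindowsPath joins with backslashes on any operating system.
    return str(PureWindowsPath(folder) / bdc_file / dataset_name)


example_path = bdc_dataset_path(r"c:\MyBDCFolder", "MyBDC.bdc", "earthquakes_dataset")
print(example_path)
```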
Define an expression to limit the features used in analysis using the Expression parameter. Adding a filter to a BDC dataset is similar to applying a definition query to a dataset in your map: specify a SQL expression to filter features of interest.
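The following sketch shows one way to build such an expression and pass it to the tool. The helper names build_expression and apply_filter are hypothetical, the field names are placeholders, and the single-quote escaping of string values is an assumption based on common SQL usage rather than something this page specifies. The arcpy call runs only when apply_filter is called in an ArcGIS Pro Python environment.

```python
def build_expression(field, operator, value):
    """Return a simple SQL comparison such as 'COUNT > 500'."""
    allowed = {"=", "<>", ">", ">=", "<", "<="}
    if operator not in allowed:
        raise ValueError(f"unsupported operator: {operator}")
    # Assumption: string values are wrapped in single quotes, with embedded
    # single quotes doubled. Numbers are written as they are.
    if isinstance(value, str):
        value = "'" + value.replace("'", "''") + "'"
    return f"{field} {operator} {value}"


def apply_filter(bdc_dataset, expression):
    """Set the definition query on a BDC dataset (requires ArcGIS Pro)."""
    import arcpy  # imported lazily so build_expression works without ArcGIS
    return arcpy.gapro.UpdateBDCDatasetProperties(bdc_dataset, expression)


expression = build_expression("COUNT", ">", 500)
print(expression)
```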
You can update the field type for delimited files. You cannot update the field type for other data sources (such as shapefile, ORC, or parquet files).
You can modify the geometry for delimited files, ORC, and parquet files. You cannot modify the geometry for a shapefile-sourced dataset.
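The two rules above can be captured in a small lookup, for example to validate a batch of planned edits before running the tool. This is a sketch only: the EDITABLE table and the can_modify helper are hypothetical names, and the table encodes just the two rules stated here.

```python
# Source types for which each property can be edited, as stated above.
EDITABLE = {
    "field_type": {"delimited"},
    "geometry": {"delimited", "orc", "parquet"},
}


def can_modify(source_type, prop):
    """Return True if the property is editable for the given source type."""
    return source_type.lower() in EDITABLE[prop]


print(can_modify("shapefile", "geometry"))
print(can_modify("parquet", "geometry"))
```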
The following table outlines how to specify time formats for the Start Time and End Time parameters when you edit a BDC dataset. The examples show how to represent the time January 2, 2016, at 9:45:02.05 PM.
Time formats in big data connections
Symbol | Meaning | Example |
yy | The year, represented by two digits. | 16 |
yyyy | The year, represented by four digits. | 2016 |
MM | The month, represented numerically. | 01 or 1 |
MMM | The month, represented using three letters. | Jan |
MMMM | The month, represented using the complete spelling. | January |
dd | The day. | 02 or 2 |
HH | The hour when using a 24-hour day; values range from 0-23. | 21 |
hh | The hour when using a 12-hour day; values range from 1-12. | 9 |
mm | The minute; values range from 0-59. | 45 |
ss | The second; values range from 0-59. | 02 |
SSS | The millisecond; values range from 0-999. | 50 |
a | The AM/PM marker. | PM |
epoch_millis | The time in milliseconds from epoch. | 1509581781000 |
epoch_seconds | The time in seconds from epoch. | 1509747601 |
Z | The time zone offset expressed in hours. | -0100 or -01:00 |
ZZZ | The time zone offset expressed using IDs. | America/Los_Angeles |
'' | Use single quotes to add text that doesn't represent a value outlined in this table. | 'T' |
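The two epoch symbols differ only by a factor of 1,000. As a worked example, the sketch below converts the epoch values from the table into readable timestamps using Python's standard datetime module. It assumes the epoch is the Unix epoch (January 1, 1970, UTC); the helper names are hypothetical.

```python
from datetime import datetime, timezone


def from_epoch_seconds(value):
    """Interpret a value as seconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(value, tz=timezone.utc)


def from_epoch_millis(value):
    """Interpret a value as milliseconds since the Unix epoch, in UTC."""
    return from_epoch_seconds(value / 1000)


print(from_epoch_millis(1509581781000).isoformat())   # 2017-11-02T00:16:21+00:00
print(from_epoch_seconds(1509747601).isoformat())     # 2017-11-03T22:20:01+00:00
```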
The following table shows examples for different formats of the same date, January 2, 2016, at 9:45:02.05 PM:
Time format examples
Input date | Date format |
01/02/2016 9:45:02PM | MM/dd/yyyy hh:mm:ssa |
Jan02-16 21:45:02 | MMMdd-yy HH:mm:ss |
January 02 2016 9:45:02.050PM | MMMM dd yyyy hh:mm:ss.SSSa |
01/02/2016T21:45:02-0000 | MM/dd/yyyy'T'HH:mm:ssZ |
You can specify the time zone using one of the following:
- The full name of the time zone: Pacific Standard Time
- The time zone offset expressed in hours: -0100 or -01:00
- The UTC or GMT abbreviation
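Before running the tool, it can help to check that a time format string actually matches sample values from the source data. The sketch below translates a subset of the symbols listed above into Python strptime directives and parses a sample value. The helper names and the symbol mapping are assumptions made for illustration, not part of the tool. The sketch does not handle epoch_millis, epoch_seconds, or ZZZ, and it treats SSS as a fractional second.

```python
import re
from datetime import datetime

# Hypothetical mapping from BDC time symbols to Python strptime directives.
_TOKEN_MAP = {
    "yyyy": "%Y", "yy": "%y",
    "MMMM": "%B", "MMM": "%b", "MM": "%m",
    "dd": "%d", "HH": "%H", "hh": "%I",
    "mm": "%M", "ss": "%S", "SSS": "%f",
    "a": "%p", "Z": "%z",
}
# Quoted literals are matched first; longer symbols come before shorter ones.
_TOKEN_RE = re.compile(r"'([^']*)'|(yyyy|yy|MMMM|MMM|MM|dd|HH|hh|mm|ss|SSS|a|Z)")


def bdc_to_strptime(bdc_format):
    """Translate a BDC time format into a strptime format string."""
    def repl(match):
        literal, token = match.group(1), match.group(2)
        if literal is not None:
            # Text inside single quotes is copied through unchanged.
            return literal.replace("%", "%%")
        return _TOKEN_MAP[token]
    return _TOKEN_RE.sub(repl, bdc_format)


def parse_bdc_time(value, bdc_format):
    """Parse a sample value using a BDC time format."""
    return datetime.strptime(value, bdc_to_strptime(bdc_format))


print(bdc_to_strptime("MM/dd/yyyy hh:mm:ssa"))
print(parse_bdc_time("01/02/2016 9:45:02PM", "MM/dd/yyyy hh:mm:ssa"))
```

A ValueError from parse_bdc_time suggests that the format string and the sample value disagree.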
You can modify the following properties of a delimited file:
- Field Delimiter—The delimiter for each field. Common delimiters are , and ;.
- Record Terminator—The terminator for each row of data. Common terminators are \n and \t.
- Quote Character—The character used for quotes in the source dataset.
- Has Header Row—A true or false value indicating whether the source dataset includes headers. If a header row is included in the dataset, the headers will be used for the field names.
- Encoding—The encoding type used by the source dataset. The default is UTF-8.
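If you are unsure which values to use for these properties, you can inspect a few rows of the source file first. The sketch below uses the Sniffer class from Python's standard csv module to guess the delimiter, quote character, and header row from a text sample. The helper name and the sample data are hypothetical, and the Sniffer results are heuristic guesses that should be checked against the actual file.

```python
import csv


def guess_delimited_properties(sample, candidates=",;\t|"):
    """Guess delimited-file properties from a short text sample."""
    sniffer = csv.Sniffer()
    dialect = sniffer.sniff(sample, delimiters=candidates)
    return {
        "field_delimiter": dialect.delimiter,
        "quote_character": dialect.quotechar,
        "has_header_row": sniffer.has_header(sample),  # heuristic guess
    }


# Hypothetical sample: three semicolon-delimited rows, the first a header.
sample = "id;name;magnitude\n1;alpha;4.5\n2;beta;5.1\n"
guess = guess_delimited_properties(sample)
print(guess)
```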
The Update Big Data Connection Dataset Properties tool updates the properties of an individual dataset. Use the following tools to modify a BDC:
- Copy Dataset From Big Data Connection—Copies a dataset from a BDC to a feature class.
- Duplicate Dataset From Big Data Connection—Creates a view of an existing BDC dataset.
- Refresh Big Data Connection—Checks for any new datasets and adds them to the BDC.
- Remove Dataset From Big Data Connection—Removes a dataset from the BDC.
- Update Big Data Connection Dataset Properties—Modifies the properties of an individual BDC dataset.
- Preview Dataset From Big Data Connection—Previews the first ten features in your dataset to verify they are correctly registered.
- Describe Dataset—Allows you to confirm that the dataset displays as expected.
You can optionally edit your BDC file manually. Always modify the .bdc file manually in the following situations:
- You have more than one field used to represent the x-, y-, or z-location.
- You want to update the source path.
Learn more about big data connection file formatting.
This geoprocessing tool is powered by Spark. See Big data connections to learn more about big data connections and how to use them.
Syntax
arcpy.gapro.UpdateBDCDatasetProperties(bdc_dataset, {expression}, {field_properties}, {geometry_type}, {spatial_reference}, {geometry_format_type}, {geometry_field}, {x_field}, {y_field}, {z_field}, {time_type}, {time_zone}, {start_time_format}, {end_time_format}, {file_extension}, {field_delimiter}, {record_terminator}, {quote_character}, {has_header_row}, {encoding})
Parameter | Explanation | Data Type |
bdc_dataset | The BDC dataset to update. The options for editing will differ depending on the source data (shapefile, delimited file, ORC, or parquet file). | Table View |
expression (Optional) | An expression used to limit the features that will be used in analysis. | SQL Expression |
field_properties [field_properties,...] (Optional) | Specifies the field names and properties to modify. Specifies whether fields will be visible or hidden. | Value Table |
geometry_type (Optional) | Specifies the type of geometry that will be used to spatially represent the dataset. The geometry cannot be modified for shapefile-sourced datasets. | String |
spatial_reference (Optional) | The WKID value or WKT string that will be used for the spatial reference of the dataset. The default is WKID 4326 (WGS84). The spatial reference cannot be modified for shapefile-sourced data. | String |
geometry_format_type (Optional) | Specifies how the geometry will be formatted. The geometry cannot be modified for shapefile-sourced data. | String |
geometry_field (Optional) | A single field used to represent the geometry. This field is used when the geometry format is WKT, WKB, GeoJSON, or EsriJSON. | String |
x_field (Optional) | The field used to represent the x-location. If you have more than one field representing the x-location, modify the .bdc file manually. | String |
y_field (Optional) | The field used to represent the y-location. If you have more than one field representing the y-location, modify the .bdc file manually. | String |
z_field (Optional) | The field used to represent the z-location. If you have more than one field representing the z-location, modify the .bdc file manually. | String |
time_type (Optional) | Specifies the time type used to temporally represent the dataset. | String |
time_zone (Optional) | The time zone of the dataset. | String |
start_time_format [start_time_format,...] (Optional) | The fields used to define the start time and the time formatting. | Value Table |
end_time_format [end_time_format,...] (Optional) | The fields used to define the end time and the time formatting. | Value Table |
file_extension (Optional) | The file extension of the source dataset. The parameter value cannot be modified. | String |
field_delimiter (Optional) | The field delimiter used in the source dataset. | String |
record_terminator (Optional) | The record terminator used in the source dataset. | String |
quote_character (Optional) | The quote character used in the source dataset. | String |
has_header_row (Optional) | Specifies whether the source dataset includes a header row. | Boolean |
encoding (Optional) | The type of encoding used by the source dataset. By default UTF-8 is used. | String |
Derived Output
Name | Explanation | Data Type |
updated_bdc | The updated BDC file with edited properties applied to the specified dataset. | File |
Code sample
The following Python script demonstrates how to use the UpdateBDCDatasetProperties function.
# Name: UpdateBDCDatasetProperties.py
# Description: Add a filter and modify the schema, time, and geometry for a BDC dataset
# Requirements: ArcGIS Pro Advanced License

# Import system modules
import arcpy

# Set local variables
dataset = r"c:\Projects\MyProjectFolder\my_BigDataConnection.bdc\myBigDataset"
expression = "COUNT > 500"  # limits the features used in analysis
field_properties = "Field1 FLOAT true;Field2 STRING true;Field3 DOUBLE true"
geometry_type = "POINT"
sref = "4326"  # WKID for WGS84
geometry_format = "XYZ"
geometry_field = ""  # only used for WKT, WKB, GeoJSON, or EsriJSON formats
x_field = "Long"
y_field = "Lat"
z_field = ""
time_type = "INSTANT"
time_zone = "UTC"
start_time_format = "Year yyyy"  # field name followed by its time format
end_time_format = None  # no end time is set in this example
file_extension = "csv"
field_delimiter = ","
record_terminator = r"\n"
quote_character = '"'
has_header_row = True
encoding = "UTF-8"

# Execute Update BDC Dataset Properties
arcpy.gapro.UpdateBDCDatasetProperties(
    dataset, expression, field_properties, geometry_type, sref, geometry_format,
    geometry_field, x_field, y_field, z_field, time_type, time_zone,
    start_time_format, end_time_format, file_extension, field_delimiter,
    record_terminator, quote_character, has_header_row, encoding)
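The field_properties value in the sample is a semicolon-separated list of entries, each made up of a field name, a field type, and a visibility flag. When many fields are involved, the string can be assembled from a list instead of typed by hand. This is a sketch only: the helper name build_field_properties is hypothetical, and using "false" for hidden fields is an assumption, because the sample shows only "true".

```python
def build_field_properties(fields):
    """Build a field_properties string from (name, type, visible) tuples."""
    parts = []
    for name, field_type, visible in fields:
        # Assumption: hidden fields use "false"; the sample only shows "true".
        flag = "true" if visible else "false"
        parts.append(f"{name} {field_type.upper()} {flag}")
    return ";".join(parts)


field_properties = build_field_properties([
    ("Field1", "float", True),
    ("Field2", "string", True),
    ("Field3", "double", True),
])
print(field_properties)
```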
Environments
Licensing information
- Basic: No
- Standard: No
- Advanced: Yes