You can configure multifile feature connections (MFCs), visualize their datasets, and use those datasets in analysis.
Use an MFC
Once you have structured your data, you can do the following:
- Configure an MFC
- Visualize an MFC dataset
- Use MFC datasets in analysis
Configure an MFC
To get started, create an MFC in one of the following two ways:
- Use the New Multifile Feature Connection dialog box. To open the dialog box, on the Insert ribbon, click Connections, and select New Multifile Feature Connection. The dialog box provides an interactive experience to create an MFC and configure properties on each dataset.
- Use the Create Multifile Feature Connection geoprocessing tool.
You may run into one of two issues when discovering datasets in your MFC:
- Datasets that you expected are missing. In this case, verify that the source folder path you specified is correct, that the source folder contains the dataset subfolders, and that the data is a supported type.
- One or more datasets fail to register. If datasets fail to register, you may note some of the following:
| Issue | Solution | Example |
| --- | --- | --- |
| The dataset is not in the expected format. | Open the file to see if it looks as expected. If the data is structured incorrectly, update it and try again. | A .csv file has a few lines and a summary of the data and then only empty lines. |
| The schemas of datasets in a folder do not match. | All files in a dataset folder must have the same schema. Open the files to compare the schemas. Resolve any mismatched schemas and try to register the dataset again. | You have one .csv file with 10 fields and another with 8. |
| The file types of a dataset in a folder do not match. | All files in a dataset folder must have the same extension (file type). Check the file types in the data source location and remove or relocate any misplaced files. | A shapefile dataset is in the same folder as a parquet file. |
| You have an unrecognized field format. | This is unlikely but may occur if ORC and parquet files use an unexpected format. Ensure that you use valid field formats. | You have a parquet file with an unknown field format. |
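The schema and file type checks described above can be run ahead of time with a short script. The following is a minimal sketch using only the Python standard library, not part of ArcGIS Pro. The function name `check_dataset_folder` is hypothetical, the schema comparison looks only at the header row of .csv files, and the list of shapefile sidecar extensions to ignore is an assumption.

```python
import csv
from pathlib import Path

# Assumption: these companion files belong to a shapefile and should not
# be counted as a second file type in the folder.
SHAPEFILE_SIDECARS = {".dbf", ".shx", ".prj", ".cpg", ".sbn", ".sbx"}


def check_dataset_folder(folder):
    """Return a list of problems found in one dataset folder (empty if none)."""
    files = sorted(p for p in Path(folder).iterdir() if p.is_file())
    problems = []

    # All files in a dataset folder must share one extension (file type).
    extensions = sorted(
        {p.suffix.lower() for p in files} - SHAPEFILE_SIDECARS
    )
    if len(extensions) > 1:
        problems.append("mixed file types: " + ", ".join(extensions))

    # All delimited files must share one schema; compare header rows only.
    headers = {}
    for p in files:
        if p.suffix.lower() == ".csv":
            with p.open(newline="", encoding="utf-8") as f:
                headers[p.name] = next(csv.reader(f), [])
    if len({tuple(h) for h in headers.values()}) > 1:
        counts = ", ".join(
            f"{name} has {len(h)} fields" for name, h in headers.items()
        )
        problems.append("mismatched schemas: " + counts)

    return problems
```

Calling `check_dataset_folder` on each subfolder of the source folder before you create the MFC gives a quick list of folders to fix. It does not check field formats inside parquet or ORC files.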
If you create an MFC using a delimited file and the header row is not detected, the header row may be invalid. Ensure that all fields have a header and that none are empty. If you're using the dialog box to create the MFC, you can update the field headers on the Fields pane. You can also update field names using the Update Multifile Feature Connection Dataset Properties tool.
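To find empty headers before registering a delimited file, you can inspect its first row. This is a minimal sketch using the Python standard library. The function name `find_header_problems` is hypothetical, and the check covers only missing or empty header values, not every rule ArcGIS Pro may apply to field names.

```python
import csv
import io


def find_header_problems(text, delimiter=","):
    """Check the first row of delimited text for missing or empty field names."""
    header = next(csv.reader(io.StringIO(text), delimiter=delimiter), [])
    problems = []
    if not header:
        problems.append("file has no header row")
    # Report empty headers by their 1-based column position.
    for position, name in enumerate(header, start=1):
        if not name.strip():
            problems.append(f"field {position} has an empty header")
    return problems
```

For example, passing the contents of a file whose first line is `id,,lat` reports that field 2 has an empty header.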
When you create an MFC, the schema, geometry, and time are discovered for each of your datasets. You can often adjust how a dataset represents those values. To verify that each dataset correctly represents the geometry, time, and fields, use the Describe Dataset geoprocessing tool. When reviewing your datasets, you may want to make one or more of the following changes to one or more datasets in your MFC:
- Change the field names of delimited datasets.
- Modify which fields are visible for analysis.
- Change the fields used to represent geometry or time.
- Add a filter to a dataset.
- Add an alias to a dataset.
- Remove datasets from the MFC that you aren't interested in analyzing.
- Refresh the MFC to include a newly added dataset (a new subfolder under the source folder).
To make these optional changes, you can use the New Multifile Feature Connection dialog box or any combination of the following tools:
- Copy Dataset From Multifile Feature Connection—Copies a dataset from an MFC to a feature class.
- Duplicate Dataset From Multifile Feature Connection—Creates a view of an existing MFC dataset.
- Refresh Multifile Feature Connection—Checks for any new datasets and adds them to the MFC.
- Remove Dataset From Multifile Feature Connection—Removes a dataset from the MFC.
- Update Multifile Feature Connection Dataset Properties—Modifies the properties of an individual MFC dataset.
- Preview Dataset From Multifile Feature Connection—Previews the first ten features in your dataset to verify they are correctly registered.
- Describe Dataset—Verifies that the dataset looks as expected.
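Because a new dataset is a new subfolder under the source folder, you can check which subfolders are not yet registered before you refresh the MFC. This is a minimal sketch using the Python standard library. The function name `find_unregistered_datasets` is hypothetical, and it assumes that each dataset name matches its subfolder name and that names are compared without regard to case.

```python
from pathlib import Path


def find_unregistered_datasets(source_folder, registered_names):
    """List subfolders of the source folder that are not in registered_names."""
    # Assumption: dataset names equal subfolder names, compared case-insensitively.
    registered = {name.lower() for name in registered_names}
    return sorted(
        p.name
        for p in Path(source_folder).iterdir()
        if p.is_dir() and p.name.lower() not in registered
    )
```

An empty result suggests that a refresh has nothing new to add. Loose files in the source folder itself are ignored because only subfolders are listed.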
Visualize an MFC dataset
You can visualize delimited- and shapefile-based MFC datasets on a map.
Note: You cannot visualize MFC datasets that use parquet and ORC source files.
To add your dataset to the map, locate the MFC item in the Catalog pane, click to expand the datasets, and add the dataset to the map.
MFC datasets have a simplified experience in your map and have the following limitations:
- When visualizing MFC datasets, the time properties in the MFC dataset properties are not automatically set in the new layer. To visualize the dataset with time, set the layer's time properties after adding the dataset to the map.
- Drawing delimited files will zoom to the full extent of the MFC dataset's spatial reference.
- If you add new records to an existing MFC dataset—for example, adding new rows to a CSV file in an existing MFC—the new records will not draw until you restart ArcGIS Pro.
- If you add new files to an existing MFC dataset—for example, adding a new CSV file to an existing MFC dataset—the new records will not draw until you restart ArcGIS Pro.
Use MFC datasets in analysis
When MFC datasets are used as input to GeoAnalytics Desktop tools, analysis is optimized to read the data and run in parallel across the cores of your machine. For all other geoprocessing tools, MFC dataset reading and processing is not optimized to run in parallel; it is sequential and single-threaded.
You can use MFC datasets based on delimited files or shapefiles in most geoprocessing tools.
Note: MFC datasets using parquet and ORC source files can only be used in GeoAnalytics Desktop tools.
You cannot apply a selection to an MFC dataset when it's used as input to a GeoAnalytics Desktop tool.
To use an MFC dataset in a geoprocessing tool, add the MFC dataset to a map and select its layer name from the parameter's choice list, or use the browse button to browse to an MFC workspace and select the input dataset. The following tools do not support MFC datasets as input:
- Service-based tools, including GeoAnalytics Server, standard feature analysis, and ArcGIS Online analysis tools
- Tools that modify the input dataset, such as Calculate Field and Near
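The file type rules from the visualization and analysis sections can be summarized as a small lookup. This is a minimal sketch, not an ArcGIS Pro API. The function name `supported_uses` and the dictionary keys are hypothetical, and the extension sets are an assumption limited to the file types named in this topic.

```python
# Assumption: only the file types named in this topic are covered.
DELIMITED_OR_SHAPEFILE = {".csv", ".shp"}
GEOANALYTICS_ONLY = {".parquet", ".orc"}


def supported_uses(extension):
    """Return where an MFC dataset with this source file type can be used."""
    ext = extension.lower()
    if ext in DELIMITED_OR_SHAPEFILE:
        # "other_gp_tools" means most geoprocessing tools, excluding
        # service-based tools and tools that modify the input dataset.
        return {"map": True, "geoanalytics_desktop": True, "other_gp_tools": True}
    if ext in GEOANALYTICS_ONLY:
        return {"map": False, "geoanalytics_desktop": True, "other_gp_tools": False}
    raise ValueError(f"file type not covered by this sketch: {extension}")
```

For example, `supported_uses(".parquet")` reports that the dataset can be used in GeoAnalytics Desktop tools but cannot be drawn on a map or used in other geoprocessing tools.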