Use multifile feature connections

You can configure multifile feature connections (MFCs), visualize their datasets on a map, and use those datasets in analysis.

Use an MFC

Once you have structured your data, you can do the following:

  1. Configure an MFC
  2. Visualize an MFC dataset
  3. Use MFC datasets in analysis

Configure an MFC

To get started, you need to create an MFC. There are two ways to create an MFC:

You may run into one of two issues when discovering datasets in your MFC:

  • Datasets that you expected are missing. In this case, verify that the source folder path you specified is correct, that the folder contains the dataset subfolders, and that the data is a supported data type.
  • One or more datasets fail to register. If datasets fail to register, you may note some of the following:

    Issue: The dataset is not in the expected format.
    Solution: Open the file to see if it looks as expected. If the data is structured incorrectly, update and try again.
    Example: A .csv file has a few lines and a summary of the data and then only empty lines.

    Issue: The schemas of datasets in a folder do not match.
    Solution: All files in a dataset folder must have the same schema. Open the files to compare the schemas. Resolve any mismatched schemas and try to register the dataset again.
    Example: You have one .csv file with 10 fields and another with 8.

    Issue: The file types of a dataset in a folder do not match.
    Solution: All files in a dataset folder must have the same extension (file type). Check the file types of the data source location and remove or relocate any misplaced files.
    Example: A shapefile dataset is in the same folder as a parquet file.

    Issue: You have an unrecognized field format.
    Solution: This is unlikely but may occur if ORC and parquet use an unexpected format. Ensure that you use valid field formats.
    Example: You have a parquet file with an unknown field format.
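Before registering datasets, it can help to scan the source folder for the file type and schema mismatches described above. The following is a minimal sketch in plain Python, not part of ArcGIS Pro: the function names, the `SHAPEFILE_PARTS` list, and the assumption that each subfolder of the source folder holds one dataset are illustrative only, and the schema comparison covers only the header rows of .csv files.

```python
import csv
from pathlib import Path

# Assumption: these sidecar extensions are treated as one "shapefile" type.
# The list is illustrative and may not be exhaustive.
SHAPEFILE_PARTS = {".shp", ".shx", ".dbf", ".prj", ".cpg"}


def file_type(path):
    """Return a normalized file type label for one file."""
    suffix = path.suffix.lower()
    return "shapefile" if suffix in SHAPEFILE_PARTS else suffix


def read_header(csv_path):
    """Return the first row of a delimited file as a tuple of field names."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return tuple(next(csv.reader(f), []))


def check_source_folder(source_folder):
    """Map each dataset subfolder name to a list of problems found in it."""
    report = {}
    subfolders = sorted(p for p in Path(source_folder).iterdir() if p.is_dir())
    for dataset_dir in subfolders:
        files = [p for p in dataset_dir.iterdir() if p.is_file()]
        problems = []
        types = sorted({file_type(p) for p in files})
        if not files:
            problems.append("no files found")
        elif len(types) > 1:
            problems.append("mixed file types: " + ", ".join(types))
        # Compare only the header rows of .csv files in this sketch.
        headers = {read_header(p) for p in files if file_type(p) == ".csv"}
        if len(headers) > 1:
            problems.append(
                f"mismatched schemas: {len(headers)} different .csv headers"
            )
        report[dataset_dir.name] = problems
    return report
```

Calling `check_source_folder` with the path to your source folder returns a dictionary keyed by subfolder name. An empty list means that no problems were detected by these two checks; it does not guarantee that the dataset will register.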

If you create an MFC using a delimited file and don't see header rows, you may have an invalid header row. Ensure that all fields have a header and that none are empty. If you're using the dialog box to create the MFC, you can update the field headers on the Fields pane. You can also update field names using the Update Multifile Feature Connection Dataset Properties tool.
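To check a delimited file's header row before creating the connection, you could use something like the following sketch. It is plain Python rather than an ArcGIS tool, and `check_header` is a hypothetical helper name. It flags empty headers and fields in the first record that have no header at all.

```python
import csv
import io


def check_header(text, delimiter=","):
    """Return a list of problems found in the header row of delimited text."""
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(rows, [])
    first_record = next(rows, [])
    # Flag headers that are empty or contain only whitespace.
    problems = [
        f"field {position} has an empty header"
        for position, name in enumerate(header, start=1)
        if not name.strip()
    ]
    # Flag values in the first record that have no matching header.
    missing = len(first_record) - len(header)
    if missing > 0:
        problems.append(f"{missing} field(s) have no header")
    return problems
```

For example, passing the text `"id,,name\n1,2,3\n"` reports that field 2 has an empty header. To check a file on disk, read its first few lines into a string and pass that string to the function.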

When you create an MFC, the schema, geometry, and time are discovered for each of your datasets. Often, you can adjust how the datasets represent those values. To verify that each dataset correctly represents the geometry, time, and fields, use the Describe Dataset geoprocessing tool. For example, when reviewing your datasets, you may want to make one or more of the following changes to one or more datasets in your MFC:

  • Change the field names of delimited datasets.
  • Modify which fields are visible for analysis.
  • Change the fields used to represent geometry or time.
  • Add a filter to a dataset.
  • Add an alias to a dataset.
  • Remove datasets from the MFC that you aren't interested in analyzing.
  • Refresh the MFC to include a newly added dataset (a new subfolder under the source folder).

To make these optional changes, you can use the New Multifile Feature Connection dialog box or any combination of the following tools:

Visualize an MFC dataset

You can visualize delimited- and shapefile-based MFC datasets on a map.

Note:
You cannot visualize MFC datasets that use parquet and ORC source files.

To add your dataset to the map, locate the MFC item in the Catalog pane, click to expand the datasets, and add the dataset to the map.

MFC datasets have a simplified experience in your map and have the following limitations:

  • When visualizing MFC datasets, the time properties in the MFC dataset properties are not automatically set in the new layer. To visualize the dataset with time, set the layer's time properties after adding the dataset to the map.
  • Drawing delimited files will zoom to the full extent of the MFC dataset's spatial reference.
  • If you add new records to an existing MFC dataset—for example, adding new rows to a CSV file in an existing MFC—the new records will not draw until you restart ArcGIS Pro.
  • If you add new files to an existing MFC dataset—for example, adding a new CSV file to an existing MFC dataset—the new records will not draw until you restart ArcGIS Pro.

Use MFC datasets in analysis

When MFC datasets are used as input to GeoAnalytics Desktop tools, analysis is optimized to read the data and run in parallel across the cores of your machine. For all other geoprocessing tools, MFC dataset reading and processing is not optimized to run in parallel; instead, it is sequential and single-threaded.
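The difference between sequential and parallel reading can be illustrated with a small plain-Python sketch that counts the records in each file of a dataset. This is a conceptual illustration only: it does not show how GeoAnalytics Desktop tools or other geoprocessing tools are implemented, and the function names are hypothetical.

```python
import csv
from concurrent.futures import ThreadPoolExecutor


def count_records(csv_path):
    """Count data rows (excluding the header row) in one delimited file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return max(sum(1 for _ in csv.reader(f)) - 1, 0)


def count_sequential(paths):
    """Read the files one after another on a single thread."""
    return [count_records(p) for p in paths]


def count_parallel(paths, max_workers=4):
    """Read the files concurrently; results keep the order of the input paths."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(count_records, paths))
```

Both functions return the same counts. Only the way the per-file work is scheduled differs, which is why a dataset split across many files can benefit from parallel reading.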

You can use MFC datasets based on delimited files or shapefiles in most geoprocessing tools.

Note:
MFC datasets using parquet and ORC source files can only be used in GeoAnalytics Desktop tools.

You cannot apply a selection to an MFC dataset when it's used as input to a GeoAnalytics Desktop tool.

To use an MFC dataset in a geoprocessing tool, add the MFC dataset to a map and select the layer name from the parameter choice list, or use the browse button to browse to an MFC workspace and select the input dataset. The following tools do not support MFC datasets as input:

  • Service-based tools, including GeoAnalytics Server, standard feature analysis, and ArcGIS Online analysis tools
  • Tools that modify the input dataset, such as Calculate Field and Near

