Set up custom data

Available with Business Analyst license.

Setting up statistical data in Business Analyst allows you to work with your data in all Business Analyst tools and workflows the same way as you would use standard Esri variables. For example, if you set up your Sales by Census Tract data, you can use it in workflows such as enriching a layer, building infographics, and performing a suitability analysis.

To set up your statistical data in Business Analyst, you define how the data is apportioned and aggregated. You do that by building a Statistical Data Collection (SDCX). An .sdcx file is a set of variables that you create from a polygon feature class or from a legacy Business Analyst data source (BDS) file. You can save a Statistical Data Collection file locally and share it with your organization through ArcGIS Online. A Statistical Data Collection that is shared through ArcGIS Online or ArcGIS Enterprise can be used in Business Analyst Web App. When you use custom data locally, build an SDCX index for the best performance. The SDCX index enables the application to process the variables from a local Statistical Data Collection faster.

As part of creating and editing a Statistical Data Collection, you can create custom script-based variables that combine your data with standard variables. For instance, a nonprofit could combine its own grant data with a Business Analyst at-risk population variable to calculate available per-capita funding. Such variables can serve as valuable criteria in workflows such as performing a suitability analysis.

The variables from a Statistical Data Collection in the ArcGIS Pro project become immediately available in the data browser for workflows such as enriching a layer, using the infographic editor, performing a suitability analysis, and other tools and workflows.

Create a Statistical Data Collection

A Statistical Data Collection is an .sdcx file that stores the path to a source dataset, information about aggregation and apportionment, and other custom data properties such as calculated variables. By default, the .sdcx file is stored in the project home folder. It is linked to the active Business Analyst dataset and references its data for apportionment and aggregation. The list of available apportionment methods can vary based on the data source. For example, for the US dataset, the list of these apportionment methods is defined by the census block centroid point layer.

Note:

It is possible to create and use a Statistical Data Collection without the locally installed dataset using data apportionment layers. Choose either of the following:

  • Use solely Geometric apportionment.
  • Provide the custom apportionment layer.

To learn more about apportionment methods, see Understand data apportionment.
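
As a rough illustration of the idea behind apportionment, the following sketch estimates how much of a source polygon's value falls inside a target area in two ways: by share of overlapping area (geometric) and by share of weighted points, such as block centroids that carry population. The function names and all numbers are hypothetical, and this is not a description of how Business Analyst implements apportionment internally.

```python
def geometric_share(overlap_area, source_area):
    """Fraction of the source polygon's area that falls inside the target."""
    if source_area <= 0:
        raise ValueError("source_area must be positive")
    return overlap_area / source_area


def weighted_share(point_weights, inside_flags):
    """Fraction of the total point weight that falls inside the target.

    point_weights: weight of each apportionment point in the source polygon
    inside_flags:  True where the matching point lies inside the target
    """
    total = sum(point_weights)
    if total == 0:
        return 0.0
    inside = sum(w for w, flag in zip(point_weights, inside_flags) if flag)
    return inside / total


# Hypothetical census tract with 1,000,000 in sales.
sales = 1_000_000

# A quarter of the tract's area overlaps the target area.
by_area = sales * geometric_share(overlap_area=2.5, source_area=10.0)

# Three points with populations 100, 300, and 600; only the last is inside.
by_points = sales * weighted_share([100, 300, 600], [False, False, True])

print(f"{by_area:,.0f}")    # 250,000
print(f"{by_points:,.0f}")  # 600,000
```

The two estimates differ because the weighted approach follows where people are located rather than how much land overlaps, which is why the choice of apportionment method can change the resulting values.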

If you are migrating from ArcMap, you might have already set up your custom data as a Business Analyst data source (BDS) file. You can create a Statistical Data Collection directly from the BDS file.

To create a Statistical Data Collection, do the following:

  1. Optionally, set the data source by doing the following:
    1. On the Analysis tab, click Business Analysis to open the gallery, and click Change data source.

      The Business Analyst Data Source window appears.

    2. Click the Computer folder and choose from the installed datasets—for example, USA_ESRI_2023—or choose the Custom Data folder option if you don't have any local datasets. Click OK.

      Your Statistical Data Collection is based on your own data, but it also uses some information from the Business Analyst dataset. It's important to select the correct data source before creating the Statistical Data Collection because of the following:

      • If the Business Analyst dataset is uninstalled or not available, the .sdcx might become inactive.
      • Business Analyst attempts to update the .sdcx to another version of the same country dataset when possible. For instance, you can create the SDCX using the US 2022 dataset and then uninstall that dataset when newer data is available. After you install the US 2023 dataset, the data collection continues to work, using the US 2023 dataset instead. This can change the resulting values because the apportionment layer in US 2023 might be different.

  2. On the Analysis tab, click Business Analysis to open the gallery, and click New Statistical Data Collection.

    The Create Statistical Data Collection window appears.

  3. To create a Statistical Data Collection from a polygon feature class, do the following:
    1. Use the Input Data drop-down menu or click the folder button to browse to and select an input feature class. Click OK.
    2. Use the Output Statistical Data Collection drop-down menu or click the folder button to select the output Statistical Data Collection. Click OK.
    3. Click Create.
  4. To create a Statistical Data Collection from a Business Analyst data source (BDS), do the following:
    1. Click the Import Legacy Data (BDS) button.
    2. Use the Input Business Analyst Datasource (BDS) drop-down menu or click the folder button to browse to and select an input BDS. If the BDS contains dynamic joins or calculations, Business Analyst will automatically create a feature class, which will be the input for the new Statistical Data Collection. Click OK.
    3. Click Import.

    The Statistical Data Collection editor window appears. It has the following tabs:

    • Source—This tab contains general information about the data, including the performance index status.
    • Variables—This tab contains general information for variable settings, such as Precision, Field Format, and Apportionment Method.
    • Properties—This tab allows you to edit the properties of the Statistical Data Collection, including Title, Data Vintage, Author, and Tags. You can also select a custom icon for the Statistical Data Collection.

  5. Optionally, click the Build Index button on the Source tab to build a performance index.

    Building a performance index is not required, but it is highly recommended to ensure the best performance when you use custom data in workflows.

    If you created the Statistical Data Collection from a BDS, it may contain a previously built performance index. In that case, you are prompted to build an equivalent index for the Statistical Data Collection. Click Yes to create the index.

  6. Click OK to save all changes.

    The Edit Statistical Data Collection window closes. The Catalog pane appears, and the newly created Statistical Data Collection is displayed.

  7. Optionally, right-click the Statistical Data Collection and click Edit to make further edits to your Statistical Data Collection.

Edit variables

You can edit variable properties on the Variables tab of the Statistical Data Collection editor window. You can change the properties of the physical variables, such as apportionment method, summary type, or precision. You can also add variables from the data browser, borrowing them from your local Business Analyst dataset. In this case, your Statistical Data Collection stores only references to the standard variables, rather than a physical copy of the corresponding data. Finally, you can create custom calculated variables.

You can edit the physical variables directly in the variables grid or in the Properties window.

To set up properties for one or more variables, do the following:

  1. Select one or more variables, right-click the selection, and click Properties.
  2. Modify any of the field properties, such as the following:
    • Use the Summary Type drop-down menu to change the type to sum or average.
    • Use the Apportionment Method drop-down menu to change the method to geometric (GEOM) or none.
    • Use the Field Format drop-down menu to change the format to count, percentage, or currency.
    • Use the Precision field to define the number of digits shown to the right of the decimal point.
    • The Category field shows the category of the variable. If it is a variable from the data browser, the category is Default.
    • The Alias field is the name of the variable in the data browser.
    • The Vintage field displays the data vintage.
  3. Click OK.

    The properties of the selected variables are modified and saved to the Statistical Data Collection.
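
To make the effect of these properties more concrete, the sketch below shows one way a summary type, a field format, and a precision could shape a reported value. It is an illustration only: the function names are hypothetical, the average is assumed to be a weighted mean, the currency symbol is assumed to be a dollar sign, and none of this is taken from the Business Analyst implementation.

```python
def aggregate(values, summary_type, weights=None):
    """Combine the pieces of a variable according to its summary type."""
    if summary_type == "sum":
        return sum(values)
    if summary_type == "average":
        # Assumption: a weighted mean, with equal weights when none are given.
        weights = weights or [1] * len(values)
        total_weight = sum(weights)
        if total_weight == 0:
            return 0.0
        return sum(v * w for v, w in zip(values, weights)) / total_weight
    raise ValueError(f"unsupported summary type: {summary_type}")


def format_value(value, field_format, precision):
    """Render a value using a field format and a number of decimal places."""
    if field_format == "count":
        return f"{value:,.{precision}f}"
    if field_format == "percentage":
        # Assumption: the value is already expressed in percent units.
        return f"{value:.{precision}f}%"
    if field_format == "currency":
        return f"${value:,.{precision}f}"
    raise ValueError(f"unsupported field format: {field_format}")


# Sales from three tracts are added together.
total_sales = aggregate([100, 250, 50], "sum")

# Average income is weighted by the number of households in each tract.
avg_income = aggregate([40000, 60000], "average", weights=[1, 3])

print(format_value(total_sales, "count", 0))    # 400
print(format_value(avg_income, "currency", 2))  # $55,000.00
```

A count such as sales is naturally summed across areas, while a rate or an average would be distorted by summing, which is why the summary type is set per variable.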

To add variables from the data browser, do the following:

  1. For Variables, click the Add button to open the data browser and add variables.

    From the data browser, you can search and browse for data by category, from custom data, saved variable lists, and from data you have saved as favorites.

  2. Click OK to add the variables.

    The variables are added to the Statistical Data Collection.

Calculated variables are new fields that you add to a Statistical Data Collection to use custom expressions for analysis. You can use calculated variables to create a new variable from existing Statistical Data Collection variables, third-party data, or a combination of the two. Examples include custom per-capita variables and density variables such as sales density or household density. A custom expression can be a simple calculation based on a single field, a combination of two or more existing fields, or a complex Python script. You can import or export calculated variables as a label expression file (.lxp). To learn more about the fundamentals of field calculations, see Calculate Field Python examples.

To add a calculated variable to a Statistical Data Collection, do the following:

  1. On the View tab, click the Catalog Pane button.

    The Catalog pane appears.

  2. On the Project tab, right-click the Statistical Data Collection and click Edit.

    The Statistical Data Collection editor window appears.

  3. To create a calculated variable, do the following:
    1. Click the Add Calculation button.

      The Calculate Variable window appears.

    2. Build a calculation in the Expression field.

      Note:

      Custom calculations and scripts support double, string, and integer field types. For example, specifying a string field allows flexibility to output textual descriptions, such as displaying a Dominant Tapestry Name, attribute properties, or a street name. To learn more, see ArcGIS field data types.

    3. Click the Verify button to verify the expression.
    4. Use the Name parameter to enter the field name.
    5. Use the Title parameter to enter the alias.
    6. Click OK.

      The new Calculated Variable field is added to the Statistical Data Collection.

    7. Optionally, select the field and click Edit Calculation or right-click the field and click Edit Calculation to edit the calculated variable.
  4. Click OK.
  5. Optionally, right-click the field and click Properties to modify any of the field properties, such as Vintage, Field Format, or Category.
  6. Optionally, select the field and click Delete to remove the calculated variable.
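
The logic behind a calculated variable can be prototyped as plain Python before it is entered in the Expression field. The following is a minimal sketch of the per-capita funding and sales density examples mentioned earlier. The function and argument names are hypothetical, and the way inputs are bound to fields in the Calculate Variable window is not shown here.

```python
def per_capita_funding(grant_total, at_risk_population):
    """Grant amount per at-risk person, or None when no population is reported."""
    if not at_risk_population:
        # Avoid dividing by zero or by a missing value.
        return None
    return grant_total / at_risk_population


def sales_density(total_sales, area):
    """Sales per unit of area, or None when the area is missing or zero."""
    if not area:
        return None
    return total_sales / area


# Hypothetical values for one area.
print(per_capita_funding(50000, 2000))  # 25.0
print(sales_density(1200000, 4))        # 300000.0
print(per_capita_funding(50000, 0))     # None
```

Guarding against a zero or missing denominator matters for ratio variables, because small areas can legitimately report no population or no area after apportionment.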

Create SDCX indexes

You can create Statistical Data Collection files with indexes to improve the performance of analysis. Statistical Data Collection (.sdcx) files can contain references to standard Esri data variables for use in geoprocessing tools, such as Enrich Layer. When using custom data, building an index is recommended for optimal performance.

To create SDCX indexes, do the following:

  1. On the Analysis tab, in the Geoprocessing group, click Tools.

    The Geoprocessing pane appears.

  2. On the Toolboxes tab, in the Business Analyst Tools section, expand the Statistical Data Collections toolset and click Generate SDCX Index.

    The Generate SDCX Index tool opens in the Geoprocessing pane.

  3. Click the Browse button to open the Input SDCX File dialog box and choose an existing .sdcx file from other projects or folders.
  4. Click Run.

Geoprocessing tools

The custom data workflow uses the Generate SDCX Index tool to create a performance index. You can use this geoprocessing tool to build an SDCX performance index directly in ArcGIS Pro, or as part of a Python script or a model.
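
As a sketch of the scripted route, the function below validates a path and then calls the tool. It assumes that the tool is exposed in ArcPy as arcpy.ba.GenerateSDCXIndex and accepts the path to the .sdcx file; confirm the exact name and parameters in the tool reference before relying on it. The build_sdcx_index wrapper and its run_tool argument are hypothetical helpers, added so that the logic can be exercised without an ArcGIS Pro installation.

```python
from pathlib import Path


def build_sdcx_index(sdcx_path, run_tool=None):
    """Build a performance index for a Statistical Data Collection file."""
    path = Path(sdcx_path)
    if path.suffix.lower() != ".sdcx":
        raise ValueError(f"expected an .sdcx file, got: {path.name}")

    if run_tool is None:
        # Imported lazily so the sketch can be read and tested without ArcGIS Pro.
        import arcpy

        # Assumption: the Generate SDCX Index tool is available under this name.
        run_tool = arcpy.ba.GenerateSDCXIndex

    return run_tool(str(path))


if __name__ == "__main__":
    # Stand-in for the geoprocessing tool; it only records what it was given.
    calls = []
    build_sdcx_index("sales_by_tract.sdcx", run_tool=calls.append)
    print(calls)  # ['sales_by_tract.sdcx']
```

In an ArcGIS Pro Python environment, calling build_sdcx_index with only the path would use the geoprocessing tool itself, which makes it possible to rebuild indexes for several collections in a loop or a scheduled script.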

You can use Statistical Data Collection variables in other geoprocessing workflows, such as Enrich Layer.