GeoAnalytics Desktop tools provide a parallel processing framework for analysis on a desktop machine using Apache Spark. Through aggregation, regression, detection, and clustering, you can visualize, understand, and interact with big data. These tools work with big datasets and allow you to gain insight into your data through patterns, trends, and anomalies. The tools are integrated and run in ArcGIS Pro in the same way as other desktop geoprocessing tools.
GeoAnalytics Desktop tools are designed for large datasets; consequently, other desktop tools may be more appropriate for use with smaller datasets. GeoAnalytics Desktop tools require an initial startup time to implement the distributed processing, so they are optimal for larger datasets (hundreds of thousands or millions of records).
Similar to other tools in ArcGIS Pro, the performance of GeoAnalytics Desktop tools depends on the following:
- The size of the input data—Such as the number of features and number of fields.
- The input data source—For example, file geodatabase feature classes compared to shapefiles.
- The tool you are running—For example, Aggregate Points will complete execution quicker than Calculate Density with the same data and bin size.
- The parameters you use in the tool—For example, when using the Join Features tool, a smaller join distance will perform better than a larger one.
- The hardware on your ArcGIS Pro machine.
Considerations for data sources are discussed in the Data section below. Each GeoAnalytics Desktop tool topic includes a usage note about improving tool performance by modifying parameters.
Data
When running analysis, data that is colocated will have the best performance.
GeoAnalytics Desktop tools support the following data sources for input and output:
- Shapefiles
- File geodatabases
- Tables (such as .csv files)
Using shapefiles as input and output may be quicker than using file geodatabases for reading and writing with GeoAnalytics Desktop tools. File geodatabases have some benefits over shapefiles for analysis, though, so consider your data source carefully.
GeoAnalytics Desktop tools do not support the following data sources for input and output:
- Geopackages
- XY event layers
- Services, such as map and feature services
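The supported and unsupported lists above can be turned into a quick check before a script runs. The following is a minimal sketch, not an ArcGIS API: `classify_source` is a hypothetical helper that guesses the source type from the path text alone. It assumes that file geodatabase paths contain a folder ending in `.gdb`, that GeoPackage paths contain a file ending in `.gpkg`, and that services are given as URLs. XY event layers cannot be recognized from a path, so they are not covered.

```python
from pathlib import PureWindowsPath

# Hypothetical helper, not an ArcGIS API: classifies a dataset path
# against the supported and unsupported sources listed above.
SUPPORTED_SUFFIXES = {".shp": "shapefile", ".csv": "table"}


def classify_source(path: str) -> tuple[bool, str]:
    """Return (supported, description) based only on the path text."""
    lowered = path.lower()
    # Map and feature services are referenced by URL (assumption).
    if lowered.startswith(("http://", "https://")):
        return False, "service"
    p = PureWindowsPath(lowered)
    # A GeoPackage is a single .gpkg file; layers may be addressed inside it.
    if any(part.endswith(".gpkg") for part in p.parts):
        return False, "geopackage"
    # A file geodatabase is a folder whose name ends in .gdb.
    if any(part.endswith(".gdb") for part in p.parts):
        return True, "file geodatabase"
    if p.suffix in SUPPORTED_SUFFIXES:
        return True, SUPPORTED_SUFFIXES[p.suffix]
    return False, "unknown"
```

For example, `classify_source(r"C:\data\city.gdb\parcels")` reports a supported file geodatabase source, while a path inside a `.gpkg` file is reported as unsupported.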
Analysis
GeoAnalytics Desktop tools may not work the same as other ArcGIS Pro tools.
GeoAnalytics Desktop tools do not include polygonal slivers in their operations. In particular, the Join Features and Overlay Layers tools exclude slivers from the analysis.
GeoAnalytics Desktop tools produce less densified features than other ArcGIS Pro tools. For example, the following images show the vertices of a buffered polygon created using Buffer in the Analysis toolbox (blue) and Create Buffers in the GeoAnalytics Desktop toolbox (orange). The image on the left shows the buffered polygons overlaid on each other, and the image on the right shows a zoomed-in view of some of the polygon vertices.
GeoAnalytics Desktop tools do not support the in_memory workspace.
When running GeoAnalytics Desktop tools, the analysis takes place in memory. When the data being analyzed doesn't fit in memory, it is written to disk in your temp directory. If your tool isn't completing and is taking up space on your temp drive, you can change your Windows temp drive to a larger disk.
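Before starting a large analysis, it can help to check how much room the temp drive has. The following is a minimal sketch using only the Python standard library. It assumes that the tools write to the same temp directory that Python reports through `tempfile.gettempdir()` (which reads the `TMPDIR`, `TEMP`, and `TMP` environment variables). The function names are hypothetical, and how much space is enough depends on your data.

```python
import shutil
import tempfile


def temp_free_gib() -> float:
    """Free space, in GiB, on the drive that holds the temp directory."""
    usage = shutil.disk_usage(tempfile.gettempdir())
    return usage.free / 1024**3


def has_temp_headroom(required_gib: float) -> bool:
    """True if the temp drive has at least `required_gib` GiB free."""
    return temp_free_gib() >= required_gib


if __name__ == "__main__":
    print(f"Temp directory: {tempfile.gettempdir()}")
    print(f"Free space: {temp_free_gib():.1f} GiB")
```

If `has_temp_headroom` returns False for the amount of space you expect the analysis to need, consider pointing your Windows temp directory at a larger disk before running the tool.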
Best practices
When running analysis, it is best to analyze only the data you are interested in. You can limit the data you analyze by doing the following:
- Apply a definition query to a layer on your map.
- Apply a selection to features on your map.
- Set the processing extent of your analysis to limit the spatial extent of features used.
- Use the time slider to specify the extent of data to analyze.
When using GeoAnalytics Desktop tools, a definition query is typically processed more quickly than a selection.
Using time in analysis
Many GeoAnalytics Desktop tools use or require time. To take advantage of time stepping, temporal joins, or track-based analysis (for example, using Reconstruct Tracks, Find Dwell Locations, or Detect Incidents), your layers must be time enabled. Enable time on your layers by adding them to your map in ArcGIS Pro and setting the time properties on the data. When you set time, verify that the Time Format and Time Extent parameters match your data. If your values don't look correct, do one of the following to correctly format your time fields:
- Use the Convert Time Field tool to modify your time fields to a supported format.
- Modify your time fields when the time values are stored across separate fields.
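When a date and a time are stored in separate text fields, merging them into one value is a small string-parsing task. The following is a minimal sketch in plain Python. The `combine_date_time` name and the `YYYY-MM-DD` and `HH:MM:SS` input formats are assumptions for illustration, so adjust the format string to match your data.

```python
from datetime import datetime


def combine_date_time(date_text: str, time_text: str) -> datetime:
    """Merge separate date and time strings into one datetime value."""
    # Assumed formats: date as YYYY-MM-DD, time as HH:MM:SS (24-hour clock).
    return datetime.strptime(f"{date_text} {time_text}", "%Y-%m-%d %H:%M:%S")


print(combine_date_time("2021-03-04", "13:45:00").isoformat())
# prints 2021-03-04T13:45:00
```

A function like this could be adapted for use in a field calculation that writes the combined value to a single date field.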
When running analysis, you must enable time before adding your layer to the tool. If you set time after adding the layer to the tool parameter, you must add the layer again. If time is not enabled before you add the layer, you will receive a warning that time isn't enabled on the layer.
Similar to other geoprocessing tools, only features in the visible time extent will be analyzed.
To run temporal analysis on a layer using ArcPy or to share your time settings, create a layer file with your time settings and use that layer for analysis.
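The layer file approach can be sketched in ArcPy as follows. This is an illustration under stated assumptions, not a definitive workflow. It assumes that you have already saved a time-enabled layer as a `.lyrx` file from ArcGIS Pro, and it uses Reconstruct Tracks only as an example tool. The wrapper function, its suffix check, and the paths and field name in the usage note are hypothetical. Check the tool reference for the full parameter list before relying on the call.

```python
from pathlib import PureWindowsPath


def reconstruct_tracks_from_layer_file(layer_file, out_feature_class, track_field):
    """Run Reconstruct Tracks on a time-enabled layer file (.lyrx)."""
    # Assumption of this sketch: time settings live in a .lyrx layer file.
    if PureWindowsPath(layer_file).suffix.lower() != ".lyrx":
        raise ValueError(f"Expected a .lyrx layer file, got: {layer_file}")

    # Imported here so the path check works on machines without ArcGIS Pro.
    import arcpy

    # The layer file carries the time properties set in ArcGIS Pro.
    return arcpy.gapro.ReconstructTracks(
        input_layer=layer_file,
        out_feature_class=out_feature_class,
        track_fields=track_field,
        method="GEODESIC",
    )
```

On a machine with ArcGIS Pro, a call might look like `reconstruct_tracks_from_layer_file(r"C:\data\vessels_time.lyrx", r"C:\data\results.gdb\vessel_tracks", "vessel_id")`, where the paths and the `vessel_id` field are placeholders.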