Parallel processing with Spatial Analyst

Available with Spatial Analyst license.

For some tools, Spatial Analyst offers enhanced performance with the use of parallel processing. This technology leverages the multi-core processors on modern computing hardware to complete processing tasks more quickly.

The following tools, listed by toolset, currently support parallel processing:

  • Density:

    Calculate Kernel Density Ratio, Kernel Density

  • Distance:

    Distance Accumulation, Distance Allocation, Least Cost Corridor

  • Distance (Legacy):

    Cost Allocation, Cost Back Link, Cost Distance, Euclidean Allocation, Euclidean Back Direction, Euclidean Direction, Euclidean Distance, Path Distance, Path Distance Allocation, Path Distance Back Link

  • Extraction:

    Sample

  • Generalization:

    Aggregate, Boundary Clean, Expand, Nibble, Shrink

  • Hydrology:

    Fill, Flow Accumulation, Flow Direction, Flow Distance, Sink, Storage Capacity, Stream Link, Watershed

  • Multidimensional Analysis:

    Aggregate Multidimensional Raster, Generate Multidimensional Anomaly

  • Neighborhood:

    Focal Statistics

  • Overlay:

    Weighted Overlay, Weighted Sum

  • Reclass:

    Reclassify, Slice

  • Segmentation and Classification:

    Classify Raster, Compute Segment Attributes, Export Training Data For Deep Learning, Inspect Training Samples, Linear Spectral Unmixing, Remove Raster Segment Tiling Artifacts, Segment Mean Shift, Train ISO Cluster Classifier, Train Support Vector Machine Classifier

  • Surface:

    Contour, Contour List, Geodesic Viewshed, Surface Parameters

  • Zonal:

    Zonal Statistics, Zonal Statistics as Table

What is parallel processing?

In parallel processing, a computing task is broken up into smaller portions, which are then sent to the available computing cores for processing. The software reassembles the results of the separate operations into a final result, usually in less time than it would take a single core to process the entire dataset by itself.
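The split, process, and reassemble pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Spatial Analyst is implemented: the function names are hypothetical, the "raster" is a small list of rows, and a thread pool stands in for the worker cores (in standard CPython, threads show the pattern but do not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(grid, n_parts):
    """Split a list of rows into at most n_parts contiguous blocks."""
    size = max(1, -(-len(grid) // n_parts))  # ceiling division, at least 1
    return [grid[i:i + size] for i in range(0, len(grid), size)]


def process_block(block):
    """Stand-in for a per-block raster operation: double every cell."""
    return [[cell * 2 for cell in row] for row in block]


def process_in_parallel(grid, workers=4):
    """Send each block to a worker, then reassemble the results in order."""
    blocks = split_rows(grid, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(process_block, blocks))  # map keeps block order
    return [row for block in results for row in block]


# A hypothetical 6 x 4 grid processed in three blocks of two rows each.
grid = [[r * 10 + c for c in range(4)] for r in range(6)]
result = process_in_parallel(grid, workers=3)
print(result[0])   # first row of the reassembled output
```

Because the blocks are reassembled in their original order, the output matches what a single worker would produce for the whole grid.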

Most modern computers have multi-core CPUs. In a multi-core chip, each physical CPU has several processing cores on the same silicon die. Microprocessors typically have 2, 4, 8, or more cores, and some have 6 or 12. Some computers have multiple CPUs, so the total number of cores available on a system is the number of cores per CPU multiplied by the number of CPUs on the main logic board.
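As a small worked example of the arithmetic above, the following Python sketch multiplies cores per CPU by the number of CPUs. The helper name `total_cores` is hypothetical. The call to `os.cpu_count()` from the standard library reports the logical processors visible to the operating system, which can be higher than the physical core count on hardware that exposes more than one logical processor per core.

```python
import os


def total_cores(cores_per_cpu, cpu_count):
    """Total cores = cores per CPU multiplied by the number of CPUs."""
    return cores_per_cpu * cpu_count


# Example: a workstation with two 8-core CPUs.
print(total_cores(8, 2))  # 16

# Logical processors reported by the operating system (may be None).
print(os.cpu_count())
```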

Controlling parallel processing with environments

For the tools that support parallel processing, the general behavior is to use 50 percent of the available processing cores by default. There are some variations between tools, so be sure to check the usage notes for each tool carefully.

You can use the Parallel Processing Factor environment to control the number of processors that can be applied to an operation.
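To make the relationship between a factor and a number of processes concrete, here is a minimal Python sketch. It assumes the factor is given either as a percentage of the available cores (for example "50%") or as a plain number of processes (for example "4"). The helper `workers_from_factor` is hypothetical, and truncating the percentage result to a whole number is an assumption of this sketch, so check the environment's documentation for the exact rules.

```python
import os


def workers_from_factor(factor, cores=None):
    """Hypothetical helper: convert a factor such as "50%" or "4" to a process count."""
    cores = cores or os.cpu_count() or 1
    text = str(factor).strip()
    if text.endswith("%"):
        # Percentage of the available cores (truncation is an assumption).
        workers = int(cores * float(text[:-1]) / 100)
    else:
        # Plain number of processes.
        workers = int(text)
    return max(workers, 0)


print(workers_from_factor("50%", cores=8))  # 4, the default share described above
print(workers_from_factor("4", cores=8))    # 4
```

In an ArcGIS Python environment, the environment itself is set with `arcpy.env.parallelProcessingFactor = "50%"` before running a tool.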

Behavior also depends on the size of the data being processed. With most tools, parallel processing is enabled automatically when the input rasters are larger than 5,000 rows by 5,000 columns. Smaller inputs may not see an appreciable improvement in performance because of the computational cost of splitting up the input and starting the parallel processes. You can override this behavior by specifying a value for the Parallel Processing Factor environment.
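The size rule above can be expressed as a simple check. This sketch assumes that "larger than 5K x 5K" means both the row count and the column count exceed 5,000; that reading, the constant name, and the function name are assumptions for illustration only.

```python
PARALLEL_THRESHOLD = 5000  # rows and columns, taken from the text above


def exceeds_parallel_threshold(rows, cols, threshold=PARALLEL_THRESHOLD):
    """Return True when both dimensions are larger than the threshold."""
    return rows > threshold and cols > threshold


print(exceeds_parallel_threshold(8000, 6000))  # True
print(exceeds_parallel_threshold(4000, 9000))  # False
```

In an ArcGIS Python environment, the dimensions could be read from the `height` and `width` properties of an `arcpy.Raster` object.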

The TempFolders system environment

Some tools use a Windows system environment variable to control where temporary data is written during parallel processing. To set it, open System Properties, click the Advanced tab, and then click Environment Variables. Click New to open the New System Variable dialog box. Enter TempFolders for the variable name. For the variable value, specify the path to a local folder where the temporary data will be written. Click OK when you are done. You may need to restart your machine for the change to take effect.

Note:

Some details may vary based on your exact version of the Microsoft Windows operating system. Consult your system administrator for assistance.

The following tools use the TempFolders system environment variable:

  • Distance (Legacy): Cost Allocation, Cost Back Link, Cost Distance, Path Distance, Path Distance Allocation, Path Distance Back Link
  • Generalization: Nibble
  • Hydrology: Fill, Flow Accumulation, Flow Direction, Flow Distance, Sink, Stream Link, Watershed
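
After creating the TempFolders variable, a quick check along these lines can confirm that it is visible to a Python session and points to a writable folder. This is a sketch using only the Python standard library; the function name `check_temp_folders` and its messages are hypothetical, and the second call uses a throwaway scratch folder purely for demonstration.

```python
import os
import tempfile


def check_temp_folders(env=None):
    """Report whether the TempFolders variable points to a writable folder."""
    env = os.environ if env is None else env
    path = env.get("TempFolders")
    if not path:
        return "TempFolders is not set"
    if not os.path.isdir(path):
        return f"TempFolders points to a missing folder: {path}"
    if not os.access(path, os.W_OK):
        return f"TempFolders folder is not writable: {path}"
    return f"TempFolders is set to a writable folder: {path}"


# Check the variable as seen by the current session.
print(check_temp_folders())

# Demonstration with a temporary scratch folder instead of the real variable.
with tempfile.TemporaryDirectory() as scratch:
    print(check_temp_folders({"TempFolders": scratch}))
```

If the check reports that the variable is not set right after you created it, restarting the session or the machine, as noted above, may be needed before the new value is picked up.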

Maximizing performance with SSD

You can improve performance by using solid-state drives (SSD) in your computer. Maximum performance is usually gained by having the input data, the output being generated, and the temporary data all on an SSD rather than on traditional hard disk drives (HDD). However, because SSDs are relatively expensive and generally have less capacity, you can still gain a significant portion of the performance advantage by keeping the input data on the HDD and using an SSD for the TempFolders location only.
