Create Space Time Cube (GeoAnalytics)

Summary

Summarizes a set of points into a netCDF data structure by aggregating them into space-time bins. Within each bin, the points are counted, and specified attributes are aggregated. For all bin locations, the trends in counts and summary field values are evaluated.
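
As a rough illustration of the aggregation described above (not the tool's actual implementation), the following plain-Python sketch bins points into square space-time bins and keeps a count and a sum per bin. The function name bin_points, the tuple layout, and anchoring the grid at the minimum x, y, and time stamp are assumptions made for this example only.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from math import floor


def bin_points(points, distance_interval, time_step_interval):
    """Aggregate (x, y, timestamp, value) tuples into space-time bins.

    Hypothetical helper for illustration: bins are squares of side
    distance_interval (in the units of x and y), anchored at the minimum
    x, y, and timestamp of the input.
    Returns {(col, row, time_step): {"count": n, "sum": s}}.
    """
    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)
    min_t = min(p[2] for p in points)

    bins = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for x, y, t, value in points:
        col = floor((x - min_x) / distance_interval)   # x index of the location
        row = floor((y - min_y) / distance_interval)   # y index of the location
        step = (t - min_t) // time_step_interval       # time-step index (int)
        cell = bins[(col, row, step)]
        cell["count"] += 1
        cell["sum"] += value
    return dict(bins)


points = [
    (100.0, 200.0, datetime(2020, 1, 1), 5.0),
    (450.0, 300.0, datetime(2020, 1, 3), 7.0),   # same bin as the first point
    (1500.0, 200.0, datetime(2020, 1, 2), 1.0),  # different location, same week
    (120.0, 250.0, datetime(2020, 1, 9), 2.0),   # first location, next week
]
cube = bin_points(points, distance_interval=1000.0,
                  time_step_interval=timedelta(weeks=1))
for key in sorted(cube):
    print(key, cube[key])
```

Here the (col, row) pair plays the role of a location ID and the third key element plays the role of a time-step ID; all bins sharing a (col, row) pair form the time series for that location.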

Illustration

Space Time Cube Creation

Usage

  • This tool aggregates Point Layer features into space-time bins. The data structure it creates can be thought of as a three-dimensional cube composed of space-time bins with the x and y dimensions representing space and the t dimension representing time.

    Space-time bins in a three-dimensional cube

  • Every bin has a fixed position in space (x, y) and in time (t). Bins covering the same (x, y) area share the same location ID. Bins encompassing the same duration share the same time-step ID. Because the cube is always rectangular even if the point data is not, some locations will have point counts of zero for all time steps. For many analyses, only locations with data (those with a point count greater than zero for at least one time step) will be included in the analysis.

    Locations in the space-time cube

  • Each bin in the space-time cube has LOCATION_ID, time_step_ID, and COUNT field values, and values for any Summary Fields that were aggregated when the cube was created. Bins associated with the same physical location will share the same location ID and together will represent a time series. Bins associated with the same time-step interval will share the same time-step ID and together will comprise a time slice. The count value for each bin reflects the number of points that occurred at the associated location within the associated time-step interval.

  • The Point Layer parameter must be points, such as crime or fire events, disease incidents, customer sales data, or traffic accidents. Each point must have a date associated with it. The tool requires a minimum of 60 points and a variety of time stamps. The tool will fail if the parameters specified result in a cube with more than two billion bins.

  • This tool requires projected data to accurately measure distances.

  • Output from this tool is a netCDF representation of the input points. The resulting space time cube will be downloaded directly to the machine on which you run the analysis. The location will be specified in the tool messages.

  • It is not uncommon for a dataset to have a regularly spaced temporal distribution. For instance, you might have yearly data that all falls on January 1 of each year, or monthly data that is all time stamped the first of each month. This kind of data is often referred to as panel data. With panel data, temporal bias calculations will often show high percentages. This is to be expected, as each bin will only cover one particular time unit in the given time step. For instance, if you chose a 1-year Time Interval and your data fell on January 1 of each year, each bin would only cover one day out of the year. This is perfectly acceptable since it applies to each bin. Temporal bias becomes an issue when it is only present for certain bins due to bin creation parameters rather than true data distribution. It is important to evaluate the temporal bias in terms of the expected coverage in each bin based on the distribution of the data.

  • The temporal bias in the output report is calculated as the percentage of the time span that has no data present. For example, an empty bin would have 100 percent temporal bias. A bin with a 1-month time span and an end Time Interval Alignment that only has data for the second 2 weeks of the first time step would have a 50 percent first time-step temporal bias. A bin with a 1-month time span and a start Time Interval Alignment that only has data for the first 2 weeks of the last time step would have a 50 percent last time-step temporal bias.

  • Once you create a space-time cube, the spatial extent of the cube can never be extended.

  • The Reference Time parameter can be a date and time value or solely a date value; it cannot be solely a time value.

  • Use a Distance Interval that makes sense for your analysis. Find the balance between a distance interval that is so large the underlying patterns in your point data are lost, and a distance interval that is so small the cube is filled with zero counts.

  • The trend analysis performed on the aggregated count data and summary field values is based on the Mann-Kendall statistic.

  • The following statistical operations are available for the aggregation of attributes with this tool: sum, mean, minimum, maximum, and standard deviation.

  • When filling empty bins with SPATIAL_NEIGHBORS, Queen's case contiguity (contiguity based on edges and corners) of the 2nd order (includes neighbors and neighbors of neighbors) is used. A minimum of 4 spatial neighbors is required to fill the empty bin using this option.

  • When filling empty bins with SPACE_TIME_NEIGHBORS, Queen's case contiguity (contiguity based on edges and corners) of the 2nd order (includes neighbors and neighbors of neighbors) is used. Additionally, temporal neighbors are used for each of those bins found to be spatial neighbors by going backward and forward one time step. A minimum of 13 space-time neighbors is required to fill the empty bin using this option.

  • When filling empty bins with TEMPORAL_TREND, the first two time periods and last two time periods at a given location must have values in their bins to interpolate values at other time periods for that location.

  • Null values present in any of the summary field records will result in those features being excluded from analysis. If the count of points in each bin is part of your analysis strategy, consider creating separate cubes, one for the count (without Summary Fields) and one for Summary Fields. If the set of null values is different for each summary field, consider creating a separate cube for each summary field.

  • This geoprocessing tool is powered by ArcGIS GeoAnalytics Server. Analysis is completed on your GeoAnalytics Server, and results are stored in your content in ArcGIS Enterprise.

  • When running GeoAnalytics Server tools, the analysis is completed on the GeoAnalytics Server. For optimal performance, make data available to the GeoAnalytics Server through feature layers hosted on your ArcGIS Enterprise portal or through big data file shares. Data that is not local to your GeoAnalytics Server will be moved to your GeoAnalytics Server before analysis begins. This means that it will take longer to run a tool, and in some cases, moving the data from ArcGIS Pro to your GeoAnalytics Server may fail. The threshold for failure depends on your network speeds, as well as the size and complexity of the data. Therefore, it is recommended that you always share your data or create a big data file share.

    Learn more about sharing data to your portal

    Learn more about creating a big data file share through Server Manager

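
The temporal bias percentages described in the usage notes above can be approximated with a short calculation. This is a minimal sketch under the assumption that bias is the share of a time step's span lying before the earliest event (first time step) or after the latest event (last time step); the function names are hypothetical and the tool's exact calculation may differ.

```python
from datetime import datetime


def first_step_bias(bin_start, bin_end, first_event):
    """Percent of the first time step that precedes the earliest event."""
    span = (bin_end - bin_start).total_seconds()
    empty = (first_event - bin_start).total_seconds()
    return 100.0 * empty / span


def last_step_bias(bin_start, bin_end, last_event):
    """Percent of the last time step that follows the latest event."""
    span = (bin_end - bin_start).total_seconds()
    empty = (bin_end - last_event).total_seconds()
    return 100.0 * empty / span


# A 28-day time step (February 2021) in which data covers only half the span.
step_start = datetime(2021, 2, 1)
step_end = datetime(2021, 3, 1)

# Data begins two weeks into the first time step.
first_bias = first_step_bias(step_start, step_end, datetime(2021, 2, 15))
# Data stops two weeks before the end of the last time step.
last_bias = last_step_bias(step_start, step_end, datetime(2021, 2, 15))

print(first_bias, last_bias)  # 50.0 50.0
```

With panel data stamped on a single day of each time step, this calculation returns a high percentage for every bin, which matches the expectation stated in the usage notes.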

Syntax

arcpy.geoanalytics.CreateSpaceTimeCube(point_layer, output_name, distance_interval, time_step_interval, {time_step_interval_alignment}, {reference_time}, {summary_fields})
Parameter | Explanation | Data Type
point_layer

The input point feature class that will be aggregated into space-time bins.

Feature Set
output_name

The output netCDF data cube that will be created to contain counts and summaries of the input feature point data.

String
distance_interval

The distance that will determine the bin size.

The size of the bins will be used to aggregate the point_layer. All points that fall within the same distance_interval and time_step_interval will be aggregated.

Linear Unit
time_step_interval

The number of seconds, minutes, hours, days, weeks, months, or years that will represent a single time step. All points within the same time_step_interval and distance_interval will be aggregated. Examples of valid entries for this parameter are 1 Weeks, 13 Days, or 1 Months.

Time Unit
time_step_interval_alignment
(Optional)

Specifies how aggregation will occur based on the Time Interval (time_step_interval in Python) parameter.

  • END_TIME: Time steps will align to the last time event and aggregate back in time.
  • START_TIME: Time steps will align to the first time event and aggregate forward in time.
  • REFERENCE_TIME: Time steps will align to a specified date or time. If all points in the input features have a time stamp later than the specified reference time (or it falls exactly on the start time of the input features), the time-step interval will begin with that reference time and aggregate forward in time (as occurs with the Start time alignment). If all points in the input features have a time stamp earlier than the specified reference time (or it falls exactly on the end time of the input features), the time-step interval will end with that reference time and aggregate backward in time (as occurs with the End time alignment). If the specified reference time is in the middle of the time extent of the data, a time-step interval will be created ending with the reference time provided (as occurs with the End time alignment); additional intervals will be created both before and after the reference time until the full time extent of the data is covered.
String
reference_time
(Optional)

The date or time that will be used to align the time-step intervals. For example, to bin the data weekly, Monday to Sunday, set a reference time of Sunday at midnight to ensure that bins break between Sunday and Monday at midnight.

Date
summary_fields
[summary_fields,...]
(Optional)

The numeric field containing attribute values that will be used to calculate the specified statistic when aggregating into a space time cube. Multiple statistic and field combinations can be specified. Null values are excluded from all statistical calculations.

Available statistic types are the following:

  • Sum: Adds the total value for the specified field within each bin.
  • Mean: Calculates the average for the specified field within each bin.
  • Minimum: Finds the smallest value for all records of the specified field within each bin.
  • Maximum: Finds the largest value for all records of the specified field within each bin.
  • Standard deviation: Finds the standard deviation of values in the specified field within each bin.

Available fill types are the following:

  • Zeros: Fills empty bins with zeros.
  • Spatial_Neighbors: Fills empty bins with the average value of spatial neighbors.
  • Space_Time_Neighbors: Fills empty bins with the average value of space-time neighbors.
  • Temporal_Trend: Fills empty bins using an interpolated univariate spline algorithm.

Note:
Null values present in any of the summary fields will result in those features being excluded from the analysis. If the count of points in each bin is part of your analysis strategy, consider creating separate cubes, one for the count (without summary fields) and one for summary fields. If the set of null values is different for each summary field, consider creating a separate cube for each summary field.

Value Table

Derived Output

Name | Explanation | Data Type
output

The aggregated space time cube.

File
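
The trend evaluated for each location in the cube is based on the Mann-Kendall statistic. The sketch below shows the textbook form of that statistic (S, with a tie-corrected variance and a continuity-corrected z-score) applied to one location's time series of counts. The function name mann_kendall is hypothetical, and how the tool handles ties or significance thresholds internally is not stated in this topic, so treat this as an illustration rather than the tool's exact procedure.

```python
from collections import Counter
from math import sqrt


def mann_kendall(series):
    """Return (S, z) for a time-ordered sequence of numbers."""
    n = len(series)

    # S: number of increasing pairs minus number of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S, reduced for groups of tied values.
    tie_sizes = Counter(series).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18.0

    # Continuity-corrected z-score; zero when there is no net trend.
    if s > 0:
        z = (s - 1) / sqrt(var_s)
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    return s, z


# Counts for one location across eight time steps (made-up values).
counts = [2, 3, 5, 4, 7, 9, 8, 12]
s_stat, z_score = mann_kendall(counts)
print(s_stat, round(z_score, 3))
```

A positive z-score indicates an upward trend in the bin values over time and a negative z-score a downward trend; a series with no change yields S = 0 and z = 0.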

Code sample

CreateSpaceTimeCube (Python window)

The following Python window script demonstrates how to use the CreateSpaceTimeCube tool.

#-------------------------------------------------------------------------------
# Name: CreateSpaceTimeCube.py
# Description: Create a cube representing the counts of Crimes

# Requirements: ArcGIS GeoAnalytics Server

# Import system modules
import arcpy

# Set local variables
inFeatures = "https://MyGeoAnalyticsMachine.domain.com/geoanalytics/rest/services/DataStoreCatalogs/bigDataFileShares_Crimes/BigDataCatalogServer/Chicago"
outCube = "CrimeCube.nc"

# Execute Create Space Time Cube
arcpy.geoanalytics.CreateSpaceTimeCube(inFeatures, outCube, "1 Kilometers", 
                                       "1 Weeks", "START_TIME")

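The sample above uses the START_TIME alignment. To show how that choice differs from END_TIME, the following plain-Python sketch builds the time-step boundaries for both alignments. The function name time_step_edges is hypothetical, and which boundary of each interval is treated as inclusive is an assumption of this sketch, not something stated in this topic.

```python
from datetime import datetime, timedelta


def time_step_edges(first_event, last_event, interval, alignment="END_TIME"):
    """Return a time-ordered list of (start, end) tuples covering the data."""
    edges = []
    if alignment == "START_TIME":
        # Anchor on the first event and step forward in time.
        start = first_event
        while start <= last_event:
            edges.append((start, start + interval))
            start += interval
    elif alignment == "END_TIME":
        # Anchor on the last event and step backward in time.
        end = last_event
        while end >= first_event:
            edges.append((end - interval, end))
            end -= interval
        edges.reverse()
    else:
        raise ValueError("alignment must be START_TIME or END_TIME")
    return edges


first = datetime(2020, 1, 1)
last = datetime(2020, 1, 20)
week = timedelta(weeks=1)

start_edges = time_step_edges(first, last, week, "START_TIME")
end_edges = time_step_edges(first, last, week, "END_TIME")

for label, edges in (("START_TIME", start_edges), ("END_TIME", end_edges)):
    print(label)
    for start, end in edges:
        print("  ", start.date(), "to", end.date())
```

With START_TIME the last time step extends past the final event, and with END_TIME the first time step begins before the earliest event. The partially covered step is the one where temporal bias appears.
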
Environments

Output Coordinate System

The coordinate system that will be used for analysis. Analysis will be completed in the input coordinate system unless otherwise specified by this parameter. For GeoAnalytics Tools, final results will be stored in the spatiotemporal data store in WGS84.

Licensing information

  • Basic: Requires ArcGIS GeoAnalytics Server
  • Standard: Requires ArcGIS GeoAnalytics Server
  • Advanced: Requires ArcGIS GeoAnalytics Server
