Train Time Series Forecasting Model (GeoAI)

Summary

Trains a deep learning-based time series forecasting model using time series data from a space-time cube. The trained model can be used for forecasting the values of each location of a space-time cube using the Forecast Using Time Series Model tool.

Time series data can follow various trends and have multiple levels of seasonality. Traditional time series forecasting models based on statistical approaches perform differently depending on the trend and patterns of seasonality in the data. Deep learning-based models have a high capacity to learn and can perform well across different kinds of time series, provided there is enough training data.

This tool trains time series forecasting models using various deep learning-based models, such as Fully Connected Network (FCN), Long Short-Term Memory (LSTM), InceptionTime, ResNet, and ResCNN. These models support multivariate time series, in which the model learns from more than one time-dependent variable to forecast future values. The trained model is saved as a deep learning package file (.dlpk) and can be used for forecasting future values using the Forecast Using Time Series Model tool.

Learn more about how Time Series Forecasting Models work
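
To make the idea of learning from past values concrete, the following sketch shows one common way a multivariate series can be split into training samples: each sample pairs a window of consecutive time steps with the value of the analysis variable at the next step. This is only an illustration of the general technique, not the tool's internal data preparation. The function name make_windows and the sample values are hypothetical.

```python
def make_windows(rows, sequence_length):
    """Pair each window of consecutive time steps with the next target value.

    rows: list of per-time-step tuples, with the analysis variable first
          and any explanatory variables after it.
    """
    if sequence_length >= len(rows):
        raise ValueError("sequence_length must be smaller than the number of time steps")
    samples = []
    for start in range(len(rows) - sequence_length):
        window = rows[start:start + sequence_length]   # model input
        target = rows[start + sequence_length][0]      # next analysis value
        samples.append((window, target))
    return samples


# Hypothetical data: 8 time steps of (analysis variable, explanatory variable).
rows = [(float(t), float(t * 10)) for t in range(8)]
samples = make_windows(rows, sequence_length=3)
print(len(samples))  # 5
```

With 8 time steps and a window of 3, five samples are produced. A longer window gives the model more context per sample but leaves fewer samples to learn from.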

Usage

  • You must install the proper deep learning framework for Python in ArcGIS Pro.

    Learn how to install deep learning frameworks for ArcGIS

  • This tool accepts netCDF data created by the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, Create Space Time Cube From Multidimensional Raster Layer, and Subset Space Time Cube tools.

  • Compared to other forecasting tools in the Time Series Forecasting toolset, this tool uses deep learning-based time series forecasting models. Deep learning models have a high capacity to learn and are appropriate for time series that follow complex trends and are difficult to model with simple mathematical functions. However, they require a larger volume of training data to learn such complex trends and use more computational resources for training and inference. A GPU is recommended for using this tool.

  • To run this tool using a GPU, set the Processor Type environment to GPU. If you have more than one GPU, specify the GPU ID environment instead.

  • This tool can be used to model both univariate and multivariate time series. If the space-time cube has other variables that are related to the variable being forecast, they can be included as explanatory variables to improve the forecast.

  • Rather than building an independent forecast model at each location of the space-time cube, this tool trains a single global forecast model that uses training data from each location. This global model will be used to forecast future values at every location using the Forecast Using Time Series Model tool.

  • The Output Feature Class parameter value will be added to the Contents pane with rendering based on the final forecasted time step.

  • Use cases for this tool include training a model to predict demand for retail products based on historical sales data, training a model to predict the spread of diseases, and training a model to predict wind power generation based on historical production and weather data.

  • Deciding how many time steps to exclude for validation is an important choice. The more time steps that are excluded, the fewer time steps remain to train the model. If too few time steps are excluded, the validation RMSE will be estimated using a small amount of data and may be misleading. Exclude as many time steps as possible while maintaining sufficient time steps to train the model. If the space-time cube has enough time steps to support it, withhold at least as many time steps for validation as the number of time steps you intend to forecast.

  • For information about requirements for running this tool and issues you may encounter, see Deep Learning frequently asked questions.
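
The limits on validation time steps and sequence length described on this page can be checked before running the tool. The sketch below is a hypothetical helper (check_validation_split is not part of arcpy) that applies the three rules stated here: the validation steps cannot exceed 25 percent of the input time steps, the sequence length cannot exceed the steps left after validation, and the validation steps should ideally cover the intended forecast horizon.

```python
def check_validation_split(total_steps, validation_steps, sequence_length,
                           forecast_steps=None):
    """Return a list of problems with the proposed settings (empty if none)."""
    problems = []
    if validation_steps > 0.25 * total_steps:
        problems.append("validation steps exceed 25 percent of the input time steps")
    if sequence_length > total_steps - validation_steps:
        problems.append("sequence length exceeds the time steps left after validation")
    # This last rule is a recommendation on this page, not a hard limit.
    if forecast_steps is not None and validation_steps < forecast_steps:
        problems.append("fewer validation steps than the intended forecast horizon")
    return problems


# Hypothetical cube with 120 monthly time steps and a 12-month forecast.
print(check_validation_split(120, 12, 12, forecast_steps=12))  # []
```

An empty list means the settings satisfy all three rules. The time step counts used here are made up for illustration.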

Parameters

Label | Explanation | Data Type
Input Time Series Data

The netCDF cube containing the variable that will be used to forecast to future time steps. This file must have an .nc file extension and must have been created using the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, Create Space Time Cube From Multidimensional Raster Layer, or Subset Space Time Cube tool.

File
Output Model

The output folder location that will store the trained model. The trained model will be saved as a deep learning package file (.dlpk).

Folder
Analysis Variable

The numeric variable in the dataset that will be forecasted to future time steps.

String
Sequence Length

The number of previous time steps that will be used when training the model. If the data contains seasonality (repeating cycles), provide the length corresponding to one season. The parameter value cannot be larger than the total number of input time steps that remain after excluding validation time steps.

Long
Explanatory Training Variables
(Optional)

Independent variables from the data that will be used to train the model. Check the Categorical check box for any variables that represent classes or categories.

Value Table
Max Epochs
(Optional)

The maximum number of epochs for which the model will be trained. The default is 20.

Long
Number Of Time Steps to Exclude for Validation
(Optional)

The number of time steps that will be excluded for validation. If a value of 14 is specified, the last 14 rows in the data frame will be used as validation data. This value cannot be larger than 25 percent of the number of input time steps. The default is 2.

Long
Model Type
(Optional)

Specifies the model architecture that will be used for training the model.

  • InceptionTime
  • ResNet
  • ResCNN
  • FCN
  • LSTM

The default model type is InceptionTime.

Learn more about how time series forecasting models work

String
Batch Size
(Optional)

The number of samples that will be processed at one time. The default is 64.

Depending on the computer's GPU, this number can be changed to 8, 16, 32, 64, and so on.

Long
Model Arguments
(Optional)

Additional arguments that are specific to each model type. These arguments can be used to adjust the model complexity and size. See How Time Series forecasting models work to understand the model architecture, the supported model arguments, and their default values.

Value Table
Stop training when model no longer improves
(Optional)

Specifies whether the model training will stop when validation loss does not register improvement after five consecutive epochs.

  • Checked—The model training will stop when validation loss does not register improvement after five consecutive epochs. This is the default.

  • Unchecked—The model training will continue until the maximum number of epochs has been reached.

Boolean
Output Feature Class
(Optional)

The output feature class of all locations in the space-time cube with forecasted values stored as fields. The feature class will be created using prediction of the trained model on the validation dataset. The output displays the forecast for the final time step and contains pop-up charts showing the time series forecast on the validation set.

Feature Class
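
The interaction between Max Epochs and the Stop training when model no longer improves option can be sketched as follows. This is a simplified illustration of early stopping with a patience of five epochs, assuming that any decrease in validation loss counts as an improvement. The tool's exact criterion is not documented on this page, and the function name and loss values are hypothetical.

```python
def epochs_run(validation_losses, max_epochs=20, early_stopping=True, patience=5):
    """Return how many epochs complete before training stops."""
    best = float("inf")
    epochs_without_improvement = 0
    completed = 0
    for loss in validation_losses[:max_epochs]:
        completed += 1
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if early_stopping and epochs_without_improvement >= patience:
            break  # no improvement for `patience` consecutive epochs
    return completed


# Hypothetical losses: three improving epochs, then a plateau.
losses = [1.0, 0.8, 0.7] + [0.75] * 17
print(epochs_run(losses))                        # 8
print(epochs_run(losses, early_stopping=False))  # 20
```

With early stopping, training ends after the fifth consecutive epoch without improvement. Without it, training runs for the full 20 epochs.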

arcpy.geoai.TrainTimeSeriesForecastingModel(in_cube, out_model, analysis_variable, sequence_length, {explanatory_variables}, {max_epochs}, {validation_timesteps}, {model_type}, {batch_size}, {arguments}, {early_stopping}, {out_features})
Name | Explanation | Data Type
in_cube

The netCDF cube containing the variable that will be used to forecast to future time steps. This file must have an .nc file extension and must have been created using the Create Space Time Cube By Aggregating Points, Create Space Time Cube From Defined Locations, Create Space Time Cube From Multidimensional Raster Layer, or Subset Space Time Cube tool.

File
out_model

The output folder location that will store the trained model. The trained model will be saved as a deep learning package file (.dlpk).

Folder
analysis_variable

The numeric variable in the dataset that will be forecasted to future time steps.

String
sequence_length

The number of previous time steps that will be used when training the model. If the data contains seasonality (repeating cycles), provide the length corresponding to one season. The parameter value cannot be larger than the total number of input time steps that remain after excluding validation time steps.

Long
explanatory_variables
[explanatory_variables,...]
(Optional)

Independent variables from the data that will be used to train the model. Use a True value after any variables that represent classes or categories.

Value Table
max_epochs
(Optional)

The maximum number of epochs for which the model will be trained. The default is 20.

Long
validation_timesteps
(Optional)

The number of time steps that will be excluded for validation. If a value of 14 is specified, the last 14 rows in the data frame will be used as validation data. This value cannot be larger than 25 percent of the number of input time steps. The default is 2.

Long
model_type
(Optional)

Specifies the model architecture that will be used for training the model.

  • InceptionTime
  • ResNet
  • ResCNN
  • FCN
  • LSTM

The default model type is InceptionTime.

String
batch_size
(Optional)

The number of samples that will be processed at one time. The default is 64.

Depending on the computer's GPU, this number can be changed to 8, 16, 32, 64, and so on.

Long
arguments
[arguments,...]
(Optional)

Additional arguments that are specific to each model type. These arguments can be used to adjust the model complexity and size. See How Time Series forecasting models work to understand the model architecture, the supported model arguments, and their default values.

Value Table
early_stopping
(Optional)

Specifies whether the model training will stop when validation loss does not register improvement after five consecutive epochs.

  • TRUE: The model training will stop when validation loss does not register improvement after five consecutive epochs. This is the default.
  • FALSE: The model training will continue until the maximum number of epochs has been reached.

Boolean
out_features
(Optional)

The output feature class of all locations in the space-time cube with forecasted values stored as fields. The feature class will be created using prediction of the trained model on the validation dataset. The output displays the forecast for the final time step and contains pop-up charts showing the time series forecast on the validation set.

Feature Class
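
The explanatory_variables parameter is a value table in which each row names a variable and flags whether it is categorical. The sketch below builds such a table as a list of [name, flag] rows. Treat the exact encoding as an assumption: the row layout and the use of Python booleans are not confirmed on this page, and the helper name and variable names are hypothetical.

```python
def build_explanatory_variables(continuous=(), categorical=()):
    """Build value table rows of [variable_name, is_categorical].

    Returns None when no variables are given, matching the None used for
    an omitted optional parameter.
    """
    rows = [[name, False] for name in continuous]
    rows += [[name, True] for name in categorical]
    return rows or None


# Hypothetical variables stored in the space-time cube.
explanatory_variables = build_explanatory_variables(
    continuous=["TEMPERATURE"],
    categorical=["HOLIDAY"],
)
print(explanatory_variables)  # [['TEMPERATURE', False], ['HOLIDAY', True]]
```

The resulting list could then be passed as the explanatory_variables argument, provided the variables exist in the input cube.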

Code sample

TrainTimeSeriesForecastingModel (stand-alone script)

This example shows how to use the TrainTimeSeriesForecastingModel function.


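# Name: TrainTimeSeriesForecastingModel.py
# Description: Train a time series forecasting model on space-time cube
# data using one of the supported deep learning architectures.

# Import system modules
import arcpy
import os

# Set local variables (placeholder paths; replace with real locations)
datapath = "path_to_data_for_forecasting"
out_path = "path_to_gdb_for_forecasting"

model_path = os.path.join(out_path, "model")
# The input space-time cube must be a netCDF file with an .nc extension
in_cube = os.path.join(datapath, "test_data.nc")
out_features = os.path.join(out_path, "forecasted_feature.gdb", "forecasted")

# Run TrainTimeSeriesForecastingModel
arcpy.geoai.TrainTimeSeriesForecastingModel(
    in_cube,
    model_path,
    "CONSUMPTION",    # analysis_variable
    12,               # sequence_length
    None,             # explanatory_variables
    20,               # max_epochs
    2,                # validation_timesteps
    "InceptionTime",  # model_type
    64,               # batch_size
    None,             # arguments
    True,             # early_stopping
    out_features,
)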
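
To train on a GPU from a script, the Processor Type and GPU ID environments mentioned under Usage can be set before the tool runs. The sketch below assumes these correspond to the arcpy.env.processorType and arcpy.env.gpuId properties and that setting both is acceptable on a machine with several GPUs. The gpu_settings helper is hypothetical, and the import is guarded so the snippet also runs where arcpy is not installed.

```python
try:
    import arcpy  # available only in an ArcGIS Pro Python environment
except ImportError:
    arcpy = None


def gpu_settings(gpu_id=None):
    """Return environment settings for GPU training as a plain dictionary."""
    settings = {"processorType": "GPU"}
    if gpu_id is not None:
        settings["gpuId"] = gpu_id  # choose a specific GPU on multi-GPU machines
    return settings


settings = gpu_settings(gpu_id=0)
if arcpy is not None:
    # Apply the settings before calling TrainTimeSeriesForecastingModel.
    for name, value in settings.items():
        setattr(arcpy.env, name, value)
print(settings)  # {'processorType': 'GPU', 'gpuId': 0}
```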

Licensing information

  • Basic: No
  • Standard: No
  • Advanced: Yes
