AutoTSEstimator (experimental)

Automated estimator for the time series forecasting task. AutoTSEstimator will replace AutoTSTrainer in a later version.

class zoo.chronos.autots.experimental.autotsestimator.AutoTSEstimator(model='lstm', search_space={}, metric='mse', loss=None, optimizer='Adam', past_seq_len=2, future_seq_len=1, input_feature_num=None, output_target_num=None, selected_features='auto', backend='torch', logs_dir='/tmp/autots_estimator', cpus_per_trial=1, name='autots_estimator')[source]

Bases: object

Automated estimator for the time series forecasting task. It supports TSDataset and customized data creators as data input, built-in models (only “lstm” and “tcn” for now), and 3rd party models.

Only backend=”torch” is supported for now. Customized data creators are not yet fully supported by TSPipeline.

>>> # Here is a use case example:
>>> # prepare train/valid/test tsdataset
>>> autoest = AutoTSEstimator(model="lstm",
>>>                           search_space=search_space,
>>>                           past_seq_len=6,
>>>                           future_seq_len=1)
>>> tsppl = autoest.fit(data=tsdata_train,
>>>                     validation_data=tsdata_valid)
>>> tsppl.predict(tsdata_test)
>>> tsppl.save("my_tsppl")

AutoTSEstimator trains a model for time series forecasting. Users can choose one of the built-in models, or pass in a customized PyTorch or Keras model for tuning using AutoML.

Parameters
  • model – a string or a model creation function. A string indicates a built-in model; currently “lstm” and “tcn” are supported. A model creation function indicates a 3rd party model: the function should take a config param and return a torch.nn.Module (backend=”torch”) / tf model (backend=”keras”). If you use chronos.data.TSDataset as data input, the 3rd party model should take 3-dim input (num_sample, past_seq_len, input_feature_num), produce 3-dim output (num_sample, future_seq_len, output_feature_num), and read its hyperparameters from the config with the same keys as in search_space (see the sketch after this parameter list). If you use a customized data creator, the output of the data creator should fit the input of the model creation function.

  • search_space – hyperparameter configurations. Read the API docs of each auto model for details. Some common hyperparameters can be explicitly set as named parameters; search_space should contain, as its keys, the tunable parameters other than the keyword arguments of this constructor.

  • metric – String. The evaluation metric name to optimize. e.g. “mse”

  • loss – String or pytorch/tf.keras loss instance or pytorch loss creator function. The default loss function for pytorch backend is nn.MSELoss().

  • optimizer – String or PyTorch optimizer creator function or tf.keras optimizer instance.

  • past_seq_len – Int or hp sampling function. The number of historical steps (i.e. lookback) used for forecasting. For hp sampling, see zoo.orca.automl.hp for more details. The value defaults to 2.

  • future_seq_len – Int. The number of future steps to forecast. The value defaults to 1.

  • input_feature_num – Int. The number of features in the input. The value is ignored if you use chronos.data.TSDataset as input data type.

  • output_target_num – Int. The number of targets in the output. The value is ignored if you use chronos.data.TSDataset as input data type.

  • selected_features – String. “all” and “auto” are supported for now. For “all”, all features that are generated are used for each trial. For “auto”, a subset is sampled randomly from all features for each trial. The parameter is ignored if not using chronos.data.TSDataset as input data type. The value defaults to “auto”.

  • backend – The backend of the auto model. We only support backend as “torch” for now.

  • logs_dir – Local directory to save logs and results. It defaults to “/tmp/autots_estimator”

  • cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.

  • name – name of the autots estimator. It defaults to “autots_estimator”.
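
For illustration, a 3rd party model creation function might look like the minimal sketch below. It assumes a TSDataset input with past_seq_len=6, input_feature_num=4, future_seq_len=1 and output_target_num=1; “hidden_dim” is a hypothetical hyperparameter key that must also appear in search_space:

>>> import torch
>>> from zoo.orca.automl import hp
>>> def model_creator(config):
>>>     # maps (num_sample, 6, 4) to (num_sample, 1, 1)
>>>     class Net(torch.nn.Module):
>>>         def __init__(self):
>>>             super().__init__()
>>>             self.body = torch.nn.Sequential(
>>>                 torch.nn.Linear(6 * 4, config["hidden_dim"]),
>>>                 torch.nn.ReLU(),
>>>                 torch.nn.Linear(config["hidden_dim"], 1 * 1))
>>>         def forward(self, x):
>>>             return self.body(x.reshape(x.size(0), -1)).reshape(-1, 1, 1)
>>>     return Net()
>>> autoest = AutoTSEstimator(model=model_creator,
>>>                           search_space={"hidden_dim": hp.choice([16, 32, 64])},
>>>                           past_seq_len=6,
>>>                           future_seq_len=1)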

fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]

Fit using AutoEstimator.

Parameters
  • data – train data. For backend of “torch”, data can be a TSDataset or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader (see the sketch below). For backend of “keras”, data can be a TSDataset.

  • epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.

  • batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.

  • validation_data – Validation data. Validation data type should be the same as data.

  • metric_threshold – a trial will be terminated when the metric threshold is met.

  • n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.

  • search_alg – str, one of the searchers supported by ray tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”).

  • search_alg_params – extra parameters for the search algorithm besides search_space, metric and the searcher mode.

  • scheduler – str, one of the schedulers supported by ray tune.

  • scheduler_params – parameters for the scheduler.

Returns

a TSPipeline with the best model.
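
For backend “torch”, a customized data creator might look like the sketch below. x_train and y_train are placeholder ndarrays, valid_data_creator is an analogous function, and the sketch assumes batch_size is available in the trial config; note the caveat above that TSPipeline does not yet fully support data creators:

>>> import torch
>>> from torch.utils.data import TensorDataset, DataLoader
>>> def train_data_creator(config):
>>>     # takes the trial config dictionary and returns a PyTorch DataLoader
>>>     dataset = TensorDataset(torch.from_numpy(x_train).float(),
>>>                             torch.from_numpy(y_train).float())
>>>     return DataLoader(dataset, batch_size=config["batch_size"], shuffle=True)
>>> tsppl = autoest.fit(data=train_data_creator,
>>>                     validation_data=valid_data_creator,
>>>                     epochs=2)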

get_best_config()[source]

Get the best configuration.

Returns

A dictionary of best hyper parameters
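
For example, to inspect the selected hyperparameters after fit (the keys depend on the chosen model and search_space):

>>> best_config = autoest.get_best_config()
>>> print(best_config)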

TSPipeline (experimental)

TSPipeline is an E2E solution for the time series forecasting task. This TSPipeline will replace the original TSPipeline returned by AutoTSTrainer in a later version.

class zoo.chronos.autots.experimental.tspipeline.TSPipeline(best_model, best_config, **kwargs)[source]

Bases: object

TSPipeline is an E2E solution for time series analysis (only the forecasting task for now). You can use TSPipeline to:

  1. Further develop the prototype (predict, evaluate, incremental fit).

  2. Deploy the model to your scenario (save, load), as sketched below.
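
A sketch of both usages, assuming tsppl is the TSPipeline returned by AutoTSEstimator.fit and tsdata_test / tsdata_new are TSDatasets processed with the same operations as the training data:

>>> tsppl.evaluate(tsdata_test, metrics=["mse"])  # prototype: evaluate
>>> tsppl.predict(tsdata_test)                    # prototype: predict
>>> tsppl.fit(tsdata_new, epochs=1)               # prototype: incremental fit
>>> tsppl.save("my_tsppl")                        # deployment: save
>>> tsppl = TSPipeline.load("my_tsppl")           # deployment: load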

evaluate(data, metrics=['mse'], multioutput='uniform_average', batch_size=32)[source]

Evaluate the time series pipeline.

Parameters
  • data – data can be a TSDataset or a data creator (to be supported in a later version). The TSDataset should go through the same processing as the training TSDataset used in AutoTSEstimator.fit.

  • metrics – list. The evaluation metric names. e.g. [“mse”]

  • multioutput – Defines aggregating of multiple output values. String in [‘raw_values’, ‘uniform_average’]. The value defaults to ‘uniform_average’.

  • batch_size – evaluation batch size. A smaller batch_size costs more time but less memory. The param is only effective when data is a TSDataset. The value defaults to 32.

predict(data, batch_size=32)[source]

Rolling predict with time series pipeline.

Parameters
  • data – data can be a TSDataset or a data creator (to be supported in a later version). The TSDataset should go through the same processing as the training TSDataset used in AutoTSEstimator.fit.

  • batch_size – prediction batch size. A smaller batch_size costs more time but less memory. The param is only effective when data is a TSDataset. The value defaults to 32.

fit(data, validation_data=None, epochs=1, metric='mse')[source]

Incrementally fit the time series pipeline.

Parameters
  • data – data can be a TSDataset or a data creator (to be supported in a later version). The TSDataset should go through the same processing as the training TSDataset used in AutoTSEstimator.fit.

  • validation_data – validation data, same format as data.

  • epochs – incremental fitting epoch. The value defaults to 1.

  • metric – evaluation metric name. e.g. “mse”

save(file_path)[source]

Save the TSPipeline to a folder.

Parameters

file_path – the folder location to save the pipeline

static load(file_path)[source]

Load the TSPipeline from a folder.

Parameters

file_path – the folder location from which to load the pipeline

chronos.autots.model.auto_tcn

AutoTCN is a TCN forecasting model with Auto tuning.

class zoo.chronos.autots.model.auto_tcn.AutoTCN(input_feature_num, output_target_num, past_seq_len, future_seq_len, optimizer, loss, metric, hidden_units=None, levels=None, num_channels=None, kernel_size=7, lr=0.001, dropout=0.2, backend='torch', logs_dir='/tmp/auto_tcn', cpus_per_trial=1, name='auto_tcn')[source]

Bases: object

Create an AutoTCN. A usage sketch follows the parameter list below.

Parameters
  • input_feature_num – Int. The number of features in the input

  • output_target_num – Int. The number of targets in the output

  • past_seq_len – Int. The number of historical steps used for forecasting.

  • future_seq_len – Int. The number of future steps to forecast.

  • optimizer – String or PyTorch optimizer creator function or tf.keras optimizer instance.

  • loss – String or pytorch/tf.keras loss instance or pytorch loss creator function.

  • metric – String. The evaluation metric name to optimize. e.g. “mse”

  • hidden_units – Int or hp sampling function from an integer space. The number of hidden units or filters for each convolutional layer. It is similar to units for LSTM. It defaults to 30. We will omit the hidden_units value if num_channels is specified. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.grid_search([32, 64]).

  • levels – Int or hp sampling function from an integer space. The number of levels of TemporalBlocks to use. It defaults to 8. We will omit the levels value if num_channels is specified.

  • num_channels – List of integers. A list of hidden_units for each level. You could specify num_channels if you want different hidden_units for different levels. By default, num_channels equals [hidden_units] * (levels - 1) + [output_target_num].

  • kernel_size – Int or hp sampling function from an integer space. The size of the kernel to use in each convolutional layer.

  • lr – float or hp sampling function from a float space. Learning rate. e.g. hp.choice([0.001, 0.003, 0.01])

  • dropout – float or hp sampling function from a float space. Dropout rate. e.g. hp.uniform(0.1, 0.3)

  • backend – The backend of the TCN model. We only support backend as “torch” for now.

  • logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_tcn”

  • cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.

  • name – name of the AutoTCN. It defaults to “auto_tcn”
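
A minimal usage sketch; the search ranges are illustrative only, and x_train / y_train / x_val / y_val are placeholder ndarrays of shape (num_sample, past_seq_len, input_feature_num) / (num_sample, future_seq_len, output_target_num):

>>> from zoo.chronos.autots.model.auto_tcn import AutoTCN
>>> from zoo.orca.automl import hp
>>> auto_tcn = AutoTCN(input_feature_num=4,
>>>                    output_target_num=1,
>>>                    past_seq_len=24,
>>>                    future_seq_len=1,
>>>                    optimizer="Adam",
>>>                    loss="mse",
>>>                    metric="mse",
>>>                    hidden_units=hp.grid_search([32, 64]),
>>>                    levels=hp.randint(4, 8),
>>>                    lr=hp.choice([0.001, 0.003, 0.01]))
>>> auto_tcn.fit(data=(x_train, y_train),
>>>              validation_data=(x_val, y_val),
>>>              epochs=2)
>>> best_model = auto_tcn.get_best_model()
>>> best_config = auto_tcn.get_best_config()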

fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]

Automatically fit the model and search for the best hyper parameters.

Parameters
  • data – train data. For backend of “torch”, data can be a tuple of ndarrays or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader. For backend of “keras”, data can be a tuple of ndarrays. If data is a tuple of ndarrays, it should be in the form of (x, y), where x is training input data and y is training target data.

  • epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.

  • batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.

  • validation_data – Validation data. Validation data type should be the same as data.

  • metric_threshold – a trial will be terminated when the metric threshold is met.

  • n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.

  • search_alg – str, one of the searchers supported by ray tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”).

  • search_alg_params – extra parameters for the search algorithm besides search_space, metric and the searcher mode.

  • scheduler – str, one of the schedulers supported by ray tune.

  • scheduler_params – parameters for the scheduler.

get_best_model()[source]

Get the best TCN model.

get_best_config()[source]

Get the best configuration.

Returns

A dictionary of best hyper parameters

chronos.autots.model.auto_lstm

AutoLSTM is an LSTM forecasting model with Auto tuning.

class zoo.chronos.autots.model.auto_lstm.AutoLSTM(input_feature_num, output_target_num, past_seq_len, optimizer, loss, metric, hidden_dim=32, layer_num=1, lr=0.001, dropout=0.2, backend='torch', logs_dir='/tmp/auto_lstm', cpus_per_trial=1, name='auto_lstm')[source]

Bases: object

Create an AutoLSTM. A usage sketch follows the parameter list below.

Parameters
  • input_feature_num – Int. The number of features in the input

  • output_target_num – Int. The number of targets in the output

  • past_seq_len – Int or hp sampling function from an integer space. The number of historical steps used for forecasting.

  • optimizer – String or PyTorch optimizer creator function or tf.keras optimizer instance.

  • loss – String or pytorch/tf.keras loss instance or pytorch loss creator function.

  • metric – String. The evaluation metric name to optimize. e.g. “mse”

  • hidden_dim – Int or hp sampling function from an integer space. The number of features in the hidden state h. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.grid_search([32, 64]).

  • layer_num – Int or hp sampling function from an integer space. Number of recurrent layers. e.g. hp.randint(1, 3)

  • lr – float or hp sampling function from a float space. Learning rate. e.g. hp.choice([0.001, 0.003, 0.01])

  • dropout – float or hp sampling function from a float space. Dropout rate. e.g. hp.uniform(0.1, 0.3)

  • backend – The backend of the lstm model. We only support backend as “torch” for now.

  • logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_lstm”

  • cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.

  • name – name of the AutoLSTM. It defaults to “auto_lstm”
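
A minimal usage sketch; the search ranges are illustrative only, and x_train / y_train / x_val / y_val are placeholder ndarrays:

>>> from zoo.chronos.autots.model.auto_lstm import AutoLSTM
>>> from zoo.orca.automl import hp
>>> auto_lstm = AutoLSTM(input_feature_num=4,
>>>                      output_target_num=1,
>>>                      past_seq_len=6,
>>>                      optimizer="Adam",
>>>                      loss="mse",
>>>                      metric="mse",
>>>                      hidden_dim=hp.grid_search([32, 64]),
>>>                      layer_num=hp.randint(1, 3),
>>>                      lr=hp.choice([0.001, 0.003, 0.01]),
>>>                      dropout=hp.uniform(0.1, 0.3))
>>> auto_lstm.fit(data=(x_train, y_train),
>>>               validation_data=(x_val, y_val),
>>>               epochs=2)
>>> best_model = auto_lstm.get_best_model()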

fit(data, epochs=1, batch_size=32, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]

Automatically fit the model and search for the best hyper parameters.

Parameters
  • data – train data. For backend of “torch”, data can be a tuple of ndarrays or a function that takes a config dictionary as parameter and returns a PyTorch DataLoader. For backend of “keras”, data can be a tuple of ndarrays. If data is a tuple of ndarrays, it should be in the form of (x, y), where x is training input data and y is training target data.

  • epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.

  • batch_size – Int or hp sampling function from an integer space. Training batch size. It defaults to 32.

  • validation_data – Validation data. Validation data type should be the same as data.

  • metric_threshold – a trial will be terminated when the metric threshold is met.

  • n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.

  • search_alg – str, one of the searchers supported by ray tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”).

  • search_alg_params – extra parameters for the search algorithm besides search_space, metric and the searcher mode.

  • scheduler – str, one of the schedulers supported by ray tune.

  • scheduler_params – parameters for the scheduler.

get_best_model()[source]

Get the best LSTM model.

get_best_config()[source]

Get the best configuration.

Returns

A dictionary of best hyper parameters

chronos.autots.model.auto_arima

AutoARIMA is an ARIMA forecasting model with Auto tuning.

class zoo.chronos.autots.model.auto_arima.AutoARIMA(p=2, q=2, seasonal=True, P=1, Q=1, m=7, metric='mse', logs_dir='/tmp/auto_arima_logs', cpus_per_trial=1, name='auto_arima', **arima_config)[source]

Bases: object

Create an automated ARIMA Model. Users need to specify either the exact value or the search space of the ARIMA model hyperparameters. For details of the ARIMA model hyperparameters, refer to https://alkaline-ml.com/pmdarima/modules/generated/pmdarima.arima.ARIMA.html#pmdarima.arima.ARIMA. A usage sketch follows the parameter list below.

Parameters
  • p – Int or hp sampling function from an integer space for hyperparameter p of the ARIMA model. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.randint(0, 3).

  • q – Int or hp sampling function from an integer space for hyperparameter q of the ARIMA model. e.g. hp.randint(0, 3).

  • seasonal – Bool or hp sampling function from an integer space for whether to add seasonal components to the ARIMA model. e.g. hp.choice([True, False]).

  • P – Int or hp sampling function from an integer space for hyperparameter P of the ARIMA model. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.randint(0, 3).

  • Q – Int or hp sampling function from an integer space for hyperparameter Q of the ARIMA model. e.g. hp.randint(0, 3).

  • m – Int or hp sampling function from an integer space for hyperparameter m of the ARIMA model (the seasonal periodicity). e.g. hp.choice([4, 7, 12, 24, 365]).

  • metric – String. The evaluation metric name to optimize. e.g. “mse”

  • logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_arima_logs”

  • cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.

  • name – name of the AutoARIMA. It defaults to “auto_arima”

  • arima_config – Other ARIMA hyperparameters.
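
A minimal usage sketch; the search ranges mirror the examples above, and train_1d_array / valid_1d_array are placeholder 1-D numpy arrays:

>>> from zoo.chronos.autots.model.auto_arima import AutoARIMA
>>> from zoo.orca.automl import hp
>>> auto_arima = AutoARIMA(p=hp.randint(0, 3),
>>>                        q=hp.randint(0, 3),
>>>                        seasonal=hp.choice([True, False]),
>>>                        P=hp.randint(0, 3),
>>>                        Q=hp.randint(0, 3),
>>>                        m=hp.choice([4, 7, 12]),
>>>                        metric="mse")
>>> auto_arima.fit(data=train_1d_array,
>>>                validation_data=valid_1d_array,
>>>                n_sampling=8)
>>> best_model = auto_arima.get_best_model()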

fit(data, epochs=1, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]

Automatically fit the model and search for the best hyperparameters.

Parameters
  • data – Training data, a 1-D numpy array.

  • epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.

  • validation_data – Validation data. A 1-D numpy array.

  • metric_threshold – a trial will be terminated when the metric threshold is met.

  • n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.

  • search_alg – str, one of the searchers supported by ray tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”).

  • search_alg_params – extra parameters for the search algorithm besides search_space, metric and the searcher mode.

  • scheduler – str, one of the schedulers supported by ray tune.

  • scheduler_params – parameters for the scheduler.

get_best_model()[source]

Get the best ARIMA model.

chronos.autots.model.auto_prophet

AutoProphet is a Prophet forecasting model with Auto tuning.

class zoo.chronos.autots.model.auto_prophet.AutoProphet(changepoint_prior_scale=0.05, seasonality_prior_scale=10.0, holidays_prior_scale=10.0, seasonality_mode='additive', changepoint_range=0.8, metric='mse', logs_dir='/tmp/auto_prophet_logs', cpus_per_trial=1, name='auto_prophet', **prophet_config)[source]

Bases: object

Create an automated Prophet Model. Users need to specify either the exact value or the search space of the Prophet model hyperparameters. For details of the Prophet model hyperparameters, refer to https://facebook.github.io/prophet/docs/diagnostics.html#hyperparameter-tuning. A usage sketch follows the parameter list below.

Parameters
  • changepoint_prior_scale – Float or hp sampling function from a float space for hyperparameter changepoint_prior_scale for the Prophet model. For hp sampling, see zoo.orca.automl.hp for more details. e.g. hp.loguniform(0.001, 0.5).

  • seasonality_prior_scale – hyperparameter seasonality_prior_scale for the Prophet model. e.g. hp.loguniform(0.01, 10).

  • holidays_prior_scale – hyperparameter holidays_prior_scale for the Prophet model. e.g. hp.loguniform(0.01, 10).

  • seasonality_mode – hyperparameter seasonality_mode for the Prophet model. e.g. hp.choice([‘additive’, ‘multiplicative’]).

  • changepoint_range – hyperparameter changepoint_range for the Prophet model. e.g. hp.uniform(0.8, 0.95).

  • metric – String. The evaluation metric name to optimize. e.g. “mse”

  • logs_dir – Local directory to save logs and results. It defaults to “/tmp/auto_prophet_logs”

  • cpus_per_trial – Int. Number of cpus for each trial. It defaults to 1.

  • name – name of the AutoProphet. It defaults to “auto_prophet”

  • prophet_config – Other Prophet hyperparameters.
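
A minimal usage sketch; the sampling functions mirror the examples above, and train_1d_array / valid_1d_array are placeholder 1-D numpy arrays:

>>> from zoo.chronos.autots.model.auto_prophet import AutoProphet
>>> from zoo.orca.automl import hp
>>> auto_prophet = AutoProphet(changepoint_prior_scale=hp.loguniform(0.001, 0.5),
>>>                            seasonality_prior_scale=hp.loguniform(0.01, 10),
>>>                            holidays_prior_scale=hp.loguniform(0.01, 10),
>>>                            seasonality_mode=hp.choice(["additive", "multiplicative"]),
>>>                            changepoint_range=hp.uniform(0.8, 0.95),
>>>                            metric="mse")
>>> auto_prophet.fit(data=train_1d_array,
>>>                  validation_data=valid_1d_array,
>>>                  n_sampling=16)
>>> best_model = auto_prophet.get_best_model()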

fit(data, epochs=1, validation_data=None, metric_threshold=None, n_sampling=1, search_alg=None, search_alg_params=None, scheduler=None, scheduler_params=None)[source]

Automatically fit the model and search for the best hyperparameters.

Parameters
  • data – Training data, a 1-D numpy array.

  • epochs – Max number of epochs to train in each trial. Defaults to 1. If you have also set metric_threshold, a trial will stop if either it has been optimized to the metric_threshold or it has been trained for {epochs} epochs.

  • validation_data – Validation data. A 1-D numpy array.

  • metric_threshold – a trial will be terminated when the metric threshold is met.

  • n_sampling – Number of times to sample from the search_space. Defaults to 1. If hp.grid_search is in search_space, the grid will be repeated n_sampling times. If this is -1, (virtually) infinite samples are generated until a stopping condition is met.

  • search_alg – str, one of the searchers supported by ray tune (i.e. “variant_generator”, “random”, “ax”, “dragonfly”, “skopt”, “hyperopt”, “bayesopt”, “bohb”, “nevergrad”, “optuna”, “zoopt” and “sigopt”).

  • search_alg_params – extra parameters for the search algorithm besides search_space, metric and the searcher mode.

  • scheduler – str, one of the schedulers supported by ray tune.

  • scheduler_params – parameters for the scheduler.

get_best_model()[source]

Get the best Prophet model.