Constructs a list of arguments for arx_classifier().

Usage

arx_class_args_list(
  lags = c(0L, 7L, 14L),
  ahead = 7L,
  n_training = Inf,
  forecast_date = NULL,
  target_date = NULL,
  outcome_transform = c("growth_rate", "lag_difference"),
  breaks = 0.25,
  horizon = 7L,
  method = c("rel_change", "linear_reg"),
  log_scale = FALSE,
  additional_gr_args = list(),
  nafill_buffer = Inf,
  check_enough_data_n = NULL,
  check_enough_data_epi_keys = NULL,
  ...
)

Arguments

lags

Vector or List. Non-negative integers enumerating lags to use in autoregressive-type models (in days). If a list is supplied without names, its elements are matched to the predictors in order.

ahead

Integer. Number of time steps ahead (in days) of the forecast date for which forecasts should be produced.

n_training

Integer. An upper limit for the number of rows per key that are used for training (in the time unit of the epi_df).
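As a rough illustration of what a per-key row limit means, the base R sketch below keeps at most n rows for each key. The toy data frame and the helper keep_recent_rows() are hypothetical, and keeping the most recent rows is an assumption made for this sketch; it is not the package's internal code.

```r
# Hypothetical toy data: two keys with five daily rows each.
toy <- data.frame(
  geo_value = rep(c("ca", "ny"), each = 5),
  time_value = rep(as.Date("2024-01-01") + 0:4, times = 2),
  cases = 1:10
)

# Hypothetical helper: keep at most `n` of the most recent rows per geo_value.
keep_recent_rows <- function(data, n) {
  if (is.infinite(n)) {
    return(data) # mirrors n_training = Inf: no limit is applied
  }
  # Sort by key and time so that tail() picks the latest rows.
  data <- data[order(data$geo_value, data$time_value), ]
  pieces <- lapply(split(data, data$geo_value), utils::tail, n = n)
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

trimmed <- keep_recent_rows(toy, n = 3)
print(trimmed)
```

With n = Inf the data are returned unchanged, which corresponds to the default of using all available rows.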

forecast_date

Date. The date on which the forecast is created. The default NULL will attempt to determine this automatically.

target_date

Date. The date for which the forecast is intended. The default NULL will attempt to determine this automatically.

outcome_transform

Scalar character. Whether the outcome should be created using growth rates (as the predictors are) or lagged differences. The second case is closer to the requirements for the 2022-23 CDC FluSight Hospitalization Experimental Target. See the Classification Vignette for details of how to create a reasonable baseline for this case. Selecting "growth_rate" (the default) uses epiprocess::growth_rate() to create the outcome using some of the additional arguments below. Choosing "lag_difference" instead simply uses the change from the value at the selected horizon.
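To make the "lag_difference" option more concrete, here is a minimal base R sketch of a lagged difference. The series y and the helper lag_difference() are hypothetical, and the sketch assumes the difference looks backward by horizon time steps; the package's own implementation may differ in details.

```r
# Hypothetical daily series, for illustration only.
y <- c(10, 12, 15, 15, 18, 24, 30, 33, 36, 45)
horizon <- 7L

# Hypothetical helper: current value minus the value `h` steps earlier.
# The first `h` entries have no earlier value to compare with, so they are NA.
lag_difference <- function(x, h) {
  n <- length(x)
  stopifnot(h >= 1L, h < n)
  c(rep(NA_real_, h), x[(h + 1):n] - x[1:(n - h)])
}

lag_diff <- lag_difference(y, horizon)
print(lag_diff)
```

Unlike a growth rate, this quantity is on the scale of the original signal, so any breaks used with it would need to be chosen on that scale as well.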

breaks

Vector. A vector of breaks to turn real-valued growth rates into discrete classes. The default gives binary upswing classification as in McDonald, Bien, Green, Hu, et al. This coincides with the default trainer = parsnip::logistic_reg() argument in arx_classifier(). However, multiclass classification is also supported (e.g. with breaks = c(-.2, .25)) provided that trainer = parsnip::multinom_reg() (or another multiclass trainer) is used as well. These will be silently expanded to cover the entire real line (so the default will become breaks = c(-Inf, .25, Inf)) before being used to discretize the response. This differs from the behaviour of recipes::step_cut(), which creates classes that only cover the range of the training data.
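The expansion of breaks to the whole real line can be sketched with base R's cut(). The growth values and the helper expand_breaks() are hypothetical, and which side of each interval is closed (cut() closes on the right by default) is an assumption of this sketch rather than a statement about the package.

```r
# Hypothetical growth-rate values, for illustration only.
growth <- c(-0.5, -0.2, 0, 0.25, 0.3, 1.2)

# Hypothetical helper: pad user-supplied breaks with -Inf and Inf.
expand_breaks <- function(breaks) {
  sort(unique(c(-Inf, breaks, Inf)))
}

binary_breaks <- expand_breaks(0.25)          # -Inf, 0.25, Inf
three_breaks <- expand_breaks(c(-0.2, 0.25))  # -Inf, -0.2, 0.25, Inf

# Discretize the response; cut() uses right-closed intervals by default.
binary_class <- cut(growth, breaks = binary_breaks)
three_class <- cut(growth, breaks = three_breaks)

print(table(binary_class))
print(table(three_class))

# A value far outside the observed range still falls in a class.
extreme_class <- cut(10, breaks = binary_breaks)
print(extreme_class)
```

Because the outer breaks are infinite, values outside the range seen during training are still assigned to a class instead of becoming NA.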

horizon

Scalar integer. This is passed to the h argument of epiprocess::growth_rate(). It determines the amount of data used to calculate the growth rate.

method

Scalar character. The method used to calculate growth rates, either "rel_change" (the default) or "linear_reg".

log_scale

Scalar logical. Whether to compute growth rates on the log scale.

additional_gr_args

List. Optional arguments controlling growth rate calculation. See epiprocess::growth_rate() and the related Vignette for more details.

nafill_buffer

At predict time, recent values of the training data are used to create a forecast. However, these can be NA due to, e.g., data latency issues. By default, any missing values will get filled with less recent data. Setting this value to NULL results in 1 extra recent row (beyond those required for lag creation) being used. Note that we require at least min(lags) rows of recent data per geo_value to create a prediction. For this reason, setting nafill_buffer < min(lags) will be treated as additional allowed recent data rather than the total amount of recent data to examine.

check_enough_data_n

Integer. A lower limit for the number of rows per epi_key that are required for training. If NULL, this check is ignored.

check_enough_data_epi_keys

Character vector. Column names on which to group the data; the threshold is then checked within each group. Useful if training per group (for example, per geo_value).

...

Space to handle future expansions (unused).

Value

A list containing updated parameter choices with class arx_clist.

Examples

arx_class_args_list()
#>  lags : 0, 7, and 14
#>  ahead : 7
#>  n_training : Inf
#>  breaks : -Inf, 0.25, and Inf
#>  forecast_date : "NULL"
#>  target_date : "NULL"
#>  outcome_transform : "growth_rate"
#>  max_lags : 14
#>  horizon : 7
#>  method : "rel_change"
#>  log_scale : FALSE
#>  additional_gr_args : "_empty_"
#>  nafill_buffer : Inf
#>  check_enough_data_n : "NULL"
#>  check_enough_data_epi_keys : "NULL"

# 3-class classification,
# also needs arx_classifier(trainer = parsnip::multinom_reg())
arx_class_args_list(breaks = c(-.2, .25))
#>  lags : 0, 7, and 14
#>  ahead : 7
#>  n_training : Inf
#>  breaks : -Inf, -0.2, 0.25, and Inf
#>  forecast_date : "NULL"
#>  target_date : "NULL"
#>  outcome_transform : "growth_rate"
#>  max_lags : 14
#>  horizon : 7
#>  method : "rel_change"
#>  log_scale : FALSE
#>  additional_gr_args : "_empty_"
#>  nafill_buffer : Inf
#>  check_enough_data_n : "NULL"
#>  check_enough_data_epi_keys : "NULL"