step_growth_rate() creates a specification of a recipe step that will generate one or more new columns of derived data: estimated growth rates of the selected variables.

Usage

step_growth_rate(
  recipe,
  ...,
  role = "predictor",
  horizon = 7,
  method = c("rel_change", "linear_reg"),
  log_scale = FALSE,
  replace_Inf = NA,
  prefix = "gr_",
  skip = FALSE,
  id = rand_id("growth_rate"),
  additional_gr_args_list = list()
)

Arguments

recipe

A recipe object. The step will be added to the sequence of operations for this recipe.

...

One or more selector functions to choose variables for this step. See recipes::selections() for more details.

role

For model terms created by this step, what analysis role should they be assigned? The default is "predictor".

horizon

Bandwidth for the sliding window, when method is "rel_change" or "linear_reg". See epiprocess::growth_rate() for more details.

method

Either "rel_change" or "linear_reg", indicating the method to use for the growth rate calculation. These are local methods: they are run in a sliding fashion over the sequence (in order to estimate derivatives and hence growth rates). See epiprocess::growth_rate() for more details.

log_scale

Should growth rates be estimated using the parameterization on the log scale? See epiprocess::growth_rate() for an explanation. Default is FALSE.

replace_Inf

The growth rate calculation can produce infinite values (for example, when the denominator is zero), and most prediction methods will fail on them. This argument specifies the value used to replace them. The default (NA) will likely result in these rows being removed from the data. Alternatively, you could specify an arbitrarily large value, or perhaps zero. Setting this argument to NULL will result in no replacement.

prefix

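A character string that will be prefixed to the names of the new columns. With the defaults, for example, the growth rate of case_rate is stored in gr_7_rel_change_case_rate.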

skip

A logical. Should the step be skipped when the recipe is baked by bake()? While all operations are baked when prep() is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using skip = TRUE as it may affect the computations for subsequent operations.

id

A unique identifier for the step.

additional_gr_args_list

A list of additional arguments used by epiprocess::growth_rate(). Any argument that epiprocess::growth_rate() accepts through its ..., as well as dup_rm and na_rm, may be supplied here.
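
To make the log_scale and replace_Inf arguments more concrete, here is a minimal base R sketch. It assumes a simplified relative change between two single values and does not reproduce the sliding-window averaging done by epiprocess::growth_rate(). The helper names rel_change() and log_change() are hypothetical and exist only for this illustration.

```r
# Hypothetical helpers for illustration only; they are not part of
# epipredict or epiprocess and ignore the sliding-window averaging.
rel_change <- function(a, b) b / a - 1   # relative change from a to b
log_change <- function(a, b) log(b / a)  # the same change on the log scale

prev <- c(100, 10, 0)
curr <- c(105, 15, 3)

gr <- rel_change(prev, curr)
gr_log <- log_change(prev, curr)

# A zero denominator yields Inf on both scales (third element).
is.infinite(gr)
is.infinite(gr_log)

# Mimic replace_Inf = NA (the default) and replace_Inf = 0.
gr_na <- replace(gr, is.infinite(gr), NA)
gr_zero <- replace(gr, is.infinite(gr), 0)
```

For small changes the two scales nearly agree (first element), while they diverge as the change grows (second element). Whichever replacement value is chosen, it only touches the infinite entries.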

Value

An updated version of recipe with the new step added to the sequence of any existing operations.

See also

Other row operation steps: step_adjust_latency(), step_epi_lag(), step_lag_difference()

Examples

r <- epi_recipe(covid_case_death_rates) %>%
  step_growth_rate(case_rate, death_rate)
r
#> 
#> ── Epi Recipe ──────────────────────────────────────────────────────────────────
#> 
#> ── Inputs 
#> Number of variables by role
#> raw:        2
#> geo_value:  1
#> time_value: 1
#> 
#> ── Operations 
#> 1. Calculating growth_rate for: case_rate and death_rate by rel_change

r %>%
  prep(covid_case_death_rates) %>%
  bake(new_data = NULL)
#> An `epi_df` object, 20,496 x 6 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2022-05-31
#> 
#> # A tibble: 20,496 × 6
#>    geo_value time_value case_rate death_rate gr_7_rel_change_case_rate
#>  * <chr>     <date>         <dbl>      <dbl>                     <dbl>
#>  1 ak        2020-12-31      35.9      0.158                        NA
#>  2 al        2020-12-31      65.1      0.438                        NA
#>  3 ar        2020-12-31      66.0      1.27                         NA
#>  4 as        2020-12-31       0        0                            NA
#>  5 az        2020-12-31      76.8      1.10                         NA
#>  6 ca        2020-12-31      96.0      0.751                        NA
#>  7 co        2020-12-31      35.8      0.649                        NA
#>  8 ct        2020-12-31      52.1      0.819                        NA
#>  9 dc        2020-12-31      31.0      0.601                        NA
#> 10 de        2020-12-31      64.3      0.912                        NA
#> # ℹ 20,486 more rows
#> # ℹ 1 more variable: gr_7_rel_change_death_rate <dbl>
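
The example above uses the defaults. As a hedged sketch of the non-default arguments listed under Usage, the variant below switches to method = "linear_reg", widens the window to horizon = 14, and replaces infinite values with zero. It assumes that epipredict and recipes are installed and that covid_case_death_rates is available once epipredict is attached, as in the example above. The object names r2 and baked are arbitrary, and no output is shown because it was not run for this page.

```r
library(epipredict)
library(recipes)  # attached explicitly for prep() and bake()

# Same data and columns as the example above, with non-default
# arguments taken from the Usage section.
r2 <- epi_recipe(covid_case_death_rates) |>
  step_growth_rate(
    case_rate, death_rate,
    horizon = 14,
    method = "linear_reg",
    replace_Inf = 0
  )

baked <- r2 |>
  prep(covid_case_death_rates) |>
  bake(new_data = NULL)

# Inspect the names of the columns in the baked data.
names(baked)
```

Choosing replace_Inf = 0 keeps rows that the default NA would likely cause to be dropped, at the cost of treating an undefined growth rate as no growth.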