
This function calculates an approximation to a parametric predictive distribution. Exact predictive distributions for linear models require the quadratic form x*' (X'X)^{-1} x* along with the residual degrees of freedom. This function approximates both. The approximation should be reasonably accurate for models fit using lm when the new point x* isn't too far from the bulk of the data.
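For reference, the exact quantities mentioned above can be computed by hand for an ordinary lm fit. The following is a minimal base-R sketch, not this function's implementation: the mtcars model, the new point x_star, and the 95% level are illustrative assumptions. It builds the prediction interval from the quadratic form and the residual degrees of freedom, and compares it with predict().

```r
# Illustrative model; the data set and predictors are assumptions for this sketch.
fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(fit)            # design matrix, including the intercept column
x_star <- c(1, 3.0, 120)          # new point: intercept, wt, hp
sigma_hat <- summary(fit)$sigma   # residual standard error
df_resid <- fit$df.residual       # residual degrees of freedom

# Quadratic form x*' (X'X)^{-1} x*
q_form <- drop(t(x_star) %*% solve(crossprod(X)) %*% x_star)

# Exact Student's t prediction interval at the 95% level
centre <- sum(x_star * coef(fit))
se_pred <- sigma_hat * sqrt(1 + q_form)
exact <- centre + c(-1, 1) * qt(0.975, df_resid) * se_pred

# Reference interval from stats::predict()
ref <- predict(
  fit,
  newdata = data.frame(wt = 3.0, hp = 120),
  interval = "prediction",
  level = 0.95
)
all.equal(exact, unname(ref[1, c("lwr", "upr")]))
```

When x_star is close to the centre of the data, the quadratic form is small relative to 1, which is why an approximation to it can work well there.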

Usage

layer_predictive_distn(
  frosting,
  ...,
  dist_type = c("gaussian", "student_t"),
  truncate = c(-Inf, Inf),
  name = ".pred_distn",
  id = rand_id("predictive_distn")
)

Arguments

frosting

a frosting postprocessor

...

Unused, included for consistency with other layers.

dist_type

Whether the predictive distribution is Gaussian or Student's t.

truncate

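The interval to which the distribution is truncated, given as a length-2 numeric vector of lower and upper bounds. The default, c(-Inf, Inf), applies no truncation.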

name

character. The name for the output column.

id

a random id string
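To make the truncate argument concrete, the sketch below shows one common meaning of truncating a Gaussian distribution to an interval: conditioning on the interval and rescaling the probabilities. The helper qtruncnorm_sketch and the numbers used are hypothetical and exist only for illustration; the package's internal implementation may differ.

```r
# Hypothetical helper: quantiles of a Normal(mu, sigma) truncated to [lower, upper].
qtruncnorm_sketch <- function(p, mu, sigma, lower = -Inf, upper = Inf) {
  p_lo <- pnorm(lower, mean = mu, sd = sigma)  # mass below the lower bound
  p_hi <- pnorm(upper, mean = mu, sd = sigma)  # mass below the upper bound
  # Map p onto the retained probability range, then invert the untruncated CDF.
  qnorm(p_lo + p * (p_hi - p_lo), mean = mu, sd = sigma)
}

probs <- c(0.05, 0.5, 0.95)

# Untruncated quantiles: the lower one falls below zero.
q_plain <- qnorm(probs, mean = 0.1, sd = 0.2)

# Truncating to [0, Inf) keeps every quantile nonnegative,
# which is a natural choice for rates or counts.
q_trunc <- qtruncnorm_sketch(probs, mu = 0.1, sigma = 0.2, lower = 0)

q_plain
q_trunc
```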

Value

an updated frosting postprocessor that adds a column, named by name, containing the predictive distribution for each prediction

Examples

library(dplyr)
jhu <- covid_case_death_rates %>%
  filter(time_value > "2021-11-01", geo_value %in% c("ak", "ca", "ny"))

r <- epi_recipe(jhu) %>%
  step_epi_lag(death_rate, lag = c(0, 7, 14)) %>%
  step_epi_ahead(death_rate, ahead = 7) %>%
  step_epi_naomit()

wf <- epi_workflow(r, linear_reg()) %>% fit(jhu)

f <- frosting() %>%
  layer_predict() %>%
  layer_predictive_distn() %>%
  layer_naomit(.pred)
wf1 <- wf %>% add_frosting(f)

p <- forecast(wf1)
p
#> An `epi_df` object, 3 x 4 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * other_keys = geo_value, time_value
#> * as_of     = 2022-05-31
#> 
#> # A tibble: 3 × 4
#>   geo_value time_value .pred
#> * <chr>     <date>     <dbl>
#> 1 ak        2021-12-31 0.245
#> 2 ca        2021-12-31 0.313
#> 3 ny        2021-12-31 0.295
#> # ℹ 1 more variable: .pred_distn <dist>
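The example above uses the defaults. A variant with the non-default arguments from the Usage section might look like the sketch below. It assumes that attaching epipredict makes covid_case_death_rates, linear_reg(), and the other helpers used above available, and that truncating at zero is appropriate because death rates cannot be negative. Output is omitted because it was not run for this page.

```r
library(epipredict)  # assumption: also provides the data and linear_reg() used here
library(dplyr)

jhu <- covid_case_death_rates %>%
  filter(time_value > "2021-11-01", geo_value %in% c("ak", "ca", "ny"))

r <- epi_recipe(jhu) %>%
  step_epi_lag(death_rate, lag = c(0, 7, 14)) %>%
  step_epi_ahead(death_rate, ahead = 7) %>%
  step_epi_naomit()

wf <- epi_workflow(r, linear_reg()) %>% fit(jhu)

# Student's t predictive distribution, truncated to nonnegative values
f_t <- frosting() %>%
  layer_predict() %>%
  layer_predictive_distn(dist_type = "student_t", truncate = c(0, Inf)) %>%
  layer_naomit(.pred)

p_t <- forecast(wf %>% add_frosting(f_t))
p_t
```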