Evaluates the performance of a forecaster through the following steps:

  1. Takes a prediction card (as created by get_predictions()).

  2. Computes various user-specified error measures.

The result is a "score card" data frame in which each row corresponds to a prediction-result pair; the columns define the prediction task, give the observed value, and give the calculated values of the provided error measures.

evaluate_predictions(
  predictions_cards,
  truth_data,
  err_measures = list(wis = weighted_interval_score, ae = absolute_error, coverage_80 =
    interval_coverage(coverage = 0.8)),
  grp_vars = c("forecaster", intersect(colnames(predictions_cards),
    colnames(truth_data)))
)

Arguments

predictions_cards

tibble of quantile forecasts, which contains at least quantile and value columns, as well as any other prediction task identifiers. For COVID data, a predictions card may be created by the function get_predictions(), downloaded with get_covidhub_predictions(), or created manually.

truth_data

data frame of observed (truth) values, which will be joined to predictions_cards by all shared columns. The observed data column should be named actual.

err_measures

Named list of one or more functions, where each function takes a data frame with three columns, quantile, value, and actual (i.e., observed), and returns a scalar measure of error. NULL or an empty list may be provided if scoring is not desired.
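As a sketch of the expected interface, a custom error measure might look like the following. The name median_ae and the exact column handling are illustrative assumptions, not part of the package; the only documented contract is that the function receives a data frame with quantile, value, and actual columns and returns a scalar.

```r
# Hypothetical custom error measure: absolute error of the median forecast.
# Receives the quantile forecasts for a single prediction task.
median_ae <- function(quantile_forecasts) {
  # Pick the forecast value whose quantile level is closest to 0.5.
  med <- quantile_forecasts$value[
    which.min(abs(quantile_forecasts$quantile - 0.5))
  ]
  # All rows for one task share the same observed value.
  abs(med - quantile_forecasts$actual[1])
}
```

Such a function could then be supplied alongside the built-in measures, e.g. err_measures = list(wis = weighted_interval_score, med_ae = median_ae).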

grp_vars

character vector of column names in predictions_cards whose combination uniquely identifies a (quantile) prediction.

Value

tibble of "score cards". Contains the same information as predictions_cards, with an additional column for each err_measure and for the truth (named actual).
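A minimal sketch of a call, assuming a predictions card is already in hand; the column names geo_value and target_end_date are illustrative assumptions about the shared prediction task identifiers, not requirements documented here.

```r
library(evalcast)  # assumed package providing evaluate_predictions()

# Toy truth data: observed values keyed by the columns shared with the
# predictions card, with the observed column named `actual` as required.
truth_data <- data.frame(
  geo_value = c("ma", "ny"),
  target_end_date = as.Date(c("2021-01-02", "2021-01-02")),
  actual = c(123, 456)
)

# `predictions_cards` is assumed to come from get_predictions() or
# get_covidhub_predictions(); each row is one quantile forecast.
score_cards <- evaluate_predictions(
  predictions_cards,
  truth_data,
  err_measures = list(
    wis = weighted_interval_score,
    ae = absolute_error
  )
)
```

The result has one row per prediction-result pair, with wis and ae columns added alongside actual.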