Evaluates the performance of a forecaster through the following steps:

1. Takes a prediction card (as created by `get_predictions()`).
2. Computes various user-specified error measures.

The result is a "score card" data frame, where each row corresponds to a prediction-result pair; the columns define the prediction task, give the observed value, and give the calculated values of the provided error measures.
```r
evaluate_predictions(
  predictions_cards,
  truth_data,
  err_measures = list(
    wis = weighted_interval_score,
    ae = absolute_error,
    coverage_80 = interval_coverage(coverage = 0.8)
  ),
  grp_vars = c(
    "forecaster",
    intersect(colnames(predictions_cards), colnames(truth_data))
  )
)
```
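A minimal sketch of a call with hand-built inputs, assuming a plain tibble with the required columns is accepted in place of a card from `get_predictions()`; the task-identifier columns (`geo_value`, `target_end_date`) and all values are illustrative:

```r
library(evalcast)
library(tibble)

# Illustrative quantile forecasts: quantile and value columns plus
# columns identifying the prediction task.
predictions_cards <- tibble(
  forecaster      = "baseline",
  geo_value       = "pa",
  target_end_date = as.Date("2021-03-06"),
  quantile        = c(0.1, 0.5, 0.9),
  value           = c(80, 100, 130)
)

# Observed data, joined to the forecasts by their shared columns; the
# observed column is named actual.
truth_data <- tibble(
  geo_value       = "pa",
  target_end_date = as.Date("2021-03-06"),
  actual          = 105
)

scores <- evaluate_predictions(predictions_cards, truth_data)
```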
| Argument | Description |
|---|---|
| `predictions_cards` | Tibble of quantile forecasts, which contains at least `quantile` and `value` columns, as well as any other columns identifying the prediction task. |
| `truth_data` | Truth data (observed). This should be a data frame that will be joined to `predictions_cards`; the observed-value column should be named `actual`. |
| `err_measures` | Named list of one or more functions, where each function takes a data frame with three columns (`quantile`, `value`, and `actual`) and returns a scalar measure of error. See the sketch after this table for a custom measure. |
| `grp_vars` | Character vector of column names in `predictions_cards` that define the groups within which the error measures are computed. Defaults to `"forecaster"` plus all columns shared between `predictions_cards` and `truth_data`. |
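To illustrate the `err_measures` contract, here is a minimal sketch of a custom measure; the function name and the median-extraction logic are illustrative assumptions, not part of the package:

```r
# Hypothetical custom error measure: absolute error of the median
# forecast. Receives the quantile/value/actual rows for one prediction
# task and must return a single number; assumes a 0.5 quantile exists.
median_abs_error <- function(df) {
  med <- df[df$quantile == 0.5, ]
  abs(med$value[1] - med$actual[1])
}

# Supplied alongside (or instead of) the defaults:
# evaluate_predictions(predictions_cards, truth_data,
#                      err_measures = list(mae = median_abs_error))
```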
tibble of "score cards". Contains the same information as the
predictions_cards()
with additional columns for each err_measure
and
for the truth (named actual
).
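Because the score card is an ordinary tibble, downstream summaries are plain dplyr; continuing the sketch above, a hypothetical aggregation of the default `wis` column:

```r
library(dplyr)

# Mean weighted interval score per forecaster, using the wis column
# added by the default err_measures.
scores %>%
  group_by(forecaster) %>%
  summarize(mean_wis = mean(wis, na.rm = TRUE))
```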