R/evaluate_predictions.R
evaluate_covid_predictions.Rd
Evaluates the performance of a covid forecaster, through the following steps:
1. Takes a prediction card (as created by get_predictions()).
2. Downloads from the COVIDcast API the latest available data to compute what actually occurred (summing the response over the incidence period).
3. Computes various user-specified error measures.
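The "summing the response over the incidence period" step can be illustrated with a minimal base R sketch. The data values, the location code, the year 2020, and the helper sum_over_period() are all made up for illustration; they are not part of the package, and the real function obtains its data from the COVIDcast API rather than from a local data frame.

```r
# Toy daily incidence data for one location (values are made up).
daily <- data.frame(
  geo_value  = "ca",
  time_value = seq(as.Date("2020-09-04"), as.Date("2020-09-14"), by = "day"),
  value      = c(5, 7, 10, 12, 9, 11, 8, 13, 10, 6, 4)
)

# Hypothetical helper: sum the response over the incidence period [start, end].
sum_over_period <- function(df, start, end) {
  in_period <- df$time_value >= start & df$time_value <= end
  sum(df$value[in_period])
}

# "Actual" for the epiweek September 6 through 12 (year assumed).
actual <- sum_over_period(daily, as.Date("2020-09-06"), as.Date("2020-09-12"))

# A made-up point forecast and its absolute error against the summed actual.
point_forecast <- 80
abs_error <- abs(point_forecast - actual)
```

Here actual is the weekly total that a forecast for that epiweek would be scored against.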
Backfill refers to the process by which some data sources go back in time
updating previously reported values. Suppose it is September 14 and we are
evaluating our predictions for what happened in the previous epiweek
(September 6 through 12). Although we may be able to calculate a value for
"actual", we might not trust this value since on September 16, backfill may
occur changing what is known about the period September 6 through 12. There
are two consequences of this phenomenon. First, running this function on
different dates may result in different estimates of the error. Second, we
may not trust evaluations of periods that are too recent. The parameter
backfill_buffer specifies how long a buffer period (in days) we should
enforce. This depends on the data source and signal and is left
to the user to determine. If backfill is not relevant for the particular
signal you are predicting, then you can set backfill_buffer to 0.
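The idea of a buffer can be sketched in base R using the September example above. How evaluate_covid_predictions() enforces the buffer internally is not shown here; the helper is_past_buffer() and the year 2020 are assumptions made only for illustration.

```r
# Hypothetical helper: TRUE when at least `backfill_buffer` days have passed
# between the end of the target period and the evaluation date.
is_past_buffer <- function(target_end, as_of, backfill_buffer = 0) {
  days_elapsed <- as.numeric(difftime(as_of, target_end, units = "days"))
  days_elapsed >= backfill_buffer
}

target_end <- as.Date("2020-09-12")  # last day of the epiweek (year assumed)

# Evaluating on September 14 with no buffer: the period is considered usable.
no_buffer <- is_past_buffer(target_end, as.Date("2020-09-14"), backfill_buffer = 0)

# With a 7-day buffer, September 14 is too early but September 19 is not.
too_early <- is_past_buffer(target_end, as.Date("2020-09-14"), backfill_buffer = 7)
late_enough <- is_past_buffer(target_end, as.Date("2020-09-19"), backfill_buffer = 7)
```

A larger buffer trades timeliness for stability: evaluations arrive later but are less likely to change when the source revises its data.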
evaluate_covid_predictions(
  predictions_cards,
  err_measures = list(
    wis = weighted_interval_score,
    ae = absolute_error,
    coverage_80 = interval_coverage(coverage = 0.8)
  ),
  backfill_buffer = 0,
  geo_type = c("county", "hrr", "msa", "dma", "state", "hhs", "nation")
)
Argument | Description |
---|---|
predictions_cards | tibble of quantile forecasts, which contains at least |
err_measures | Named list of one or more functions, where each function takes a data frame with three columns |
backfill_buffer | How many days until response is deemed trustworthy enough to be taken as correct? See details for more. |
geo_type | String indicating geographical type, such as "county", or "state". See the COVIDcast Geographic Coding documentation for available options. |
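As a rough illustration of the err_measures argument, the sketch below builds a named list of functions and applies each to one forecast's data frame. The column names quantile, value, and actual, and both measure functions, are assumptions made for this example; they are not the package's own weighted_interval_score, absolute_error, or interval_coverage.

```r
# One forecast's quantiles plus the observed value (column names are assumed).
card <- data.frame(
  quantile = c(0.1, 0.5, 0.9),
  value    = c(60, 80, 100),
  actual   = 73
)

# Hypothetical measure: absolute error of the median forecast.
median_abs_error <- function(df) {
  med <- df$value[which.min(abs(df$quantile - 0.5))]
  abs(med - df$actual[1])
}

# Hypothetical measure: 1 if actual lies in the 10%-90% interval, else 0.
covered_80 <- function(df) {
  lower <- df$value[which.min(abs(df$quantile - 0.1))]
  upper <- df$value[which.min(abs(df$quantile - 0.9))]
  as.numeric(df$actual[1] >= lower && df$actual[1] <= upper)
}

# Named list in the same shape as err_measures; names become score labels.
my_measures <- list(ae_median = median_abs_error, coverage_80 = covered_80)
scores <- vapply(my_measures, function(f) f(card), numeric(1))
```

Each function returns a single number, so the names of the list map naturally onto one score column per measure.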
tibble of "score cards". Contains the same information as predictions_cards, with additional columns for each of the err_measures and for the truth (named actual).