Estimates derivatives of the values in a covidcast_signal data frame, using a
local (in time) linear regression, smoothing spline, or trend filtering fit.
(When multiple issue dates are present, only the latest issue is considered.)
See the estimating derivatives vignette for examples.
estimate_deriv(x, method = c("lin", "ss", "tf"), n = 14, col_name = "deriv", keep_obj = FALSE, deriv = 1, ...)
Argument | Description |
---|---|
x | The |
method | One of "lin", "ss", or "tf" indicating the method to use for the derivative calculation. To estimate the derivative at any time point, we run the given method on the last |
n | Size of the local window (in days) to use. For example, if |
col_name | String indicating the name of the new column that will contain the derivative values. Default is "deriv"; note that setting |
keep_obj | Should the fitted object (from linear regression, smoothing spline, or trend filtering) be kept as a separate column? If |
deriv | Order of derivative to estimate. Only orders 1 or 2 are allowed, with the default being 1. (In some cases, a second-order derivative will return a trivial result; for example, when |
... | Additional arguments to pass to the function that estimates derivatives. See details below. |
A data frame given by appending a new column to x, named according to the
col_name argument, containing the estimated derivative values.
Derivatives are estimated using:
Linear regression, when method = "lin", via stats::lsfit().
Cubic smoothing spline, when method = "ss", via stats::smooth.spline().
Polynomial trend filtering, when method = "tf", via genlasso::trendfilter().
The second and third cases base the derivative calculation on a nonparametric
fit and should typically be used with a larger window n. The third case
(trend filtering) is more locally adaptive than the second (smoothing
spline) and can work better when there are sharp changes in the smoothness
of the underlying values.
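The windowed idea behind method = "lin" can be illustrated with base R alone. The sketch below is not the package's implementation: it assumes a trailing window of n observations ending at each time point, and the helper name local_lin_deriv and the simulated data are hypothetical. It fits a line with stats::lsfit() on each window and takes the slope as the first-derivative estimate.

```r
# Hypothetical helper (not part of covidcast): estimate a first derivative at
# each time point as the slope of a least-squares line fit to a trailing window.
local_lin_deriv <- function(t, y, n = 14) {
  stopifnot(length(t) == length(y), n >= 2)
  deriv <- rep(NA_real_, length(y))
  for (i in seq_along(y)) {
    # Assumption: the window holds the last n observations up to and including i.
    idx <- max(1, i - n + 1):i
    if (length(idx) < 2) next  # a slope needs at least two points
    fit <- stats::lsfit(t[idx], y[idx])
    deriv[i] <- unname(fit$coefficients[2])  # slope = estimated derivative
  }
  deriv
}

# Simulated daily values lying exactly on a line.
t <- 1:30
y <- 2 + 3 * t
d <- local_lin_deriv(t, y, n = 14)
```

Early time points use a shorter window in this sketch, and the first point is left as NA because a single observation does not determine a slope.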
In the first and second cases (linear regression and smoothing spline), the
additional arguments in ... are directly passed to the underlying
estimation function (stats::lsfit() and stats::smooth.spline()).
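As a rough illustration of how arguments in ... might reach stats::smooth.spline(), the base R sketch below forwards them unchanged and evaluates the fitted spline's derivative at the last time point. The helper spline_deriv_at_end and the simulated data are hypothetical, and this is not the package's own code; df and spar are genuine smooth.spline() arguments.

```r
# Hypothetical helper (not part of covidcast): fit a cubic smoothing spline to
# a window of data and evaluate its derivative at the final time point.
# Extra arguments in ... go straight to stats::smooth.spline().
spline_deriv_at_end <- function(t, y, deriv = 1, ...) {
  fit <- stats::smooth.spline(t, y, ...)
  stats::predict(fit, x = t[length(t)], deriv = deriv)$y
}

# Simulated window of 28 daily values: a smooth curve plus noise.
set.seed(1)
t <- 1:28
y <- sin(t / 5) + rnorm(length(t), sd = 0.05)

# Two ways to control smoothness, both forwarded through ...
d_df <- spline_deriv_at_end(t, y, df = 6)
d_spar <- spline_deriv_at_end(t, y, spar = 0.8)
```

The two calls generally give different estimates, since df and spar set the amount of smoothing in different ways.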
The third case (trend filtering) works a little differently. Here, a custom
set of arguments is allowed, and these are internally distributed as appropriate
to genlasso::trendfilter(), genlasso::cv.trendfilter(), and
genlasso::coef.genlasso():
ord
Order of piecewise polynomial for the trend filtering fit. Default is 2.
maxsteps
Maximum number of steps to take in the solution path before terminating. Default is 100.
cv
Boolean indicating whether cross-validation should be used to choose an effective degrees of freedom for the fit. Default is FALSE.
k
Number of folds if cross-validation is to be used. Default is 5.
df
Desired effective degrees of freedom for the trend filtering fit. If cv = FALSE, then df must be an integer; if cv = TRUE, then df should be one of "min" or "1se" indicating the selection rule to use based on the cross-validation error curve (minimum or 1-standard-error rule, respectively). Default is 8 when cv = FALSE, and "1se" when cv = TRUE.
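The interaction between cv and df can be summarised in a few lines of base R. The helper resolve_tf_args below is hypothetical (it belongs to neither covidcast nor genlasso) and only mirrors the defaults and constraints listed above; it does not fit anything.

```r
# Hypothetical helper: fill in the documented defaults for the trend filtering
# arguments and check that df is consistent with cv.
resolve_tf_args <- function(ord = 2, maxsteps = 100, cv = FALSE, k = 5, df = NULL) {
  # Documented defaults: 8 without cross-validation, "1se" with it.
  if (is.null(df)) df <- if (cv) "1se" else 8
  if (cv) {
    # With cross-validation, df names the selection rule.
    df <- match.arg(df, c("min", "1se"))
  } else {
    # Without cross-validation, df must be a single integer value.
    stopifnot(is.numeric(df), length(df) == 1, df == round(df))
  }
  list(ord = ord, maxsteps = maxsteps, cv = cv, k = k, df = df)
}

a <- resolve_tf_args()                        # cv = FALSE, df defaults to 8
b <- resolve_tf_args(cv = TRUE)               # df defaults to "1se"
c_min <- resolve_tf_args(cv = TRUE, df = "min")
```

A non-integer df with cv = FALSE, or a numeric df with cv = TRUE, raises an error in this sketch.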