Computes correlations between two covidcast_signal data frames, allowing
for slicing by geo location, or by time. (Only the latest issue from each
data frame is used for correlations.) See the correlations vignette
for examples: vignette("correlation-utils", package = "covidcast").
x, y: The covidcast_signal data frames to correlate.
dt_x, dt_y: Time shifts (in days) to apply to x and y,
respectively, before computing correlations. Default is 0. Negative shifts
translate into a lag and positive shifts into a lead; for
example, setting dt_y = 2 shifts the values of y earlier
(leading) by 2 days before correlation, so values of x are correlated
with values of y from two days later.
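The shift semantics can be illustrated with a small base R sketch. This is not the package's implementation: shift_signal() is a hypothetical helper and the toy data are made up. It only shows which dates end up paired when y leads by 2 days, and that shifting x by -2 days yields the same pairs.

```r
# Toy daily series for a single location (made-up numbers).
dates <- as.Date("2020-06-01") + 0:5
x <- data.frame(time_value = dates, value = c(1, 2, 3, 4, 5, 6))
y <- data.frame(time_value = dates, value = c(10, 20, 30, 40, 50, 60))

# Hypothetical helper (an assumption, not part of covidcast): a shift of
# dt days relabels the value observed on date t + dt as date t, i.e. the
# series is moved dt days earlier. A negative dt moves it later.
shift_signal <- function(df, dt) {
  df$time_value <- df$time_value - dt
  df
}

# Analogue of dt_y = 2: x on date t is paired with y from date t + 2.
paired <- merge(x, shift_signal(y, 2), by = "time_value",
                suffixes = c("_x", "_y"))

# Analogue of dt_x = -2: the same (x, y) pairs, labelled by y's date.
paired_alt <- merge(shift_signal(x, -2), y, by = "time_value",
                    suffixes = c("_x", "_y"))

print(paired)
print(cor(paired$value_x, paired$value_y))
```

Only dates present in both series after shifting survive the merge, so a shift of 2 days drops 2 of the 6 toy observations.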
by: If "geo_value", then correlations are computed for each geo location, over all time.
Each correlation is measured between two time series at the same location.
If "time_value", then correlations are computed for each time, over all geo locations.
Each correlation is measured between all locations at one time.
Default is "geo_value".
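The difference between the two groupings can be sketched in base R. The helper cor_by() and the toy data below are assumptions for illustration only, not the package's code; the sketch splits an already-joined table by the grouping column and computes one correlation per group.

```r
# Toy joined data: two locations observed on three dates (made-up numbers).
df <- data.frame(
  geo_value  = rep(c("a", "b"), each = 3),
  time_value = rep(as.Date("2020-06-01") + 0:2, times = 2),
  x = c(1, 2, 3, 2, 1, 5),
  y = c(2, 4, 6, 4, 5, 1)
)

# Hypothetical helper: one correlation per level of the `by` column.
cor_by <- function(df, by) {
  groups <- split(df, df[[by]])
  vals <- vapply(groups, function(g) cor(g$x, g$y), numeric(1))
  out <- data.frame(names(vals), unname(vals))
  names(out) <- c(by, "value")
  out
}

# One row per location: each correlation is between two time series.
by_geo <- cor_by(df, "geo_value")

# One row per date: each correlation is across locations on that date.
by_time <- cor_by(df, "time_value")

print(by_geo)
print(by_time)
```

In both cases the result has the grouping column first and a value column second, mirroring the shape of the return value described for this function.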
use, method: Arguments to pass to cor(). The default for use is
"na.or.complete" (different from cor()), and the default for method is
"pearson" (same as cor()).
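To see what these defaults change, the following sketch calls base R's cor() directly on made-up vectors containing missing values. It uses only stats::cor() and does not call covidcast.

```r
# Made-up vectors with one missing value each, in different positions.
x <- c(1, 2, 3, NA, 5)
y <- c(2, 4, 7, 8, NA)

# cor()'s own default (use = "everything") propagates the missing values.
r_default <- cor(x, y)

# With use = "na.or.complete", incomplete pairs are dropped first, so only
# the first three pairs contribute here.
r_complete <- cor(x, y, use = "na.or.complete")

# method can be changed as usual, for example to a rank correlation.
r_spearman <- cor(x, y, use = "na.or.complete", method = "spearman")

print(c(default = r_default, complete = r_complete, spearman = r_spearman))
```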
A data frame with first column geo_value or time_value (matching
by), and second column value, which gives the correlation.
if (FALSE) {
# For all these examples, let x and y be two signals measured at the county
# level over several months.
## `by = "geo_value"`
# Correlate each county's time series together, returning one correlation per
# county:
covidcast_cor(x, y, by = "geo_value")
# Correlate x in each county with values of y 14 days later
covidcast_cor(x, y, dt_y = 14, by = "geo_value")
# Equivalently, x can be shifted -14 days:
covidcast_cor(x, y, dt_x = -14, by = "geo_value")
## `by = "time_value"`
# For each date, correlate x's values in every county against y's values in
# the same counties. Returns one correlation per date:
covidcast_cor(x, y, by = "time_value")
# Correlate x values across counties against y values 7 days later
covidcast_cor(x, y, dt_y = 7, by = "time_value")
}