Obtains data for selected date ranges for all geographic regions of the
United States. Available data sources and signals are documented in the
COVIDcast signal documentation.
Most (but not all) data sources are available at the county level, but the
API can also return data aggregated to metropolitan statistical areas,
hospital referral regions, or states, as desired, by using the geo_type
argument.
String identifying the data source to query. See https://cmu-delphi.github.io/delphi-epidata/api/covidcast_signals.html for a list of available data sources.
String identifying the signal from that source to query. Again, see https://cmu-delphi.github.io/delphi-epidata/api/covidcast_signals.html for a list of available signals.
Query data beginning on this date. Date object, or string in the form
"YYYY-MM-DD". If start_day is NULL, defaults to the first day data is
available for this signal.
Query data up to this date, inclusive. Date object or string in the form
"YYYY-MM-DD". If end_day is NULL, defaults to the most recent day data is
available for this signal.
The geography type for which to request this data, such as "county" or "state". Defaults to "county". See https://cmu-delphi.github.io/delphi-epidata/api/covidcast_geography.html for details on which types are available.
Which geographies to return. The default, "*", fetches all geographies. To fetch specific geographies, specify their IDs as a vector or list of strings. See https://cmu-delphi.github.io/delphi-epidata/api/covidcast_geography.html for details on how to specify these IDs.
Fetch only data that was available on or before this date, provided as a
Date object or string in the form "YYYY-MM-DD". If NULL, the default, return
the most recent available data. Note that only one of as_of, issues, and lag
should be provided; it does not make sense to specify more than one. For more
on data revisions, see "Issue dates and revisions" below.
Fetch only data that was published or updated ("issued") on these dates.
Provided as either a single Date object (or string in the form "YYYY-MM-DD"),
indicating a single date to fetch data issued on, or a vector specifying two
dates, start and end. In the latter case, return all data issued in this
range. There may be multiple rows for each observation, indicating several
updates to its value. If NULL, the default, return the most recently issued
data.
Integer. If, for example, lag = 3, then we fetch only data that was published
or updated exactly 3 days after the date. For example, a row with time_value
of June 3 will only be included in the results if its data was issued or
updated on June 6. If NULL, the default, return the most recently issued data
regardless of its lag.
The temporal resolution at which to request this data. Most signals are available at the "day" resolution (the default); some are only available at the "week" resolution, representing an MMWR week ("epiweek").
covidcast_signal object with matching data. The object is a data frame with
additional metadata attached. Each row is one observation of one signal on
one day in one geographic location. Contains the following columns:
Data source from which this observation was obtained.
Signal from which this observation was obtained.
String identifying the location, such as a state name or county FIPS code.
Date object identifying the date of this observation. For data with
time_type = "week", this is the first day of the corresponding epiweek.
Date object identifying the date this estimate was issued. For example, an
estimate with a time_value of June 3 might have been issued on June 5, after
the data for June 3 was collected and ingested into the API.
Integer giving the difference between issue and time_value, in days.
Signal value being requested. For example, in a query for the
"confirmed_cumulative_num" signal from the "usa-facts" source, this would be
the cumulative number of confirmed cases in the area, as of the given
time_value.
Associated standard error of the signal value, if available.
Integer indicating the sample size available in that geography on that day;
sample size may not be available for all signals, due to privacy or other
constraints, in which case it will be NA.
Consult the signal documentation for more details on how values and standard errors are calculated for specific signals.
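As a small illustration of how the lag column relates to issue and time_value, the following base R sketch builds a toy data frame and recomputes the lag in days. The values are made up for illustration and are not API output.

```r
# Toy observations (made-up values, not real API output).
obs <- data.frame(
  time_value = as.Date(c("2020-06-03", "2020-06-04")),
  issue      = as.Date(c("2020-06-05", "2020-06-05")),
  value      = c(10, 20)
)

# lag is the number of days between the observation date and its issue date.
obs$lag <- as.integer(obs$issue - obs$time_value)
print(obs)
```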
The returned data frame has a metadata attribute containing metadata about
the signal contained within; see "Metadata" below for details.
For data on counties, metropolitan statistical areas, and states, this
package provides the county_census, msa_census, and state_census datasets.
These include each area's unique identifier, used in the geo_values argument
to select specific areas, and basic information on population and other
Census data.
Downloading large amounts of data may be slow, so this function prints
messages for each chunk of data it downloads. To suppress these, use
base::suppressMessages(), as in
suppressMessages(covidcast_signal("fb-survey", ...)).
The returned object has a metadata attribute attached containing basic
information about the signal. Use attributes(x)$metadata to access this
metadata. The metadata is stored as a data frame of one row, and contains the
same information that covidcast_meta() would return for a given signal.
Note that not all covidcast_signal objects may have all fields of metadata
attached; for example, an object created with as.covidcast_signal() using
data from another source may only contain the geo_type variable, along with
data_source and signal. Before using the metadata of a covidcast_signal
object, always check for the presence of the attributes you need.
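One way to follow this advice is a small accessor that returns a default when a metadata field is missing. The sketch below uses a plain data frame with a hand-built metadata attribute as a stand-in for a covidcast_signal object; the helper name meta_field() is hypothetical and not part of the package, and max_time is used only as an example of a field that may be absent.

```r
# Stand-in for a covidcast_signal object: a data frame with a one-row
# "metadata" attribute holding only the minimal fields.
x <- data.frame(geo_value = c("pa", "nj"), value = c(1.2, 0.8),
                stringsAsFactors = FALSE)
attr(x, "metadata") <- data.frame(
  data_source = "fb-survey",
  signal      = "smoothed_cli",
  geo_type    = "state",
  stringsAsFactors = FALSE
)

# Hypothetical helper: return one metadata field, or `default` if the
# metadata attribute or the requested field is absent.
meta_field <- function(x, field, default = NA) {
  meta <- attributes(x)$metadata
  if (is.null(meta) || !(field %in% names(meta))) {
    return(default)
  }
  meta[[field]]
}

meta_field(x, "geo_type")  # present in this toy object
meta_field(x, "max_time")  # absent here, so the default is returned
```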
The COVIDcast API tracks updates and changes to its underlying data, and
records the first date each observation became available. For example, a data
source may report its estimate for a specific state on June 3rd on June 5th,
once records become available. This data is considered "issued" on June 5th.
Later, the data source may update its estimate for June 3rd based on revised
data, creating a new issue on June 8th. By default, covidcast_signal()
returns the most recent issue available for every observation. The as_of,
issues, and lag parameters allow the user to select specific issues instead,
or to see all updates to observations. These options are mutually exclusive,
and you should only specify one; if you specify more than one, you may get an
error or confusing results.
Note that the API only tracks the initial value of an estimate and changes
to that value. If a value was first issued on June 5th and never updated,
asking for data issued on June 6th (using issues or lag) would not return
that value, though asking for data as_of June 6th would. See
vignette("covidcast") for examples.
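The difference between the three options can be sketched offline with a toy issue history. The code below is an illustration of the semantics described above in base R, not the API's implementation; the data values and the helper names as_of_filter(), issues_filter(), and lag_filter() are made up for this example.

```r
# Toy issue history for one location (made-up values, not API output).
# June 3rd is first issued on June 5th and revised on June 8th;
# June 4th is issued once, on June 5th.
history <- data.frame(
  time_value = as.Date(c("2020-06-03", "2020-06-03", "2020-06-04")),
  issue      = as.Date(c("2020-06-05", "2020-06-08", "2020-06-05")),
  value      = c(10, 12, 20)
)

# as_of: for each observation, the latest issue on or before the date.
as_of_filter <- function(df, as_of) {
  df <- df[df$issue <= as.Date(as_of), , drop = FALSE]
  df <- df[order(df$time_value, df$issue), , drop = FALSE]
  df[!duplicated(df$time_value, fromLast = TRUE), , drop = FALSE]
}

# issues: only rows issued within [start, end].
issues_filter <- function(df, start, end = start) {
  df[df$issue >= as.Date(start) & df$issue <= as.Date(end), , drop = FALSE]
}

# lag: only rows issued exactly `lag` days after their time_value.
lag_filter <- function(df, lag) {
  df[as.integer(df$issue - df$time_value) == lag, , drop = FALSE]
}

as_of_filter(history, "2020-06-06")   # both observations, as first issued
issues_filter(history, "2020-06-06")  # no rows: nothing was issued that day
lag_filter(history, 2)                # only the June 3rd value issued June 5th

# The corresponding live queries would pass the same values as arguments,
# for example:
# covidcast_signal("fb-survey", "smoothed_cli", as_of = "2020-06-06")
# covidcast_signal("fb-survey", "smoothed_cli", issues = "2020-06-06")
# covidcast_signal("fb-survey", "smoothed_cli", lag = 2)
```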
Note also that the API enforces a maximum result row limit; results beyond
the maximum limit are truncated. This limit is sufficient to fetch
observations in all counties in the United States on one day. This client
automatically splits queries for multiple days across multiple API calls.
However, if data for one day has been issued many times, using the issues
argument may return more results than the query limit. A warning will be
issued in this case. To see all results, split your query across multiple
calls with different issues arguments.
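A minimal sketch of such a split, assuming a fixed chunk length is acceptable: the hypothetical helper split_issue_range() below (not part of the package) cuts an issue date range into consecutive start and end pairs, each of which can be passed as the issues argument of a separate call. The 7-day chunk size is an arbitrary choice, not a documented limit.

```r
# Hypothetical helper: split [start, end] into consecutive chunks of at most
# `chunk_days` days, each returned as a length-2 Date vector c(start, end).
split_issue_range <- function(start, end, chunk_days = 7) {
  start <- as.Date(start)
  end <- as.Date(end)
  starts <- seq(start, end, by = chunk_days)
  ends <- pmin(starts + chunk_days - 1, end)
  lapply(seq_along(starts), function(i) c(starts[i], ends[i]))
}

chunks <- split_issue_range("2020-06-01", "2020-06-20")

# Each chunk could then be used in its own query, for example:
# results <- lapply(chunks, function(rng) {
#   covidcast_signal("fb-survey", "smoothed_cli", geo_type = "state",
#                    issues = rng)
# })
```

If a single chunk still triggers the truncation warning, a smaller chunk_days value would be the natural next step.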
By default, covidcast_signal() submits queries to the API anonymously. All
the examples in the package documentation are compatible with anonymous use
of the API, but there are some limits on anonymous queries, including a rate
limit. If you regularly query large amounts of data, please consider
registering for a free API key, which lifts these limits. Even if your usage
falls within the anonymous usage limits, registration helps us understand who
is using the Delphi Epidata API and how, which may in turn inform future
research, data partnerships, and funding.
If you have an API key, you can use it by setting the covidcast.auth option
once before calling covidcast_signal() or covidcast_signals().
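Setting the option might look like the following sketch. The key string is a placeholder, not a real key; substitute the key you received on registration.

```r
# Set the API key once per session, before any queries are made.
# "your_api_key" is a placeholder value.
options(covidcast.auth = "your_api_key")

# Confirm that the option is set.
getOption("covidcast.auth")
```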
COVIDcast API documentation: https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html
Documentation of all COVIDcast sources and signals: https://cmu-delphi.github.io/delphi-epidata/api/covidcast_signals.html
COVIDcast public dashboard: https://delphi.cmu.edu/covidcast/
if (FALSE) {
  ## Fetch all counties from 2020-05-10 to the most recent available data
  covidcast_signal("fb-survey", "smoothed_cli", start_day = "2020-05-10")

  ## Fetch all counties on just 2020-05-10 and no other days
  covidcast_signal("fb-survey", "smoothed_cli", start_day = "2020-05-10",
                   end_day = "2020-05-10")

  ## Fetch all states on 2020-05-10, 2020-05-11, 2020-05-12
  covidcast_signal("fb-survey", "smoothed_cli", start_day = "2020-05-10",
                   end_day = "2020-05-12", geo_type = "state")

  ## Fetch all available data for just Pennsylvania and New Jersey
  covidcast_signal("fb-survey", "smoothed_cli", geo_type = "state",
                   geo_values = c("pa", "nj"))

  ## Fetch all available data in the Pittsburgh metropolitan area
  covidcast_signal("fb-survey", "smoothed_cli", geo_type = "msa",
                   geo_values = name_to_cbsa("Pittsburgh"))
}