This data set of confirmed COVID-19 cases and deaths is based on reports made available by the Center for Systems Science and Engineering at Johns Hopkins University. It is a snapshot as of March 20, 2024, and covers March 1, 2020 through December 31, 2021. It is limited to six states: California, Florida, Texas, New York, Georgia, and Pennsylvania.

It is used in the epiprocess growth rate and epi_slide vignettes.

Usage

cases_deaths_subset

Format

An object of class epi_df (inherits from tbl_df, tbl, data.frame) with 4026 rows and 6 columns.

Source

This object contains a modified part of the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University as republished in the COVIDcast Epidata API. This data set is licensed under the terms of the Creative Commons Attribution 4.0 International license by the Johns Hopkins University on behalf of its Center for Systems Science in Engineering. Copyright Johns Hopkins University 2020.

Modifications:

  • From the COVIDcast Epidata API: The case signal is taken directly from the JHU CSSE COVID-19 GitHub repository. The rate signals were computed by Delphi using Census population data. The 7-day average signals were computed by Delphi by calculating moving averages of the preceding 7 days, so the signal for June 7 is the average of the underlying data for June 1 through 7, inclusive.

  • Furthermore, the data has been limited to a very small number of rows, the signal names have been slightly altered, and the result has been formatted as an epi_df.
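
The trailing 7-day average described above can be sketched in base R. This is only an illustration, not Delphi's actual pipeline: the input is the ten `cases` values printed for California in the Examples section, and `stats::filter()` with `sides = 1` is an assumed stand-in for whatever implementation Delphi used.

```r
# Daily `cases` for California, 2020-03-01 through 2020-03-10, copied from
# the printed rows in the Examples section.
cases <- c(6, 4, 6, 11, 10, 18, 26, 19, 23, 22)
time_value <- seq(as.Date("2020-03-01"), by = "day", length.out = length(cases))

# Trailing 7-day mean: with sides = 1, each output uses the current day and
# the six days before it. The first six entries are NA here only because this
# snippet has no data before March 1.
trailing_7d <- as.numeric(stats::filter(cases, rep(1 / 7, 7), sides = 1))

data.frame(time_value, cases, trailing_7d = round(trailing_7d, 1))
```

The entries for March 7 through March 10 round to 11.6, 13.4, 16.1, and 18.4, which matches the printed `cases_7d_av` column for those dates.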

Data dictionary

The data has columns:

geo_value

the geographic value associated with each row of measurements; here, a state, given as a lowercase abbreviation (for example, ca).

time_value

the time value associated with each row of measurements; here, a date at daily resolution.

case_rate_7d_av

7-day average signal of number of new confirmed COVID-19 cases per 100,000 population, daily

death_rate_7d_av

7-day average signal of number of new confirmed deaths due to COVID-19 per 100,000 population, daily

cases

Number of new confirmed COVID-19 cases, daily

cases_7d_av

7-day average signal of number of new confirmed COVID-19 cases, daily
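
As a rough consistency check between the count and rate columns, the sketch below recovers the population implied by the first three California rows printed in the Examples section. It assumes the rate is simply the count divided by population and multiplied by 100,000; this page does not give the exact formula or the Census figures Delphi used.

```r
# Values copied, as displayed (three significant digits), from the first three
# printed rows for California in the Examples section.
cases_7d_av <- c(1.29, 1.71, 2.43)
case_rate_7d_av <- c(0.00327, 0.00435, 0.00617)

# Assumption: rate = count / population * 100000, so the implied population
# is count / rate * 100000.
implied_population <- cases_7d_av / case_rate_7d_av * 1e5

# Express in millions for readability.
round(implied_population / 1e6, 1)
```

All three rows imply a population of a little over 39 million. The small spread between them comes from the rounding of the displayed values.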

Examples

# Because this dataset is re-exported, it cannot be loaded with the
# `data()` function: `data()` looks for a file of the same name in the
# package's `data/` directory, and this package has no such file.
# works: access the object through the package namespace
epiprocess::cases_deaths_subset
#> An `epi_df` object, 4,026 x 6 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-03-20
#> 
#> # A tibble: 4,026 × 6
#>    geo_value time_value case_rate_7d_av death_rate_7d_av cases cases_7d_av
#>  * <chr>     <date>               <dbl>            <dbl> <dbl>       <dbl>
#>  1 ca        2020-03-01         0.00327         0            6        1.29
#>  2 ca        2020-03-02         0.00435         0            4        1.71
#>  3 ca        2020-03-03         0.00617         0            6        2.43
#>  4 ca        2020-03-04         0.00980         0.000363    11        3.86
#>  5 ca        2020-03-05         0.0134          0.000363    10        5.29
#>  6 ca        2020-03-06         0.0200          0.000363    18        7.86
#>  7 ca        2020-03-07         0.0294          0.000363    26       11.6 
#>  8 ca        2020-03-08         0.0341          0.000363    19       13.4 
#>  9 ca        2020-03-09         0.0410          0.000726    23       16.1 
#> 10 ca        2020-03-10         0.0468          0.000726    22       18.4 
#> # ℹ 4,016 more rows

# works
library(epiprocess)
cases_deaths_subset
#> An `epi_df` object, 4,026 x 6 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-03-20
#> 
#> # A tibble: 4,026 × 6
#>    geo_value time_value case_rate_7d_av death_rate_7d_av cases cases_7d_av
#>  * <chr>     <date>               <dbl>            <dbl> <dbl>       <dbl>
#>  1 ca        2020-03-01         0.00327         0            6        1.29
#>  2 ca        2020-03-02         0.00435         0            4        1.71
#>  3 ca        2020-03-03         0.00617         0            6        2.43
#>  4 ca        2020-03-04         0.00980         0.000363    11        3.86
#>  5 ca        2020-03-05         0.0134          0.000363    10        5.29
#>  6 ca        2020-03-06         0.0200          0.000363    18        7.86
#>  7 ca        2020-03-07         0.0294          0.000363    26       11.6 
#>  8 ca        2020-03-08         0.0341          0.000363    19       13.4 
#>  9 ca        2020-03-09         0.0410          0.000726    23       16.1 
#> 10 ca        2020-03-10         0.0468          0.000726    22       18.4 
#> # ℹ 4,016 more rows

# fails: `data()` cannot find the re-exported dataset
if (FALSE) { # \dontrun{
data(cases_deaths_subset, package = "epiprocess")
} # }
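
A quick sanity check of the loaded object against the description on this page might look like the following. It is a sketch that assumes epiprocess is installed; the `requireNamespace()` guard turns it into a no-op otherwise, and the variable name `x` is arbitrary.

```r
# Guarded so the snippet does nothing when epiprocess is not installed.
if (requireNamespace("epiprocess", quietly = TRUE)) {
  x <- epiprocess::cases_deaths_subset

  # Rows per state; this page lists six states.
  print(table(x$geo_value))

  # First and last dates covered.
  print(range(x$time_value))

  # Total row count stated in the Format section.
  stopifnot(nrow(x) == 4026)
}
```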