Plotting Reference

Choropleth maps

This package provides a plotting function that takes a state, county, HRR, or MSA signal and generates a choropleth map, using matplotlib underneath. See the usage examples for detailed walkthroughs.

covidcast.plot(data, time_value=None, plot_type='choropleth', combine_megacounties=True, **kwargs)

Given the output data frame of covidcast.signal(), plot a choropleth or bubble map.

Projections used for plotting:

  • ESRI:102003 (USA Contiguous Albers Equal Area Conic) for the contiguous US and Puerto Rico

  • ESRI:102006 (Alaska Albers Equal Area Conic) for Alaska

  • ESRI:102007 (Hawaii Albers Equal Area Conic) for Hawaii

For visual purposes, Alaska and Hawaii are moved to the lower left corner of the contiguous US, and Puerto Rico is moved closer to Florida.

By default, choropleths use the colormap YlOrRd, with colors scaled between 0 and the signal’s historical mean value + 3 standard deviations. Custom arguments can be passed in as kwargs; they are forwarded to the GeoPandas plot method, and the GeoPandas documentation describes them in more detail.
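As a rough illustration of these defaults, the sketch below builds the keyword arguments described above and lets caller-supplied values override them. The helper name choropleth_kwargs is hypothetical and is not part of covidcast; cmap, vmin, and vmax are standard GeoDataFrame.plot() arguments, and the mean and standard deviation are assumed to be supplied by the caller rather than looked up from signal metadata.

```python
def choropleth_kwargs(mean, std, **overrides):
    """Return the plotting defaults described in the text, with overrides applied.

    Hypothetical helper for illustration only; not covidcast's internal code.
    """
    kwargs = {
        "cmap": "YlOrRd",        # default colormap named in the text
        "vmin": 0,               # lower end of the color scale
        "vmax": mean + 3 * std,  # historical mean + 3 standard deviations
    }
    kwargs.update(overrides)     # caller-supplied arguments take precedence
    return kwargs


defaults = choropleth_kwargs(mean=10.0, std=2.0)
custom = choropleth_kwargs(mean=10.0, std=2.0, cmap="viridis", vmax=25)
print(defaults)
print(custom)
```

A dictionary like custom could then be unpacked into the plotting call, for example covidcast.plot(data, **custom).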

Bubble maps use a purple bubble by default, with all values discretized into 8 bins between 0.1 and the signal’s historical mean value + 3 standard deviations. Values below 0 have no bubble but have the region displayed in white, and values above the mean + 3 standard deviations are binned into the highest bubble. Bubbles are scaled by area.
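The binning rule above can be sketched in plain Python. This is one possible reading, not the package's implementation: it assumes the 8 bins are marked by 8 evenly spaced edges from 0.1 up to mean + 3 standard deviations, and it gives no bubble to any value below the lowest edge (the text only states this for values below 0). The function names are hypothetical.

```python
from bisect import bisect_right


def bubble_bin_edges(mean, std, lower=0.1, n_bins=8):
    """Evenly spaced bin edges from `lower` to mean + 3 * std (assumed layout)."""
    upper = mean + 3 * std
    step = (upper - lower) / (n_bins - 1)
    return [lower + i * step for i in range(n_bins)]


def bubble_bin(value, edges):
    """Return the bin index for `value`, or None when no bubble is drawn."""
    if value < edges[0]:
        return None  # assumption: anything below the lowest edge gets no bubble
    # Values beyond the last edge fall into the highest bin.
    return min(bisect_right(edges, value) - 1, len(edges) - 1)


edges = bubble_bin_edges(mean=10.0, std=2.0)
for value in (-1.0, 0.1, 5.0, 100.0):
    print(value, bubble_bin(value, edges))
```

The sketch covers only the discretization step; scaling bubble size by area is not shown.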

Parameters:
  • data (DataFrame) – Data frame of signal values, as returned from covidcast.signal().

  • time_value (date) – If multiple days of data are present in data, map only values from this day. Defaults to plotting the most recent day of data in data.

  • combine_megacounties (bool) – For each state, display all counties without a signal value as a single polygon with the megacounty value, as opposed to plotting all the county boundaries. Defaults to True.

  • kwargs (Any) – Optional keyword arguments passed to GeoDataFrame.plot().

  • plot_type (str) – Type of plot to create. Either choropleth (default) or bubble.

Return type:

Figure

Returns:

Matplotlib figure object.
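A minimal usage sketch for the function documented above, assuming the covidcast package is installed and that data came from covidcast.signal(). The wrapper name plot_signal_for_day is hypothetical; it only forwards the documented arguments. The commented usage lines assume the "fb-survey" / "smoothed_cli" signal is available and that network access is possible.

```python
from datetime import date


def plot_signal_for_day(data, day, plot_type="choropleth", **plot_kwargs):
    """Plot one day of a signal data frame and return the Matplotlib figure.

    Hypothetical convenience wrapper around covidcast.plot().
    """
    import covidcast  # imported lazily so the sketch loads without the package

    return covidcast.plot(
        data,
        time_value=day,         # map only values from this day
        plot_type=plot_type,    # "choropleth" is the documented default
        **plot_kwargs,          # forwarded to GeoDataFrame.plot()
    )


# Example usage (requires network access to fetch the signal):
# import covidcast
# data = covidcast.signal("fb-survey", "smoothed_cli",
#                         date(2020, 8, 3), date(2020, 8, 4),
#                         geo_type="county")
# fig = plot_signal_for_day(data, date(2020, 8, 4), cmap="viridis")
# fig.savefig("smoothed_cli.png")
```

Because the return value is a Matplotlib figure, it can be saved or adjusted further with the usual Matplotlib methods.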

Animate a signal over time

A signal DataFrame can be used to generate an animated choropleth of the signal values over time.

covidcast.animate(data, filepath, fps=3, dpi=150, **kwargs)

Generate an animated video file of a signal over time.

Given a signal DataFrame, generates the choropleth for each day to form an animation of the signal. Accepts arguments for video parameters as well as optional plotting arguments. Supported output formats are listed in the imageio ffmpeg documentation.

Parameters:
  • data (DataFrame) – DataFrame for a single signal over time.

  • filepath (str) – Path where the video will be saved. The filename must include a supported extension.

  • fps (int) – Frame rate in frames per second for animation. Defaults to 3.

  • dpi (int) – Dots per inch for output video. Defaults to 150 on a 12.8x9.6 figure (1920x1440).

  • kwargs (Any) – Optional keyword arguments passed to covidcast.plot().

Return type:

None

Returns:

None
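The sketch below shows one way the animation function might be called, assuming covidcast is installed. The names save_animation and frame_size_pixels are hypothetical. The extension check only verifies that some extension is present; it does not know which formats ffmpeg supports. The second helper reproduces the arithmetic behind the documented default of 150 dpi on a 12.8x9.6 figure.

```python
from pathlib import Path


def frame_size_pixels(dpi=150, width_in=12.8, height_in=9.6):
    """Pixel dimensions of a figure of the documented size at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def save_animation(data, filepath, fps=3, dpi=150, **plot_kwargs):
    """Write an animated choropleth to `filepath` and return the path.

    Hypothetical wrapper around covidcast.animate().
    """
    path = Path(filepath)
    if not path.suffix:
        # covidcast.animate() needs an extension (for example .mp4) to pick a format
        raise ValueError("filepath must include a file extension")

    import covidcast  # imported lazily so the sketch loads without the package

    covidcast.animate(data, str(path), fps=fps, dpi=dpi, **plot_kwargs)
    return path


print(frame_size_pixels(150))
```

Raising dpi increases the output resolution proportionally, while fps controls how long each day stays on screen.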

Creating a GeoDataFrame

For users who want more control over their plotting, the package also provides a function that generates a GeoPandas GeoDataFrame with signal information appended.

covidcast.get_geo_df(data, geo_value_col='geo_value', geo_type_col='geo_type', join_type='right', combine_megacounties=False)

Augment a covidcast.signal() data frame with the shape of each geography.

This method takes in a pandas DataFrame object and returns a GeoDataFrame object from the GeoPandas package. The GeoDataFrame will contain the geographic shape corresponding to every row in its geometry column; for example, a data frame of county-level signal observations will be returned with the shape of every county.

After detecting the geography type (state, county, HRR, and MSA are currently supported) of the input, this function builds a GeoDataFrame that contains state and geometry information from the Census or CMS for that geography type. By default, it will take the signal data (left side) and geo data (right side) and right join them, so all states/counties will always be present regardless of whether data contains values for those locations. The join_type argument also accepts left, outer, and inner to select those joins instead.
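The join semantics described above can be illustrated without any geospatial libraries. The sketch below is a simplified stand-in, not the package's code: it represents the signal data and the geo data as plain dictionaries keyed by geo_value, and the function name, FIPS codes, and shape labels are made up for illustration.

```python
def join_signal_to_geo(signal, geo, join_type="right"):
    """Join signal values (left side) to shapes (right side) by geo_value.

    Returns a list of (geo_value, value, shape) tuples; missing entries are None.
    """
    if join_type == "right":
        keys = list(geo)                       # every geography is kept
    elif join_type == "left":
        keys = list(signal)                    # every signal row is kept
    elif join_type == "inner":
        keys = [k for k in signal if k in geo]
    elif join_type == "outer":
        keys = list(signal) + [k for k in geo if k not in signal]
    else:
        raise ValueError("join_type must be right, left, outer, or inner")
    return [(k, signal.get(k), geo.get(k)) for k in keys]


signal = {"42003": 1.5, "99999": 2.0}          # made-up values
geo = {"42003": "shape A", "42005": "shape B"}  # made-up shapes

for how in ("right", "left", "inner", "outer"):
    print(how, join_signal_to_geo(signal, geo, how))
```

With the default right join, the geography "42005" still appears even though it has no signal value, which is why maps built this way show every region.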

If combine_megacounties=False (default), all counties without a signal value will be given the value of the megacounty if present. If combine_megacounties=True, a left join will be conducted and the megacounty rows will be given a polygon of the union of all constituent counties without a value. Other joins will not use megacounties. See the geographic coding documentation for information about megacounties.

By default, this function identifies the geography for each row of the input data frame using its geo_value column, matching data frames returned by covidcast.signal(), but the geo_value_col and geo_type_col arguments can be provided to match geographies for data frames with different column names.

Geographic data is sourced from 1:5,000,000-scale shapefiles from the 2019 US Census Cartographic Boundary Files and the CMS Data Website.

Parameters:
  • data (DataFrame) – DataFrame of values and geographies.

  • geo_value_col (str) – Name of column containing the geography identifiers used to match each row to its shape.

  • geo_type_col (str) – Name of column containing geography type.

  • join_type (str) – Type of join to do between input data (left side) and geo data (right side). Must be one of right (default), left, outer, or inner.

  • combine_megacounties (bool) – For each state, return all counties without a signal value as a single row and polygon with the megacounty value. Defaults to False.

Return type:

GeoDataFrame

Returns:

GeoDataFrame containing all columns from the input data, along with a geometry column (containing a polygon) and a state_fips column (a two-digit FIPS code identifying the US state containing this geography). For MSAs that span multiple states, the first state in the MSA name is provided. The geometry is given in the GCS NAD83 coordinate system for states, counties, and MSAs, and WGS84 for HRRs.
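As a hedged sketch of the "more control" use case, the wrapper below attaches shapes to a signal data frame and then plots it directly with GeoPandas. It assumes covidcast is installed and that the input has a value column, as data frames from covidcast.signal() do. The name plot_with_geometry is hypothetical, and no reprojection is applied, so shapes are drawn in the coordinate system described above.

```python
def plot_with_geometry(data, column="value", join_type="inner", **plot_kwargs):
    """Attach shapes to `data` and plot the chosen column with GeoPandas.

    Hypothetical wrapper around covidcast.get_geo_df(); returns the GeoDataFrame
    and whatever GeoDataFrame.plot() returns (a Matplotlib Axes).
    """
    import covidcast  # imported lazily so the sketch loads without the package

    geo_df = covidcast.get_geo_df(data, join_type=join_type)
    axes = geo_df.plot(column=column, **plot_kwargs)
    return geo_df, axes


# Example usage, given a data frame from covidcast.signal():
# geo_df, axes = plot_with_geometry(data, cmap="YlOrRd", legend=True)
# print(geo_df[["geo_value", "state_fips", "value"]].head())
```

An inner join is used here so that only geographies with a signal value are drawn; switching to the default right join would keep every geography.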