bucky.viz.plot

Creates line plots with confidence intervals at the ADM0, ADM1, or ADM2 level.

Module Contents

Functions

add_col_data_to_plot(axis, col, sim_data, quantiles_list)

Adds simulation data for requested column to initialized matplotlib axis object.

add_hist_data_to_plot(axis, historical_data, adm_key, adm_value, col, sim_max_date)

Add historical data for requested column and area to initialized matplotlib axis object.

get_plot_title(l_table, adm_key, adm_value)

Determine plot title for a given area based on adm code and lookup table.

main(args=None)

Main entrypoint.

make_plots(adm_levels, input_directory, output_directory, lookup_df, plot_hist, plot_columns, quantiles, window_size, end_date, hist_file, min_hist_points, admin1=None, hist_start=None)

Wrapper function around plot. Creates plots, aggregating data if necessary.

plot(out_dir, lookup_df, key, sim_data, hist_data, plot_columns, quantiles)

Given a dataframe and a key, creates plots with requested columns.

preprocess_historical_dates(historical_data, hist_start_date, plot_start, plot_end, min_data_points)

Check that historical data has the correct date range and requested number of data points.

bucky.viz.plot.default_plot_cols = ['daily_reported_cases', 'daily_deaths']
bucky.viz.plot.parser
bucky.viz.plot.add_col_data_to_plot(axis, col, sim_data, quantiles_list)

Adds simulation data for requested column to initialized matplotlib axis object.

Parameters
  • axis (matplotlib.axes.Axes) – Previously initialized axis object

  • col (str) – Column to add to plot

  • sim_data (pandas.DataFrame) – Area simulation data

  • quantiles_list (list of float, or None) – List of quantiles to plot. If None, will plot all available quantiles in data.

Returns

axis – Modified axis object with added data

Return type

matplotlib.axes.Axes
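
As an illustration only, the behavior described above might be sketched as follows. The layout of sim_data (a date column, a quantile column, and one column per variable), the pairing of lower and upper quantiles into shaded bands, and the helper name sketch_add_col are all assumptions made for this example and are not taken from the bucky source.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def sketch_add_col(axis, col, sim_data, quantiles_list=None):
    """Hypothetical stand-in: median line plus shaded bands between paired quantiles."""
    # Assumed layout: one row per (date, quantile) pair, one column per variable
    wide = sim_data.pivot(index="date", columns="quantile", values=col)
    if quantiles_list is None:
        quantiles_list = list(wide.columns)  # plot everything available
    for q in sorted(q for q in quantiles_list if q < 0.5):
        upper = round(1.0 - q, 10)  # matching upper quantile, e.g. 0.25 -> 0.75
        if q in wide.columns and upper in wide.columns:
            axis.fill_between(wide.index, wide[q], wide[upper], alpha=0.2, color="C0")
    if 0.5 in wide.columns:
        axis.plot(wide.index, wide[0.5], color="C0", label=col)
    return axis


# Made-up data: three dates, three quantiles each
dates = pd.date_range("2021-01-01", periods=3)
rows = [
    {"date": d, "quantile": q, "daily_deaths": base * scale}
    for d, base in zip(dates, [10.0, 12.0, 14.0])
    for q, scale in [(0.25, 0.8), (0.5, 1.0), (0.75, 1.2)]
]
sim_data = pd.DataFrame(rows)

fig, ax = plt.subplots()
ax = sketch_add_col(ax, "daily_deaths", sim_data, quantiles_list=[0.25, 0.5, 0.75])
```

In this sketch the axis is returned after being modified in place, which mirrors the documented return value and lets several columns be layered onto the same axis.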

bucky.viz.plot.add_hist_data_to_plot(axis, historical_data, adm_key, adm_value, col, sim_max_date)

Add historical data for requested column and area to initialized matplotlib axis object.

Parameters
  • axis (matplotlib.axes.Axes) – Previously initialized axis object

  • historical_data (pandas.DataFrame) – Dataframe of historical data

  • adm_key (str) – Admin key to use to relate simulation data and geographic areas

  • adm_value (int) – Admin code value for area

  • col (str) – Column to add to plot

  • sim_max_date (pandas.Timestamp) – Latest date in simulation data

Returns

axis – Modified axis object with added data

Return type

matplotlib.axes.Axes
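
A rough sketch of how historical points for one area could be layered onto an axis is shown below. It assumes that historical_data has a date column, that rows are selected by matching adm_key against adm_value, and that sim_max_date is used to drop points later than the simulation; none of these details are confirmed by this page, and sketch_add_hist is a hypothetical name.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def sketch_add_hist(axis, historical_data, adm_key, adm_value, col, sim_max_date):
    """Hypothetical stand-in: scatter one area's historical values up to sim_max_date."""
    area = historical_data[historical_data[adm_key] == adm_value]
    area = area[area["date"] <= sim_max_date]  # assumed: drop points past the simulation
    axis.scatter(area["date"], area[col], s=12, color="black", label="Historical " + col)
    return axis


# Made-up historical data for two admin1 codes
hist = pd.DataFrame({
    "adm1": [24, 24, 24, 51],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-01"]),
    "daily_deaths": [9.0, 11.0, 13.0, 20.0],
})

fig, ax = plt.subplots()
ax = sketch_add_hist(ax, hist, "adm1", 24, "daily_deaths", pd.Timestamp("2021-01-02"))
n_points = len(ax.collections[0].get_offsets())  # points actually drawn
```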

bucky.viz.plot.get_plot_title(l_table, adm_key, adm_value)

Determine plot title for a given area based on adm code and lookup table.

For ADM0 and ADM1 plots, the title is the name of the area. For ADM2 plots, the title includes the ADM1 name in addition to the ADM2 name.

Parameters
  • l_table (pandas.DataFrame) – Dataframe containing information relating different geographic areas

  • adm_key (str) – Admin level key

  • adm_value (int) – Admin code value for area

Returns

plot_title – Formatted string to use for plot title

Return type

str
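
The title rule above could be sketched roughly as follows. The lookup column names (adm1_name, adm2_name, and so on), the "ADM2 name, ADM1 name" ordering, and the helper name sketch_plot_title are assumptions for illustration; the real lookup table and title format may differ.

```python
import pandas as pd


def sketch_plot_title(l_table, adm_key, adm_value):
    """Hypothetical title builder; the *_name columns are assumed, not documented."""
    row = l_table[l_table[adm_key] == adm_value].iloc[0]
    if adm_key == "adm2":
        # ADM2 titles carry the parent ADM1 name as well
        return f"{row['adm2_name']}, {row['adm1_name']}"
    # ADM0 and ADM1 titles use the area name alone
    return str(row[adm_key + "_name"])


# Made-up lookup table with two counties in one state
lookup = pd.DataFrame({
    "adm0": ["US", "US"],
    "adm0_name": ["United States", "United States"],
    "adm1": [24, 24],
    "adm1_name": ["Maryland", "Maryland"],
    "adm2": [24005, 24027],
    "adm2_name": ["Baltimore County", "Howard County"],
})

county_title = sketch_plot_title(lookup, "adm2", 24027)
state_title = sketch_plot_title(lookup, "adm1", 24)
```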

bucky.viz.plot.main(args=None)

Main entrypoint.

bucky.viz.plot.make_plots(adm_levels, input_directory, output_directory, lookup_df, plot_hist, plot_columns, quantiles, window_size, end_date, hist_file, min_hist_points, admin1=None, hist_start=None)

Wrapper function around plot. Creates plots, aggregating data if necessary.

Parameters
  • adm_levels (list of str) – List of ADM levels to make plots for

  • input_directory (str) – Location of simulation data

  • output_directory (str) – Parent directory to place created plots.

  • lookup_df (pandas.DataFrame) – Lookup table for geographic mapping information

  • plot_hist (bool) – If True, will plot historical data.

  • plot_columns (list of str) – List of columns to plot from data.

  • quantiles (list of float, or None) – List of quantiles to plot. If None, will plot all available quantiles in data.

  • window_size (int) – Size of window (in days) to apply to historical data.

  • end_date (str, or None) – Plot data until this date. Must be formatted as YYYY-MM-DD. If None, uses last date in simulation.

  • hist_file (str, or None) – Path to historical data file. If None, uses either CSSE or Covid Tracking data depending on columns requested.

  • min_hist_points (int) – Minimum number of historical data points to plot.

  • admin1 (list of str, or None) – List of admin1 values to make plots for. If None, a plot will be created for every unique admin1 value. Otherwise, plots are only made for those requested.

  • hist_start (str, or None) – Plot historical data from this point (formatted as YYYY-MM-DD). If None, aligns with simulation start date.
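
Because end_date and hist_start are passed as YYYY-MM-DD strings, a caller may want to check them before invoking make_plots. The helper below is a minimal sketch using only the standard library; parse_plot_date is a hypothetical name and is not part of bucky.

```python
from datetime import datetime


def parse_plot_date(value):
    """Return a datetime for a YYYY-MM-DD string, or None when no date was given."""
    if value is None:
        return None  # caller falls back to the default described for the parameter
    # strptime raises ValueError when the string does not match the format
    return datetime.strptime(value, "%Y-%m-%d")


parsed = parse_plot_date("2021-03-01")
missing = parse_plot_date(None)

try:
    parse_plot_date("03/01/2021")  # wrong layout for this format
    rejected = False
except ValueError:
    rejected = True
```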

bucky.viz.plot.plot(out_dir, lookup_df, key, sim_data, hist_data, plot_columns, quantiles)

Given a dataframe and a key, creates plots with requested columns.

For example, a pandas.DataFrame with state-level data would create a plot for each unique state. Simulation data is plotted as a line with shaded confidence intervals. Historical data is added as scatter points if requested.

Parameters
  • out_dir (str) – Location to place created plots.

  • lookup_df (pandas.DataFrame) – Dataframe containing information relating different geographic areas

  • key (str) – Key to use to relate simulation data and geographic areas. Must appear in lookup and simulation data (and historical data if applicable)

  • sim_data (pandas.DataFrame) – Simulation data to plot

  • hist_data (pandas.DataFrame) – Historical data to add to plot

  • plot_columns (list of str) – Columns to plot

  • quantiles (list of float, or None) – List of quantiles to plot. If None, will plot all available quantiles in data.
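
The "one plot per unique key value" idea can be sketched with a pandas groupby, as below. This is a simplified illustration: it draws plain lines without confidence intervals or historical data, and the date column, the output file naming, and the name sketch_plot_per_area are assumptions rather than bucky's actual behavior.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def sketch_plot_per_area(out_dir, key, sim_data, plot_columns):
    """Hypothetical stand-in: write one figure per unique value of `key`."""
    written = []
    for value, area in sim_data.groupby(key):
        # One subplot per requested column, stacked vertically
        fig, axes = plt.subplots(len(plot_columns), 1, sharex=True, squeeze=False)
        for ax, col in zip(axes[:, 0], plot_columns):
            ax.plot(area["date"], area[col])
            ax.set_ylabel(col)
        path = os.path.join(out_dir, f"{key}_{value}.png")  # assumed naming scheme
        fig.savefig(path)
        plt.close(fig)  # release the figure once it is on disk
        written.append(path)
    return written


# Made-up state-level data for two admin1 codes
sim = pd.DataFrame({
    "adm1": [24, 24, 51, 51],
    "date": pd.to_datetime(["2021-01-01", "2021-01-02"] * 2),
    "daily_deaths": [1.0, 2.0, 3.0, 4.0],
})
out_dir = tempfile.mkdtemp()
written = sketch_plot_per_area(out_dir, "adm1", sim, ["daily_deaths"])
```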

bucky.viz.plot.preprocess_historical_dates(historical_data, hist_start_date, plot_start, plot_end, min_data_points)

Check that historical data has the correct date range and requested number of data points.

Parameters
  • historical_data (pandas.DataFrame) – Dataframe with requested historical data

  • hist_start_date (str, or None) – Plot historical data from this point (formatted as YYYY-MM-DD). If None, aligns with simulation start date.

  • plot_start (pandas.Timestamp) – Earliest date appearing in simulation data

  • plot_end (pandas.Timestamp) – Latest date appearing in simulation data

  • min_data_points (int) – Minimum number of historical data points to plot

Returns

historical_data – Historical data with the correct date range and number of points

Return type

pandas.DataFrame
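
One possible reading of this check is sketched below. It assumes a date column, trims the data to the window between the start date and plot_end, and raises an error when too few points remain. How the real function responds to a shortage of points is not stated on this page, so the ValueError is purely an assumption of this example, as is the name sketch_preprocess_dates.

```python
import pandas as pd


def sketch_preprocess_dates(historical_data, hist_start_date, plot_start, plot_end,
                            min_data_points):
    """Hypothetical stand-in: trim to the plotted date range and count the points."""
    # Fall back to the simulation start when no explicit start date is given
    start = plot_start if hist_start_date is None else pd.Timestamp(hist_start_date)
    dates = historical_data["date"]
    trimmed = historical_data[(dates >= start) & (dates <= plot_end)]
    n_points = trimmed["date"].nunique()
    if n_points < min_data_points:
        # Assumed behavior for this sketch only
        raise ValueError(f"only {n_points} historical points, need {min_data_points}")
    return trimmed


# Made-up historical data covering ten consecutive days
hist = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=10),
    "daily_deaths": range(10),
})
plot_start = pd.Timestamp("2021-01-05")
plot_end = pd.Timestamp("2021-01-08")

aligned = sketch_preprocess_dates(hist, None, plot_start, plot_end, 3)
earlier = sketch_preprocess_dates(hist, "2021-01-03", plot_start, plot_end, 3)

try:
    sketch_preprocess_dates(hist, None, plot_start, plot_end, 5)
    too_few = False
except ValueError:
    too_few = True
```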