Abacus Sketch: Depict (depict.py)

This module provides high-level functions to generate a comprehensive set of visualisations and summary statistics for analysing input data and evaluating the results of a fitted Abacus Marketing Mix Model (MMM). It orchestrates calls to lower-level plotting functions within the abacus.sketch package and methods available on the fitted model object.

Main Reporting Functions

These functions generate multiple plots and save summary data to the specified results directory.

describe_mmm_training

def describe_mmm_training(
    config: dict,
    processed_data: pd.DataFrame,
    mmm, # Fitted MMM instance
    results_dir: str
) -> None:

Generates and saves a suite of plots and summaries related to the model fit on the training data.

Generated Outputs (saved in results_dir):

  • dashboard_input_data.csv: The processed data used for training.

  • components_contributions.png: Plot showing contributions of different model components over time (mean and HDI). (Calls mmm.plot_components_contributions)

  • model_summary.csv: ArviZ summary statistics (mean, median, HDI) for key model parameters (intercept, sigma, channel betas, saturation params).

  • all_decomp.csv: Mean contributions of all components over time. (Calls mmm.compute_mean_contributions_over_time)

  • media_contribution_share.png: Forest plot of channel contribution shares (posterior distribution). (Calls mmm.plot_channel_contribution_share_hdi)

  • media_contribution_mean.png: Bar chart of mean channel contribution shares with HDI. (Calls mmm.plot_channel_contribution_stats)

  • media_contribution_median.png: Bar chart of median channel contribution shares with HDI. (Calls mmm.plot_channel_contribution_stats)

  • weekly_media_and_baseline_contribution.png: Stacked area plot of contributions. (Calls plot_results.all_contributions_plot)

  • weekly_media_contribution.png: Stacked area plot of only media contributions. (Calls plot_results.plot_channel_contributions)

  • model_priors_and_posteriors.png: Trace plots for key model parameters. (Calls arviz.plot_trace)

  • response_curves.png: Direct contribution curves for each channel, optionally showing fitted saturation curves. (Calls mmm.plot_direct_contribution_curves)

  • ROI-related plots generated by plot_results.plot_roi and plot_results.plot_roi_distribution.

Parameters:

  • config (dict): The configuration dictionary.

  • processed_data (pd.DataFrame): The processed input data used for fitting.

  • mmm: The fitted Abacus MMM instance (e.g., DelayedSaturatedMMM).

  • results_dir (str): The directory path where output files will be saved.

Returns:

  • None.


describe_mmm_prediction

def describe_mmm_prediction(
    config: dict,
    input_data_processed: pd.DataFrame,
    mmm, # Fitted MMM instance
    results_dir: str
) -> None:

Generates and saves plots and summaries of the model’s predictive performance, including an out-of-sample check when a train/test split was used.

Generated Outputs (saved in results_dir):

  • waterfall_plot_components_decomposition.png: Waterfall plot showing decomposition. (Calls plot_results.plot_waterfall_components_decomposition)

  • model_fit_predictions.png: (Optional) Plot comparing posterior predictions against actual values for the test set. Only generated if config['train_test_ratio'] < 1.0. (Calls plot_diagnostics.plot_posterior_predictions)

  • media_performance_effect.csv: Summary statistics (mean, median, HDI) for channel effectiveness (beta_channel).

  • media_performance_roi.csv: Summary statistics (mean, median, 5th/95th percentiles) for channel Return on Investment (ROI). (Calls compute_roi_summary)

  • media_performance_cost_per_target.csv: Summary statistics (mean, median, 5th/95th percentiles) for channel Cost Per Target acquisition/unit. (Calls compute_cost_per_target_summary)

Parameters:

  • config (dict): The configuration dictionary (used to get train_test_ratio).

  • input_data_processed (pd.DataFrame): The full processed input data (used for splitting).

  • mmm: The fitted Abacus MMM instance.

  • results_dir (str): The directory path where output files will be saved.

Returns:

  • None.
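
The train_test_ratio check that gates model_fit_predictions.png can be illustrated with a small sketch. The helper names has_test_set and split_index are hypothetical, and the chronological split (first fraction of rows for training, remainder for testing) is an assumption rather than something this document confirms about depict.py.

```python
def has_test_set(config: dict) -> bool:
    """True when part of the data was held out, i.e. train_test_ratio < 1.0."""
    return float(config["train_test_ratio"]) < 1.0


def split_index(n_rows: int, train_test_ratio: float) -> int:
    """Row index at which the test period starts (assumed chronological split)."""
    return int(n_rows * train_test_ratio)


config = {"train_test_ratio": 0.8}
n_rows = 104  # e.g. two years of weekly observations

if has_test_set(config):
    start = split_index(n_rows, config["train_test_ratio"])
    print(f"train rows: 0..{start - 1}, test rows: {start}..{n_rows - 1}")
else:
    print("no held-out rows; the out-of-sample plot would be skipped")
```

With a ratio of exactly 1.0 every row is used for training, so there is nothing to compare predictions against and the optional plot is not produced.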


describe_input_data

def describe_input_data(
    input_data, # InputData instance
    results_dir: str,
    suffix: str
) -> None:

Generates plots and reports describing the raw input data.

Generated Outputs (saved in results_dir):

  • Plots of all input metrics over time (via plot_input.plot_all_metrics).

  • outliers_{suffix}.txt: Text report identifying potential outliers (via outliers.print_outliers).

Parameters:

  • input_data: An instance of abacus.prepro.input_data.InputData.

  • results_dir (str): The directory path where output files will be saved.

  • suffix (str): A suffix (e.g., “raw”) added to output filenames.

Returns:

  • None.


describe_config

def describe_config(output_dir: str, config: str, git_sha: str) -> None:

Saves configuration details to facilitate reproducibility.

Generated Outputs (saved in output_dir):

  • git_sha.txt: Contains the Git commit hash of the code version used.

  • config.yaml: Contains the raw string content of the configuration file used for the run.

Parameters:

  • output_dir (str): The directory path where output files will be saved.

  • config (str): The raw string content of the YAML configuration file.

  • git_sha (str): The Git commit hash.

Returns:

  • None.
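
As a rough illustration of the behaviour described above, the following sketch writes the two files using only the standard library. The function name describe_config_sketch is hypothetical, and creating the directory when it is missing is an assumption; the real describe_config may handle this differently.

```python
import tempfile
from pathlib import Path


def describe_config_sketch(output_dir: str, config: str, git_sha: str) -> None:
    """Write git_sha.txt and config.yaml into output_dir."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # assumption: create the directory if absent
    (out / "git_sha.txt").write_text(git_sha, encoding="utf-8")
    (out / "config.yaml").write_text(config, encoding="utf-8")


# Usage with a throwaway directory and made-up values.
with tempfile.TemporaryDirectory() as tmp:
    describe_config_sketch(tmp, "train_test_ratio: 0.8\n", "0123abc")
    print(sorted(p.name for p in Path(tmp).iterdir()))
```

Storing the raw YAML string rather than a re-serialised dictionary keeps comments and key order exactly as they were in the run's configuration file.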

Helper/Metric Functions

compute_roi_summary

def compute_roi_summary(model, data: pd.DataFrame, config: dict) -> pd.DataFrame:

Computes summary statistics (mean, median, 5th/95th percentiles) for the Return on Investment (ROI) for each channel.

Parameters:

  • model: The fitted Abacus MMM instance.

  • data (pd.DataFrame): The input data containing channel spend columns.

  • config (dict): Configuration dictionary specifying media spend columns.

Returns:

  • pd.DataFrame: DataFrame indexed by channel name (plus ‘blended’), with columns ‘0.05’, ‘0.95’, ‘median’, ‘mean’ representing ROI statistics.
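
The shape of the returned table can be sketched as follows. This is not the module's implementation: roi_summary_sketch is a hypothetical name, it takes pre-computed per-draw contribution totals instead of a fitted model, and it assumes ROI per posterior draw is total contribution divided by total spend, with the blended row pooling all channels.

```python
import numpy as np
import pandas as pd


def roi_summary_sketch(contribution_draws: dict, total_spend: dict) -> pd.DataFrame:
    """Summarise per-draw ROI for each channel plus a pooled 'blended' row.

    contribution_draws: channel -> per-draw total contribution (1-D sequence)
    total_spend:        channel -> total spend over the same period
    """
    channels = list(contribution_draws)
    draws = {ch: np.asarray(contribution_draws[ch], dtype=float) for ch in channels}

    # Assumed definition: ROI in each draw = contribution / spend.
    roi = {ch: draws[ch] / total_spend[ch] for ch in channels}
    roi["blended"] = sum(draws.values()) / sum(total_spend[ch] for ch in channels)

    records = [
        {
            "0.05": np.quantile(r, 0.05),
            "0.95": np.quantile(r, 0.95),
            "median": np.median(r),
            "mean": np.mean(r),
        }
        for r in roi.values()
    ]
    return pd.DataFrame(records, index=list(roi), columns=["0.05", "0.95", "median", "mean"])


# Made-up draws for two channels.
summary = roi_summary_sketch(
    {"tv": [100.0, 200.0, 300.0], "search": [50.0, 50.0, 50.0]},
    {"tv": 100.0, "search": 50.0},
)
print(summary)
```

Summarising per-draw ROI values, rather than dividing a mean contribution by spend, is what makes the percentile columns meaningful as an uncertainty interval.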


compute_cost_per_target_summary

def compute_cost_per_target_summary(model, data: pd.DataFrame, config: dict) -> pd.DataFrame:

Computes summary statistics (mean, median, 5th/95th percentiles) for the Cost Per Target (e.g., cost per acquisition) for each channel. This is calculated as the inverse of ROI.

Parameters:

  • model: The fitted Abacus MMM instance.

  • data (pd.DataFrame): The input data containing channel spend columns.

  • config (dict): Configuration dictionary specifying media spend columns.

Returns:

  • pd.DataFrame: DataFrame indexed by channel name (plus ‘blended’), with columns ‘0.05’, ‘0.95’, ‘median’, ‘mean’ representing Cost Per Target statistics.
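
One subtlety of "inverse of ROI" is worth a small worked example. The text above does not say whether the inversion is applied to each posterior draw or to the already summarised ROI, so the sketch below simply shows, with made-up draws, that the two choices give different means.

```python
from statistics import mean, median

# Made-up posterior ROI draws for a single channel.
roi_draws = [0.5, 1.0, 2.0]

# Option A: invert every draw, then summarise.
cost_per_target_draws = [1.0 / r for r in roi_draws]
mean_of_inverse = mean(cost_per_target_draws)

# Option B: summarise ROI first, then invert the summary.
inverse_of_mean = 1.0 / mean(roi_draws)

print("mean of 1/ROI:", round(mean_of_inverse, 4))
print("1 / mean ROI: ", round(inverse_of_mean, 4))
print("median of 1/ROI:", median(cost_per_target_draws))
```

For positive draws the mean of the inverses is never smaller than the inverse of the mean, so the two options should not be mixed when comparing reports. Quantiles behave more simply: inverting positive draws only reverses their order, so the lower bound of Cost Per Target corresponds to the upper bound of ROI.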


describe_all_media_spend

def describe_all_media_spend(per_observation_df: pd.DataFrame) -> pd.DataFrame:

Calculates the total cost and percentage share of total cost for each channel based on cost columns in the input DataFrame.

Parameters:

  • per_observation_df (pd.DataFrame): A DataFrame expected to have columns ending in “_cost” representing spend per observation for each channel.

Returns:

  • pd.DataFrame: A DataFrame indexed by channel cost column name, with columns “Total Cost” and “% of total”.
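
The calculation described above can be sketched in a few lines of pandas. The name describe_all_media_spend_sketch is hypothetical, and expressing the share on a 0 to 100 scale is an assumption; the actual function may report a fraction instead.

```python
import pandas as pd


def describe_all_media_spend_sketch(per_observation_df: pd.DataFrame) -> pd.DataFrame:
    """Total spend and share of total spend for every column ending in '_cost'."""
    cost_cols = [c for c in per_observation_df.columns if str(c).endswith("_cost")]
    total_cost = per_observation_df[cost_cols].sum()
    return pd.DataFrame(
        {
            "Total Cost": total_cost,
            "% of total": 100.0 * total_cost / total_cost.sum(),  # assumed 0-100 scale
        }
    )


# Made-up data: two spend columns and one non-spend column that is ignored.
example = pd.DataFrame(
    {
        "tv_cost": [100.0, 200.0],
        "search_cost": [50.0, 50.0],
        "tv_impressions": [1000, 2000],
    }
)
spend_summary = describe_all_media_spend_sketch(example)
print(spend_summary)
```

Selecting columns by the "_cost" suffix means any non-spend column that happens to end in "_cost" would also be counted, so column naming in the input data matters.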


quick_stats

def quick_stats(model) -> pd.DataFrame:

Generates a concise ArviZ summary table for key model parameters (intercept, likelihood_sigma, beta_channel, alpha, lam).

Parameters:

  • model: The fitted Abacus MMM instance containing fit_result.

Returns:

  • pd.DataFrame: The ArviZ summary DataFrame including mean, median, sd, and HDI.


get_media_effect_df

def get_media_effect_df(marketing_mix_model) -> pd.DataFrame:

Extracts the mean posterior channel contributions over time and returns them as a DataFrame.

Parameters:

  • marketing_mix_model: The fitted Abacus MMM instance.

Returns:

  • pd.DataFrame: DataFrame with ‘date’ as index and columns for each channel’s mean contribution.


get_roi_df

def get_roi_df(model, data: pd.DataFrame, config: dict) -> dict:

Calculates the mean Return on Investment (ROI) for each channel. Note: This function returns only the mean; use compute_roi_summary for the median and percentile bounds.

Parameters:

  • model: The fitted Abacus MMM instance.

  • data (pd.DataFrame): Input data containing channel spend.

  • config (dict): Configuration specifying media spend columns.

Returns:

  • dict: A dictionary where keys are channel names and values are the mean ROI for each.


(Note: This module also contains describe_training and describe_prediction, which appear to be duplicates or older versions of describe_mmm_training and describe_mmm_prediction. The function weekly_spend_by_channel appears redundant, since its functionality is covered within describe_mmm_training, and the helper _dump_posterior_metrics appears to be unused.)