# Abacus Sketch: Depict (`depict.py`)

This module provides high-level functions to generate a comprehensive set of visualisations and summary statistics for analysing input data and evaluating the results of a fitted Abacus Marketing Mix Model (MMM). It orchestrates calls to lower-level plotting functions within the `abacus.sketch` package and methods available on the fitted model object.

::: abacus.sketch.depict

## Main Reporting Functions

These functions generate multiple plots and save summary data to the specified results directory.

### `describe_mmm_training`

```python
def describe_mmm_training(
    config: dict,
    processed_data: pd.DataFrame,
    mmm,  # Fitted MMM instance
    results_dir: str
) -> None:
```

Generates and saves a suite of plots and summaries related to the model fit on the training data.

**Generated Outputs (saved in `results_dir`):**

- `dashboard_input_data.csv`: The processed data used for training.
- `components_contributions.png`: Plot showing contributions of different model components over time (mean and HDI). (Calls `mmm.plot_components_contributions`)
- `model_summary.csv`: ArviZ summary statistics (mean, median, HDI) for key model parameters (intercept, sigma, channel betas, saturation params).
- `all_decomp.csv`: Mean contributions of all components over time. (Calls `mmm.compute_mean_contributions_over_time`)
- `media_contribution_share.png`: Forest plot of channel contribution shares (posterior distribution). (Calls `mmm.plot_channel_contribution_share_hdi`)
- `media_contribution_mean.png`: Bar chart of mean channel contribution shares with HDI. (Calls `mmm.plot_channel_contribution_stats`)
- `media_contribution_median.png`: Bar chart of median channel contribution shares with HDI. (Calls `mmm.plot_channel_contribution_stats`)
- `weekly_media_and_baseline_contribution.png`: Stacked area plot of contributions. (Calls `plot_results.all_contributions_plot`)
- `weekly_media_contribution.png`: Stacked area plot of only media contributions.
  (Calls `plot_results.plot_channel_contributions`)
- `model_priors_and_posteriors.png`: Trace plots for key model parameters. (Calls `arviz.plot_trace`)
- `response_curves.png`: Direct contribution curves for each channel, optionally showing fitted saturation curves. (Calls `mmm.plot_direct_contribution_curves`)
- ROI-related plots generated by `plot_results.plot_roi` and `plot_results.plot_roi_distribution`.

**Parameters:**

- `config` (`dict`): The configuration dictionary.
- `processed_data` (`pd.DataFrame`): The processed input data used for fitting.
- `mmm`: The fitted Abacus MMM instance (e.g., `DelayedSaturatedMMM`).
- `results_dir` (`str`): The directory path where output files will be saved.

**Returns:**

- `None`.

---

### `describe_mmm_prediction`

```python
def describe_mmm_prediction(
    config: dict,
    input_data_processed: pd.DataFrame,
    mmm,  # Fitted MMM instance
    results_dir: str
) -> None:
```

Generates and saves plots and summaries related to the model's predictive performance, potentially including out-of-sample checks if a train/test split was used.

**Generated Outputs (saved in `results_dir`):**

- `waterfall_plot_components_decomposition.png`: Waterfall plot showing decomposition. (Calls `plot_results.plot_waterfall_components_decomposition`)
- `model_fit_predictions.png`: (Optional) Plot comparing posterior predictions against actual values for the test set. Only generated if `config['train_test_ratio'] < 1.0`. (Calls `plot_diagnostics.plot_posterior_predictions`)
- `media_performance_effect.csv`: Summary statistics (mean, median, HDI) for channel effectiveness (`beta_channel`).
- `media_performance_roi.csv`: Summary statistics (mean, median, HDI) for channel Return on Investment (ROI). (Calls `compute_roi_summary`)
- `media_performance_cost_per_target.csv`: Summary statistics (mean, median, HDI) for channel Cost Per Target acquisition/unit.
  (Calls `compute_cost_per_target_summary`)

**Parameters:**

- `config` (`dict`): The configuration dictionary (used to get `train_test_ratio`).
- `input_data_processed` (`pd.DataFrame`): The full processed input data (used for splitting).
- `mmm`: The fitted Abacus MMM instance.
- `results_dir` (`str`): The directory path where output files will be saved.

**Returns:**

- `None`.

---

### `describe_input_data`

```python
def describe_input_data(
    input_data,  # InputData instance
    results_dir: str,
    suffix: str
) -> None:
```

Generates plots and reports describing the raw input data.

**Generated Outputs (saved in `results_dir`):**

- Plots of all input metrics over time (via `plot_input.plot_all_metrics`).
- `outliers_{suffix}.txt`: Text report identifying potential outliers (via `outliers.print_outliers`).

**Parameters:**

- `input_data`: An instance of `abacus.prepro.input_data.InputData`.
- `results_dir` (`str`): The directory path where output files will be saved.
- `suffix` (`str`): A suffix (e.g., "raw") added to output filenames.

**Returns:**

- `None`.

---

### `describe_config`

```python
def describe_config(output_dir: str, config: str, git_sha: str) -> None:
```

Saves configuration details to facilitate reproducibility.

**Generated Outputs (saved in `output_dir`):**

- `git_sha.txt`: Contains the Git commit hash of the code version used.
- `config.yaml`: Contains the raw string content of the configuration file used for the run.

**Parameters:**

- `output_dir` (`str`): The directory path where output files will be saved.
- `config` (`str`): The raw string content of the YAML configuration file.
- `git_sha` (`str`): The Git commit hash.

**Returns:**

- `None`.

## Helper/Metric Functions

### `compute_roi_summary`

```python
def compute_roi_summary(model, data: pd.DataFrame, config: dict) -> pd.DataFrame:
```

Computes summary statistics (mean, median, 5th/95th percentiles) for the Return on Investment (ROI) for each channel.
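
As a rough illustration of the kind of summary described for `compute_roi_summary`, the sketch below computes per-channel ROI statistics from posterior draws using only the standard library. It is not the module's implementation: the function name `roi_summary_sketch`, the helper `percentile`, the plain-dict inputs (standing in for the model and DataFrame), the definition of ROI as contribution divided by spend, and the way the 'blended' row is formed are all assumptions made for this example.

```python
from statistics import mean, median


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 1] (hypothetical helper)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def summarise(draws):
    """Return the four statistics named in the documented output columns."""
    return {
        "0.05": percentile(draws, 0.05),
        "0.95": percentile(draws, 0.95),
        "median": median(draws),
        "mean": mean(draws),
    }


def roi_summary_sketch(contribution_samples, total_spend):
    """Summarise ROI per channel.

    contribution_samples: {channel: [posterior draws of total contribution]}
    total_spend: {channel: total spend over the same period}
    Assumption: ROI = contribution / spend for each posterior draw.
    """
    rows = {}
    for channel, draws in contribution_samples.items():
        rows[channel] = summarise([d / total_spend[channel] for d in draws])

    # Assumption: 'blended' pools all channels, draw by draw.
    spend_total = sum(total_spend.values())
    blended = [
        sum(per_draw) / spend_total
        for per_draw in zip(*contribution_samples.values())
    ]
    rows["blended"] = summarise(blended)
    return rows


if __name__ == "__main__":
    samples = {"tv": [100.0, 200.0, 300.0], "search": [50.0, 50.0, 50.0]}
    spend = {"tv": 100.0, "search": 25.0}
    for name, stats in roi_summary_sketch(samples, spend).items():
        print(name, stats)
```

Working on posterior draws rather than on a single point estimate is what allows the percentile columns to express uncertainty around each channel's ROI.
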

**Parameters:**

- `model`: The fitted Abacus MMM instance.
- `data` (`pd.DataFrame`): The input data containing channel spend columns.
- `config` (`dict`): Configuration dictionary specifying media spend columns.

**Returns:**

- `pd.DataFrame`: DataFrame indexed by channel name (plus 'blended'), with columns '0.05', '0.95', 'median', 'mean' representing ROI statistics.

---

### `compute_cost_per_target_summary`

```python
def compute_cost_per_target_summary(model, data: pd.DataFrame, config: dict) -> pd.DataFrame:
```

Computes summary statistics (mean, median, 5th/95th percentiles) for the Cost Per Target (e.g., cost per acquisition) for each channel. This is calculated as the inverse of ROI.

**Parameters:**

- `model`: The fitted Abacus MMM instance.
- `data` (`pd.DataFrame`): The input data containing channel spend columns.
- `config` (`dict`): Configuration dictionary specifying media spend columns.

**Returns:**

- `pd.DataFrame`: DataFrame indexed by channel name (plus 'blended'), with columns '0.05', '0.95', 'median', 'mean' representing Cost Per Target statistics.

---

### `describe_all_media_spend`

```python
def describe_all_media_spend(per_observation_df: pd.DataFrame) -> pd.DataFrame:
```

Calculates the total cost and percentage share of total cost for each channel based on cost columns in the input DataFrame.

**Parameters:**

- `per_observation_df` (`pd.DataFrame`): A DataFrame expected to have columns ending in "_cost" representing spend per observation for each channel.

**Returns:**

- `pd.DataFrame`: A DataFrame indexed by channel cost column name, with columns "Total Cost" and "% of total".

---

### `quick_stats`

```python
def quick_stats(model) -> pd.DataFrame:
```

Generates a concise ArviZ summary table for key model parameters (intercept, likelihood_sigma, beta_channel, alpha, lam).

**Parameters:**

- `model`: The fitted Abacus MMM instance containing `fit_result`.

**Returns:**

- `pd.DataFrame`: The ArviZ summary DataFrame including mean, median, sd, and HDI.

---

### `get_media_effect_df`

```python
def get_media_effect_df(marketing_mix_model) -> pd.DataFrame:
```

Extracts the mean posterior channel contributions over time and returns them as a DataFrame.

**Parameters:**

- `marketing_mix_model`: The fitted Abacus MMM instance.

**Returns:**

- `pd.DataFrame`: DataFrame with 'date' as index and columns for each channel's mean contribution.

---

### `get_roi_df`

```python
def get_roi_df(model, data: pd.DataFrame, config: dict) -> dict:
```

Calculates the mean Return on Investment (ROI) for each channel. *Note: This seems to provide only the mean, whereas `compute_roi_summary` provides more detailed statistics.*

**Parameters:**

- `model`: The fitted Abacus MMM instance.
- `data` (`pd.DataFrame`): Input data containing channel spend.
- `config` (`dict`): Configuration specifying media spend columns.

**Returns:**

- `dict`: A dictionary where keys are channel names and values are the mean ROI for each.

---

*(Note: This module also contains functions `describe_training` and `describe_prediction` which appear to be duplicates or older versions of `describe_mmm_training` and `describe_mmm_prediction`. The function `weekly_spend_by_channel` seems redundant as its functionality is covered within `describe_mmm_training`. The helper `_dump_posterior_metrics` appears unused.)*
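
To make the spend summary documented for `describe_all_media_spend` more concrete, here is a minimal sketch of the arithmetic using only the standard library. The name `media_spend_summary_sketch` is hypothetical, a dict of column lists stands in for the DataFrame, and expressing "% of total" on a 0 to 100 scale is an assumption; the actual function may differ on any of these points.

```python
def media_spend_summary_sketch(per_observation):
    """Total cost and share of total cost per "_cost" column.

    per_observation: {column_name: [value per observation]}
    Columns whose names do not end in "_cost" are ignored.
    """
    totals = {
        column: sum(values)
        for column, values in per_observation.items()
        if column.endswith("_cost")
    }
    grand_total = sum(totals.values())

    summary = {}
    for column, total in totals.items():
        # Assumption: share is reported as a percentage (0 to 100).
        share = 100.0 * total / grand_total if grand_total else 0.0
        summary[column] = {"Total Cost": total, "% of total": share}
    return summary


if __name__ == "__main__":
    data = {
        "tv_cost": [10.0, 30.0],
        "search_cost": [20.0, 40.0],
        "sales": [500.0, 650.0],  # not a cost column, so it is skipped
    }
    for column, row in media_spend_summary_sketch(data).items():
        print(column, row)
```
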