Reference: Output Files

This document describes the files typically generated in the results/ directory after running the Abacus Media Mix Model.

Configuration and Summary Files

  • model_config.json: JSON detailing the prior distributions used for all model parameters (intercept, channel betas, adstock alpha, saturation lambda, control/Fourier gammas, likelihood parameters). Includes distribution names (e.g., “Normal”, “Beta”) and their specified keyword arguments (mu, sigma, alpha, beta, etc.).

  • model_summary.csv: CSV containing summary statistics for the posterior distributions of model parameters. Columns typically include:

    • mean: Mean of the posterior samples.

    • sd: Standard deviation of the posterior samples.

    • hdi_5%, hdi_95%: Lower and upper bounds of the 90% Highest Density Interval.

    • mcse_mean, mcse_sd: Monte Carlo Standard Error for the mean and standard deviation estimates.

    • ess_bulk, ess_tail: Bulk and tail Effective Sample Size estimates (diagnostics for MCMC efficiency).

    • r_hat: Gelman-Rubin convergence diagnostic (R-hat). Values should be close to 1.0; values above roughly 1.01 suggest the chains have not converged.

    • median: Median of the posterior samples.

  • output.txt: A simple text file logging basic run information, such as the paths to the configuration and input data files used for the run.
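The diagnostic columns in model_summary.csv can be screened programmatically. The following is a minimal sketch, not part of Abacus: the function name, the thresholds (r_hat above 1.01, effective sample size below 400, both common rules of thumb), the parameter names, and the sample rows are all assumptions made for illustration. It also assumes the parameter-name column has an empty header, which is what results when a DataFrame index is written to CSV; adjust name_key if the file differs.

```python
import csv
import io


def flag_unconverged(rows, name_key="", r_hat_max=1.01, ess_min=400.0):
    """Return the names of parameters whose diagnostics look suspicious.

    name_key is the header of the parameter-name column (assumed empty here).
    """
    flagged = []
    for row in rows:
        r_hat = float(row["r_hat"])
        # Use the weaker of the two effective sample size estimates.
        ess = min(float(row["ess_bulk"]), float(row["ess_tail"]))
        if r_hat > r_hat_max or ess < ess_min:
            flagged.append(row[name_key])
    return flagged


# Made-up rows for illustration; a real run would open results/model_summary.csv.
SAMPLE = """\
,mean,sd,hdi_5%,hdi_95%,mcse_mean,mcse_sd,ess_bulk,ess_tail,r_hat,median
intercept,0.31,0.02,0.28,0.34,0.001,0.001,2100,1900,1.00,0.31
channel_tv,0.12,0.05,0.04,0.20,0.004,0.003,180,260,1.04,0.11
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
flagged = flag_unconverged(rows)
print(flagged)
```

For a real file, replace the io.StringIO(SAMPLE) argument with an open file handle. Any flagged parameter is worth checking against model_trace.png before its estimates are used.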

Core Model Output & Data

  • model.nc: (Primary Output) The full ArviZ InferenceData object (posterior samples, sample stats, observed data, priors, etc.) saved in NetCDF format. This file is required for reloading the fitted model (MMM.load()) and for post-hoc analyses such as LOO-CV (az.loo()) or detailed parameter inspection.

  • all_decomp.csv: A time-series CSV (indexed by date) showing the mean contribution of each model component over time. Columns include:

    • intercept: Contribution of the base intercept.

    • One column per media channel (e.g., TV, Radio): Contribution of that channel.

    • One column per control variable (if used): Contribution of that control variable.

    • Columns for Fourier modes (if used, e.g., sin_1, cos_1): Contribution of seasonality components.

    • May include other derived columns like total baseline or total media contribution.

  • dashboard_input_data.csv: (Optional, may depend on driver implementation) A CSV representation of the final data fed into the PyMC model after all preprocessing steps (scaling, transformations, feature generation like Fourier modes). Useful for debugging or external analysis/visualisation.

  • optimization_results.csv: (Optional, generated only if optimisation is run) CSV detailing the budget optimisation results. Columns typically include:

    • channel: Name of the media channel.

    • initial_budget: Budget allocation before optimisation.

    • optimal_budget: Recommended budget allocation after optimisation.

    • initial_contribution: Estimated contribution from the initial budget.

    • optimal_contribution: Estimated contribution from the optimal budget.
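The columns of optimization_results.csv lend themselves to a quick per-channel comparison. The sketch below assumes the column names listed above and uses invented figures; summarise_optimisation is a hypothetical helper, not an Abacus function.

```python
import csv
import io


def summarise_optimisation(rows):
    """Compute the fractional budget change and contribution uplift per channel."""
    report = {}
    for row in rows:
        initial_budget = float(row["initial_budget"])
        optimal_budget = float(row["optimal_budget"])
        uplift = float(row["optimal_contribution"]) - float(row["initial_contribution"])
        # A channel that started with zero budget has no meaningful relative change.
        if initial_budget:
            change = (optimal_budget - initial_budget) / initial_budget
        else:
            change = None
        report[row["channel"]] = {"budget_change": change, "uplift": uplift}
    return report


# Invented figures for illustration; a real run would open
# results/optimization_results.csv.
SAMPLE = """\
channel,initial_budget,optimal_budget,initial_contribution,optimal_contribution
TV,1000,1200,400,450
Radio,1000,800,300,290
"""

report = summarise_optimisation(list(csv.DictReader(io.StringIO(SAMPLE))))
total_uplift = sum(entry["uplift"] for entry in report.values())
print(report, total_uplift)
```

Summing the uplift across channels gives the estimated net gain from reallocating the same total budget, which is usually the headline number of an optimisation run.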

Visualisations (.png files)

(See the Interpreting Results Guide for how to read these plots.)

  • budget_optimisation.png: (If optimisation is run) Compares initial vs. optimised budget and contribution.

  • channel_contribution_as_function_of_cost_share.png: Shows contribution vs. relative spend multiplier (delta).

  • components_contributions.png: Time series plot showing contributions of different model components (media, controls, baseline).

  • media_contribution_mean.png: Bar chart of mean percentage contribution per channel.

  • media_contribution_median.png: Bar chart of median percentage contribution per channel.

  • media_contribution_share.png: Forest plot showing the distribution of contribution shares per channel.

  • media_roi_mean.png: Bar chart of mean ROI per channel.

  • media_roi_median.png: Bar chart of median ROI per channel.

  • roi_distribution.png: Density plots showing the posterior distribution of ROI for each channel.

  • model_fit_predictions.png: Plot comparing model predictions against actual data (in-sample and potentially out-of-sample).

  • model_trace.png: MCMC trace plots for diagnosing sampler convergence.

  • posterior_distributions.png: Plots showing the posterior distributions for model parameters.

  • response_curves.png: Shows the estimated saturation curves (contribution vs. untransformed spend) for each channel.

  • waterfall_plot_components_decomposition.png: Waterfall plot showing the build-up of the average prediction from different components.

(Note: The exact set of generated plots may vary depending on the driver script implementation and configuration.)
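Because the set of files can differ between runs, it can help to check what was actually written before starting an analysis. The sketch below is illustrative only: missing_outputs and the CORE_FILES list are assumptions (the list simply collects the non-plot files described in this document without an "optional" qualifier), and the demonstration uses a temporary directory in place of a real results/ directory.

```python
import tempfile
from pathlib import Path

# Assumed core outputs: files documented here without an "optional" qualifier.
CORE_FILES = [
    "model_config.json",
    "model_summary.csv",
    "output.txt",
    "model.nc",
    "all_decomp.csv",
]


def missing_outputs(results_dir, expected=CORE_FILES):
    """Return the expected file names that are absent from results_dir."""
    root = Path(results_dir)
    return [name for name in expected if not (root / name).is_file()]


# Demonstration against a temporary directory standing in for results/.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "model.nc").touch()
    missing = missing_outputs(tmp)

print(missing)
```

Passing a different expected list, for example the plot file names above, reuses the same check for visualisations.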

Derived Metrics

Metrics such as ROI and Cost per Target are typically calculated from the posterior samples stored in model.nc or from the summaries in model_summary.csv and all_decomp.csv. Helper functions in abacus.sketch.depict may calculate and print these metrics, but the values are not necessarily saved to dedicated media_performance_*.csv files by default.
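As a rough illustration of deriving such a metric from all_decomp.csv, the sketch below computes a point-estimate ROI per channel as summed mean contribution divided by total spend. Several things here are assumptions: roi_from_decomp is a hypothetical helper, the sample rows and spend totals are invented, and the contributions are taken to be in the same units as the target. Spend is not part of all_decomp.csv and has to be supplied separately.

```python
import csv
import io


def roi_from_decomp(rows, spend):
    """Point-estimate ROI per channel: summed mean contribution / total spend."""
    totals = {channel: 0.0 for channel in spend}
    for row in rows:
        for channel in spend:
            totals[channel] += float(row[channel])
    return {channel: totals[channel] / spend[channel] for channel in spend}


# Invented decomposition rows; a real run would open results/all_decomp.csv.
SAMPLE = """\
date,intercept,TV,Radio
2024-01-01,50,30,20
2024-01-08,50,20,30
"""

# Hypothetical total spend per channel over the same period.
spend = {"TV": 25.0, "Radio": 100.0}

roi = roi_from_decomp(list(csv.DictReader(io.StringIO(SAMPLE))), spend)
print(roi)
```

Since all_decomp.csv holds mean contributions, this yields a single number per channel and no uncertainty interval; the posterior distribution of ROI has to be computed from the samples in model.nc. If Cost per Target is defined as spend divided by contribution, it is the reciprocal of the values returned here.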