Guide: Prior Calibration using Historical Data

This guide explains the priors_calibration.py script, designed to help inform prior selection for Bayesian Marketing Mix Modeling (MMM) channel coefficients using historical model outputs.

Methodology: The script takes the mean coefficient values from previous model runs (currently configured for N=2 historical models) and uses them to suggest the location (mu) parameter of a Normal prior for each media channel's beta_channel coefficient. Because a variance estimated from so few data points (here N=2) is statistically unreliable, the script does not derive the scale (sigma) parameter from the standard deviation of the historical data. Instead, it recommends a fixed, weakly informative sigma for all channels (configurable within the script, default 2.0). This approach incorporates the central tendency observed historically while acknowledging that the variance is highly uncertain when data is scarce.

Introduction

In Bayesian MMM, specifying appropriate priors is crucial. This script assists by suggesting the center (mu) for Normal priors for media channel coefficients (beta_channel) based on the average coefficients observed in historical models. It explicitly avoids deriving the prior’s width (sigma) from limited historical data, instead recommending a standard, weakly informative value to reflect uncertainty.

Prerequisites

Python 3.x
Libraries:
- pandas
- numpy

Script Overview

Configuration: Sets the weakly informative sigma and default mu.
Imports libraries.
Prepares historical coefficient data (currently hardcoded for 2 models).
Merges data to align coefficients by channel.
Defines the list of media channels for the new model (order matters for YAML).
Calculates the suggested mu for each channel based on the historical mean. Assigns default mu if no historical data exists for a channel. Uses the configured weakly informative sigma for all.
Formats the mu and sigma lists for YAML output.
Prints suggestions and YAML-compatible output for a Normal prior.

Detailed Explanation

1. Configuration

Sets the WEAKLY_INFORMATIVE_SIGMA (default: 2.0) applied to all channels and DEFAULT_MU (default: 0.0) for channels missing historical data.

2. Import Libraries

Imports pandas, numpy, and warnings.

3. Data Preparation

Loads coefficient data from previous models (currently hardcoded dictionaries). Replace this with your actual data.

4. Merge Coefficient Data

Merges the historical DataFrames using an ‘outer’ join to include channels present in either model.
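
The data preparation and merge steps can be sketched as follows. This is a minimal illustration, not the script's actual code: the `channel` key column, the DataFrame variable names, the channel names, and the coefficient values are all assumptions made up for the example. Only `data_model1`, `data_model2`, `mean_mod_avg_model1`, and `mean_mod_avg_model2` are names taken from this guide.

```python
import pandas as pd

# Hypothetical historical coefficients; replace with real model outputs.
data_model1 = {
    "channel": ["tv", "search", "social"],
    "mean_mod_avg_model1": [0.8, 1.5, 0.4],
}
data_model2 = {
    "channel": ["tv", "search", "display"],
    "mean_mod_avg_model2": [1.2, 1.1, 0.3],
}

df_model1 = pd.DataFrame(data_model1)
df_model2 = pd.DataFrame(data_model2)

# An outer join keeps every channel that appears in either model.
# A channel missing from one model gets NaN in that model's column.
merged = pd.merge(df_model1, df_model2, on="channel", how="outer")
print(merged)
```

With these inputs, `social` has no value for the second model and `display` has none for the first, which is why the later averaging step has to tolerate NaNs.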

5. Define Media Channels

Defines paid_media_channels as a list of tuples (display_name, channel_variable_name). The order must match the intended order in the final YAML configuration.

6. Calculate Prior Suggestions (mu)

Iterates through the defined paid_media_channels.

If a channel exists in the merged historical data, the script calculates the mean of the available coefficients (mean_mod_avg_model1 and mean_mod_avg_model2, with NaNs handled so that a missing value in one model does not invalidate the result). This mean becomes the suggested mu.
If a channel is not found in historical data, the DEFAULT_MU is used.
The WEAKLY_INFORMATIVE_SIGMA is assigned to all channels.
Prints progress messages, and a warning when the number of historical models N is less than 3.
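
The loop described above might look roughly like the sketch below. It is written under stated assumptions rather than copied from the script: the merged table is indexed by a hypothetical `channel` key, the channel names and values are invented, and the `suggestions` dictionary is an illustrative container. `WEAKLY_INFORMATIVE_SIGMA`, `DEFAULT_MU`, `num_historical_models`, and `paid_media_channels` are the names used in this guide.

```python
import warnings

import numpy as np
import pandas as pd

WEAKLY_INFORMATIVE_SIGMA = 2.0
DEFAULT_MU = 0.0
num_historical_models = 2

# Hypothetical merged historical table, indexed by channel variable name.
merged = pd.DataFrame(
    {
        "channel": ["tv", "search", "social"],
        "mean_mod_avg_model1": [0.8, 1.5, 0.4],
        "mean_mod_avg_model2": [1.2, 1.1, np.nan],
    }
).set_index("channel")

# (display_name, channel_variable_name); order must match the YAML order.
paid_media_channels = [
    ("TV", "tv"),
    ("Paid Search", "search"),
    ("Social", "social"),
    ("Podcast", "podcast"),  # no historical data in this example
]

if num_historical_models < 3:
    warnings.warn("Fewer than 3 historical models: mu is based on very little data.")

coef_cols = ["mean_mod_avg_model1", "mean_mod_avg_model2"]
suggestions = {}
for display_name, var_name in paid_media_channels:
    if var_name in merged.index:
        # Mean over the available (non-NaN) historical coefficients.
        mu = merged.loc[var_name, coef_cols].mean(skipna=True)
        if pd.isna(mu):  # channel listed but with no usable values
            mu = DEFAULT_MU
    else:
        mu = DEFAULT_MU
    suggestions[var_name] = {"mu": float(mu), "sigma": WEAKLY_INFORMATIVE_SIGMA}
    print(f"{display_name}: mu={float(mu):.3f}, sigma={WEAKLY_INFORMATIVE_SIGMA}")
```

In this sketch a channel observed in only one model simply takes that model's coefficient as its mu, and a channel with no history falls back to `DEFAULT_MU`.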

7. Prepare Output for YAML

Creates lists (mu_list, sigma_list, prior_comments) ordered according to paid_media_channels.

8. Output Results

Prints the suggested mu for each channel and then prints the full suggested beta_channel prior section in YAML format, using Normal distribution with the generated mu list and the fixed sigma list.
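
As a rough illustration of steps 7 and 8, the sketch below builds the ordered lists and prints a YAML-style block. The exact YAML schema expected by your model configuration is not specified in this guide, so the `dist` and `kwargs` keys are assumptions, as are the channel names and mu values. Adjust the layout to whatever your configuration file actually uses.

```python
WEAKLY_INFORMATIVE_SIGMA = 2.0

# (display_name, channel_variable_name); hypothetical channels.
paid_media_channels = [("TV", "tv"), ("Paid Search", "search"), ("Podcast", "podcast")]

# Hypothetical suggested mu values from the previous step.
suggested_mu = {"tv": 1.0, "search": 1.3, "podcast": 0.0}

# Build the lists in the same order as paid_media_channels.
mu_list = [round(suggested_mu[var], 4) for _, var in paid_media_channels]
sigma_list = [WEAKLY_INFORMATIVE_SIGMA] * len(paid_media_channels)
prior_comments = [f"{name} ({var})" for name, var in paid_media_channels]

# Assumed YAML layout: the key names below are illustrative only.
yaml_lines = [
    "beta_channel:",
    "  dist: Normal",
    "  kwargs:",
    f"    # order: {', '.join(prior_comments)}",
    f"    mu: {mu_list}",
    f"    sigma: {sigma_list}",
]
yaml_text = "\n".join(yaml_lines)
print(yaml_text)
```

Keeping the channel order in a comment next to the lists makes it easier to check, after pasting, that each mu lines up with the intended channel.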

Limitations

Reliance on Limited Data (N=2): The primary limitation is that only two historical data points per channel are used to calculate the mean (mu). This provides a central estimate, but one based on very little information, so the true coefficient could differ substantially from it.
Weakly Informative Sigma: The fixed sigma is a pragmatic choice, made because a variance estimated from N=2 data points is unreliable. It ensures the prior isn't overly narrow, but it does not capture any channel-specific variance from the historical data. Consider adjusting this sigma based on domain knowledge, or running sensitivity analyses with different sigma values.
Assumes Comparability: The script assumes the coefficients from the historical models are reasonably comparable to the context of the new model (e.g., similar market conditions, target variable, model structure).
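
To make the first two limitations concrete, the short sketch below uses two made-up coefficients. With only two points, the sample standard deviation is just the absolute difference between them divided by the square root of 2, so it carries no information beyond that single gap. The second part lists the central 95% interval of a Normal prior (mu plus or minus 1.96 sigma) for a few candidate sigma values, as a starting point for a sensitivity analysis. The coefficient values and the sigma grid are illustrative assumptions.

```python
import numpy as np

# Two hypothetical historical coefficients for one channel.
coefs = np.array([0.8, 1.2])

# Sample standard deviation (ddof=1) from two points...
sample_sd = np.std(coefs, ddof=1)
# ...equals |x1 - x2| / sqrt(2): it is determined by a single difference.
gap_form = abs(coefs[0] - coefs[1]) / np.sqrt(2)
print(f"sample sd = {sample_sd:.4f}, |x1 - x2| / sqrt(2) = {gap_form:.4f}")

# Central 95% interval of Normal(mu, sigma) for several candidate sigmas.
mu = coefs.mean()
intervals = {}
for sigma in (0.5, 1.0, 2.0, 4.0):
    intervals[sigma] = (mu - 1.96 * sigma, mu + 1.96 * sigma)
    lo, hi = intervals[sigma]
    print(f"sigma={sigma}: 95% prior interval = [{lo:.2f}, {hi:.2f}]")
```

Comparing these intervals with the range of coefficients you consider plausible is one simple way to judge whether the default sigma is too wide or too narrow for a given channel.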

Usage Instructions

Update Historical Data: Modify the data_model1, data_model2 dictionaries (or load data from files) with your actual historical model coefficients. Update num_historical_models if necessary. (Script location: prior_calibration/priors_calibration.py)
Define Target Channels: Ensure the paid_media_channels list in the script matches the channels and order required for your new model’s YAML configuration.
Configure Sigma/Mu: Adjust WEAKLY_INFORMATIVE_SIGMA and DEFAULT_MU in the script if desired.
Run the Script: Execute python prior_calibration/priors_calibration.py.
Review Suggestions: Examine the printed output.
Update YAML: Copy the generated beta_channel section (using Normal prior) into your model’s YAML configuration file, replacing any existing beta_channel prior.

Customisation

Sigma Value: Change WEAKLY_INFORMATIVE_SIGMA in the script for a different prior width.
Default Mu: Modify DEFAULT_MU in the script for missing channels.
Historical Data Source: Adapt the script to load data from CSV files or databases instead of hardcoded dictionaries.
Prior Distribution: The output is formatted for a Normal prior, but you can adapt the final print statement for a different prior type (e.g., HalfNormal, LogNormal), provided you supply the parameters that distribution expects. Note that HalfNormal is often preferred for beta_channel if you assume channel effects must be non-negative, but it takes only a sigma (scale) parameter, so the script's mu calculation would not carry over directly.
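
For the historical data source, one possible adaptation is to read each model's coefficients with `pandas.read_csv` instead of building them from dictionaries. The sketch below uses in-memory text in place of real files so that it runs on its own. The file layout (a `channel` column plus one coefficient column per model) and the values are assumptions for illustration. In practice you would pass a file path to `read_csv`.

```python
import io

import pandas as pd

# Stand-ins for CSV files on disk; replace io.StringIO(...) with a file path.
csv_model1 = io.StringIO("channel,mean_mod_avg_model1\ntv,0.8\nsearch,1.5\n")
csv_model2 = io.StringIO("channel,mean_mod_avg_model2\ntv,1.2\ndisplay,0.3\n")

df_model1 = pd.read_csv(csv_model1)
df_model2 = pd.read_csv(csv_model2)

# Same outer merge as with the hardcoded dictionaries.
merged = pd.merge(df_model1, df_model2, on="channel", how="outer")
print(merged)
```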