Guide: Lift Test Calibration
This guide explains how to use experimental lift test data to calibrate and inform the saturation curves within the Abacus Media Mix Model.
Introduction
While MMMs estimate channel effects from observational time-series data, incorporating results from causal experiments (like geo-based lift tests or randomised controlled trials) can significantly improve the model’s accuracy, particularly regarding the shape of the channel response (saturation) curves.
Lift tests provide direct measurements of the incremental impact (delta_y) resulting from a specific change in marketing activity (delta_x) at a certain baseline level (x). By feeding this information into Abacus, the model can adjust its estimated saturation parameters (e.g., lam) and effectiveness coefficients (beta_channel) to be more consistent with these causal observations.
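To make this concrete, here is a minimal sketch of how a lift test observation relates to a saturation curve. The logistic form, `lam`, and `beta` values below are illustrative assumptions; Abacus's actual curve and parameterisation may differ.

```python
import math

def logistic_saturation(x: float, lam: float) -> float:
    """Illustrative saturation curve mapping activity to (0, 1); an assumed form."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))

def expected_lift(x: float, delta_x: float, lam: float, beta: float) -> float:
    """The delta_y implied by moving activity from x to x + delta_x on the curve."""
    return beta * (logistic_saturation(x + delta_x, lam) - logistic_saturation(x, lam))

# The same delta_x yields less lift at a higher baseline: diminishing returns
lift_low_base = expected_lift(x=1.0, delta_x=1.0, lam=1.0, beta=100.0)
lift_high_base = expected_lift(x=3.0, delta_x=1.0, lam=1.0, beta=100.0)
```

Because each experiment pins down one point on this incremental-difference surface, several lift tests at different baselines jointly constrain both the curvature (`lam`) and the scale (`beta_channel`) of the channel response.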
Lift Test Data Format
You need to prepare a pandas DataFrame containing the results of your lift tests. Each row represents a single experiment or observation point. The required columns are:
- **Coordinate Columns**: Columns that match the coordinates used in your main MMM (e.g., `channel` if your lift test provides results per channel). These are essential for aligning the lift test data with the correct model parameters.
- `x`: The baseline spend or activity level before the intervention in the experiment. This should be in the same units as your channel's `impressions_col` data, before any scaling applied during preprocessing.
- `delta_x`: The change in spend or activity level applied during the experiment (e.g., the increase in impressions delivered to the test group). Same units as `x`.
- `delta_y`: The observed incremental lift in the target variable resulting from `delta_x`. This should be in the same units as your `target_col` data, before any scaling.
- `sigma`: The standard error or uncertainty associated with the `delta_y` measurement. This reflects the precision of your experimental result.
Example DataFrame (`df_lift_test`):
| channel | x | delta_x | delta_y | sigma |
|---|---|---|---|---|
| TV | 1000000 | 500000 | 50 | 10 |
| Radio | 50000 | 25000 | 15 | 3 |
| TV | 1500000 | 500000 | 45 | 9 |
| … | … | … | … | … |
Important: Ensure the values in x, delta_x, delta_y, and sigma are on the original scale of your data, before any scaling (like MaxAbsScaler or StandardScaler) is applied by the Abacus preprocessing steps. The lift test integration logic internally handles the comparison with the model’s potentially scaled parameters.
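For reference, the example table above can be built directly in pandas, with all values on the original, unscaled axes as required:

```python
import pandas as pd

# Lift test results on the original (unscaled) axes, one row per experiment
df_lift_test = pd.DataFrame({
    "channel": ["TV", "Radio", "TV"],
    "x":       [1_000_000, 50_000, 1_500_000],  # baseline impressions before the test
    "delta_x": [500_000, 25_000, 500_000],      # incremental impressions delivered
    "delta_y": [50, 15, 45],                    # observed incremental lift in the target
    "sigma":   [10, 3, 9],                      # standard error of delta_y
})
```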
Integrating Lift Data into the Model
Unlike configuration parameters set in YAML, the lift test data (as a pandas DataFrame) needs to be passed directly during the initialisation of the model class, typically within the driver script (e.g., demo/runme.py).
Load Data: Load your prepared lift test CSV into a pandas DataFrame within your driver script.
```python
import pandas as pd

# Example: Load lift test data
df_lift_test = pd.read_csv("path/to/your/lift_data.csv")
```
Initialise Model: Pass the DataFrame to the `df_lift_test` parameter when creating the model instance (e.g., `DelayedSaturatedMMM` or `MMM`).

```python
# Example within a driver script context
from abacus.core.mmm_model import DelayedSaturatedMMM  # Or your specific model class

# ... load config, data etc. ...

# Initialise the model, passing the lift test DataFrame
mmm = DelayedSaturatedMMM(
    # ... other parameters like date_column, channel_columns, config ...
    df_lift_test=df_lift_test,  # Pass the loaded DataFrame here
)

# ... proceed with fitting the model using driver.fit(mmm, ...) ...
```
Mechanism: How it Works
When `df_lift_test` is provided, the `build_model` method in `abacus.core.mmm_base` adds an extra likelihood component (using `add_lift_measurements_to_likelihood_from_saturation`).
For each row in your lift test data, the model:
1. Calculates the expected lift (`model_estimated_lift`) based on its current estimate of the saturation curve parameters (e.g., `alpha`, `lam`, `beta_channel`) for the given channel, `x`, and `delta_x`.
2. Compares this `model_estimated_lift` to the observed `delta_y` from your experiment.
3. Adds a likelihood term (a `pm.Gamma` distribution by default) that penalises deviations between the observed and estimated lift, weighted by the provided `sigma`.
This process encourages the model’s posterior distributions for the saturation and effectiveness parameters to align with the causal evidence from your lift tests.
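As a rough illustration of that pull, the sketch below grid-searches the `lam` value most consistent with a single observed lift point. The logistic saturation form and all numbers here are illustrative assumptions, not Abacus's exact curve; the real model explores this trade-off jointly with `beta_channel` via MCMC rather than a grid.

```python
import math

def logistic_saturation(x: float, lam: float) -> float:
    """Illustrative saturation curve (an assumed form, not Abacus's exact one)."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))

def expected_lift(x: float, delta_x: float, lam: float, beta: float) -> float:
    """Incremental response implied by moving activity from x to x + delta_x."""
    return beta * (logistic_saturation(x + delta_x, lam) - logistic_saturation(x, lam))

# One observed lift point (illustrative numbers), with beta held fixed
x, delta_x, delta_y, beta = 3.0, 1.0, 5.9, 100.0

# Grid-search the lam most consistent with this observation; loosely, this is
# the value the extra likelihood term pulls the posterior towards
grid = [0.5 + i * 0.001 for i in range(2501)]
best_lam = min(grid, key=lambda lam: (expected_lift(x, delta_x, lam, beta) - delta_y) ** 2)
```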
Interpreting the Impact
To see if the lift test data influenced the model:
- **Compare Posteriors**: Run the model once with the `df_lift_test` and once without it (setting `df_lift_test=None`). Compare the posterior distributions (e.g., using `model_trace.png`, `posterior_distributions.png`, `model_summary.csv`) for the relevant parameters (`alpha`, `lam`, `beta_channel`) between the two runs. Lift data should ideally lead to tighter, more informed posteriors for these parameters, especially `lam`.
- **Examine Response Curves**: Compare the `response_curves.png` generated from both runs. The run incorporating lift data should produce saturation curves that better reflect the observed experimental lift points.
- **Check Diagnostics**: Ensure model convergence diagnostics (R-hat, ESS) are still good after adding the lift test data.
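One hedged way to quantify the posterior comparison: assuming `model_summary.csv` follows the usual ArviZ layout, with parameter names as the index and an `sd` column (the exact layout of Abacus's summary file may differ), the shrinkage ratio per parameter can be computed with pandas. The tiny DataFrames below stand in for the two loaded CSV files.

```python
import pandas as pd

def posterior_shrinkage(summary_with: pd.DataFrame, summary_without: pd.DataFrame,
                        param_substring: str = "lam") -> pd.Series:
    """Ratio of posterior sd with lift data to sd without it; < 1 means tighter."""
    ratio = summary_with["sd"] / summary_without["sd"]
    return ratio.filter(like=param_substring)

# Illustrative stand-ins for the two model_summary.csv files
summary_with = pd.DataFrame({"sd": [0.05, 0.30]}, index=["lam[TV]", "beta_channel[TV]"])
summary_without = pd.DataFrame({"sd": [0.20, 0.35]}, index=["lam[TV]", "beta_channel[TV]"])

shrinkage = posterior_shrinkage(summary_with, summary_without)
# A lam[TV] ratio well below 1 means the lift data tightened the saturation parameter
```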
Considerations and Limitations
- **Data Quality**: The quality of the lift test data is paramount. Inaccurate or noisy experimental results can bias your model. Ensure `delta_y` and `sigma` accurately reflect the experimental outcome and its uncertainty.
- **Coordinate Alignment**: The coordinate columns in your `df_lift_test` (e.g., `channel`) must exactly match the values used in your main dataset and model configuration.
- **Scaling**: The `lift_test.py` module includes functions like `scale_lift_measurements`, but these are not automatically applied when passing `df_lift_test` to the model initialiser. The current integration assumes `x`, `delta_x`, `delta_y`, and `sigma` are provided on their original scales. If your lift test results relate to scaled inputs/outputs, you would need to adjust the data or the integration logic manually.
- **Model Comparability**: Ensure the conditions under which the lift test was run are reasonably comparable to the period covered by your main MMM dataset. Large differences in market conditions could reduce the validity of the calibration.
- **Likelihood Distribution**: The default likelihood used to compare observed vs. estimated lift is `pm.Gamma`. This could be customised if needed, but requires modifying the underlying code (`lift_test.py`).
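For intuition on that penalty, the sketch below evaluates a Gamma log-density parameterised by mean and standard deviation, mirroring the `pm.Gamma(mu=..., sigma=...)` parameterisation. It only illustrates the shape of the penalty, not Abacus's exact implementation.

```python
import math

def gamma_loglik(observed_lift: float, estimated_lift: float, sigma: float) -> float:
    """Log-density of the observed lift under a Gamma centred on the model's estimate.

    Uses the mean/sd parameterisation: shape = (mu/sigma)^2, scale = sigma^2/mu,
    so the distribution has mean mu = estimated_lift and sd = sigma.
    """
    mu = estimated_lift
    shape = (mu / sigma) ** 2
    scale = sigma ** 2 / mu
    x = observed_lift
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

# The closer the model's estimated lift is to the observed 50, the higher
# the log-likelihood, so sampling favours parameters that reproduce the test
close = gamma_loglik(observed_lift=50.0, estimated_lift=48.0, sigma=10.0)
far = gamma_loglik(observed_lift=50.0, estimated_lift=20.0, sigma=10.0)
```

A smaller `sigma` sharpens this density, so precise experiments pull the posterior harder than noisy ones.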