Abacus Core Mixins: MMM Calibrate (mmm_calibrate.py)

This module provides the MMMCalibrateMixin class, which Marketing Mix Model (MMM) classes in Abacus inherit to incorporate external experimental data, such as lift tests, directly into the model’s likelihood function. Calibrating against these experiments constrains the model parameters with ground-truth results, which can improve estimates that the time-series data alone identify poorly.

MMMCalibrateMixin Class

This mixin assumes that the inheriting class provides the following:

  • model: The PyMC model object (pm.Model).

  • channel_columns: A list of channel names used in the model.

  • channel_transformer: The fitted scaler object used for channel data.

  • target_transformer: The fitted scaler object used for the target variable.

  • Model parameters defined within the PyMC model context, specifically those related to channel saturation (e.g., alpha, lam) and effectiveness (e.g., beta_channel).

Methods

add_lift_test_measurements

def add_lift_test_measurements(
    self,
    df_lift_test: pd.DataFrame,
    dist: Any = pm.Gamma,
    name: str = "lift_measurements",
) -> None:

Adds likelihood terms to the PyMC model based on provided lift test measurements.

This method takes a DataFrame containing results from lift tests (controlled experiments measuring the incremental impact of changing spend in a specific channel) and incorporates them into the model fitting process. It does this by:

  1. Scaling: Transforming the lift test’s spend levels (x, delta_x), observed outcome lift (delta_y), and uncertainty (sigma) with the same fitted transformers applied to the main model’s channel and target data, so that the lift test quantities are on the same scale as the quantities the model parameters describe.

  2. Calculating Model-Implied Lift: For each lift test observation, calculating the expected lift (model_delta_y) based on the difference in the model’s saturation curve between the baseline spend (x_scaled) and the increased spend (x_scaled + delta_x_scaled). This calculation uses the current values of the model’s saturation parameters (e.g., alpha, lam) and channel effectiveness (beta_channel) within the PyMC model context. Note: The current implementation appears to use a sigmoid function combined with beta_channel to represent the saturated effect.

  3. Adding Likelihood: Defining a new likelihood term within the PyMC model using the specified distribution (dist). This term compares the scaled observed lift (delta_y_scaled) from the experiment to the model-implied lift (model_delta_y), using the scaled observed uncertainty (sigma_scaled) to define the likelihood’s shape (e.g., as the standard deviation for pm.Normal or used to derive parameters for pm.Gamma).

By adding this likelihood, the model fitting process (e.g., MCMC sampling) is encouraged to find parameter values that not only fit the main time-series data but also align with the results observed in the lift tests.
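
Step 2 can be illustrated with a minimal sketch of the model-implied lift. It assumes a logistic (sigmoid-style) saturation curve with a single rate parameter lam, scaled by beta_channel. The function names are hypothetical, the role of alpha is not specified here and is left out, and plain floats stand in for the PyMC random variables the real method would operate on.

```python
import math


def logistic_saturation(x: float, lam: float) -> float:
    """Assumed sigmoid-style saturation curve: 0 at x = 0, approaching 1 as x grows."""
    return (1.0 - math.exp(-lam * x)) / (1.0 + math.exp(-lam * x))


def model_implied_lift(
    x_scaled: float, delta_x_scaled: float, lam: float, beta_channel: float
) -> float:
    """Difference in saturated channel effect between baseline and increased spend."""
    before = beta_channel * logistic_saturation(x_scaled, lam)
    after = beta_channel * logistic_saturation(x_scaled + delta_x_scaled, lam)
    return after - before


# Illustrative values only (not taken from any real model or experiment).
lift_low_baseline = model_implied_lift(0.2, 0.1, lam=3.0, beta_channel=0.5)
lift_high_baseline = model_implied_lift(0.8, 0.1, lam=3.0, beta_channel=0.5)
```

Under this assumed curve, the same spend increase yields a smaller lift at the higher baseline, which is the diminishing-returns behavior that a lift test helps pin down.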

Parameters:

  • df_lift_test (pd.DataFrame): A DataFrame containing the lift test results. It must include the following columns:

    • channel: The name of the channel tested (must match a name in self.channel_columns).

    • x: The baseline spend level during the test.

    • delta_x: The change in spend applied during the test.

    • delta_y: The observed change (lift) in the target outcome variable.

    • sigma: The uncertainty (e.g., standard error) associated with the observed delta_y.

  • dist (pm.Distribution, optional): The PyMC distribution class to use for the likelihood term (e.g., pm.Normal, pm.Gamma). Defaults to pm.Gamma. When dist is pm.Gamma, the method converts the model-implied lift (mu) and the scaled sigma into the alpha and beta parameters that pm.Gamma requires. Because the Gamma distribution is supported only on positive values, this default suits lift tests with a positive observed delta_y.

  • name (str, optional): A unique name for the likelihood variable added to the PyMC model. Defaults to "lift_measurements".
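
The conversion from a mean and standard deviation to Gamma parameters mentioned for dist is commonly done by moment matching. The sketch below shows that standard conversion. Whether the method uses exactly this formula is an assumption, and the helper name gamma_params_from_mean_sd is hypothetical.

```python
def gamma_params_from_mean_sd(mu: float, sigma: float) -> tuple[float, float]:
    """Return (alpha, beta) of a Gamma distribution with mean mu and std. dev. sigma.

    Uses mean = alpha / beta and variance = alpha / beta**2.
    """
    if mu <= 0 or sigma <= 0:
        raise ValueError("mu and sigma must both be positive for a Gamma distribution")
    alpha = mu**2 / sigma**2  # shape
    beta = mu / sigma**2      # rate
    return alpha, beta


# Illustrative values only: a scaled lift of 0.5 with uncertainty 0.1.
alpha, beta = gamma_params_from_mean_sd(mu=0.5, sigma=0.1)
```

pm.Gamma also accepts mu and sigma directly, so an explicit conversion like this is mainly useful for checking what shape and rate a given lift and uncertainty imply.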

Raises:

  • RuntimeError: If the model (self.model) has not been built before calling this method, or if required model parameters (e.g., alpha, lam, beta_channel) are missing.

  • KeyError: If df_lift_test is missing any of the required columns.

  • AttributeError: If self.channel_transformer or self.target_transformer are missing or not fitted.

  • ValueError: If scaling fails, potentially because a channel listed in df_lift_test is not present in self.channel_columns.
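
A caller may want to check a lift test DataFrame against these conditions before passing it in. The following is a minimal sketch of such a pre-flight check, not the mixin’s own validation: the helper check_lift_test_frame, the channel names, and all numeric values are made up for illustration.

```python
import pandas as pd

REQUIRED_COLUMNS = ("channel", "x", "delta_x", "delta_y", "sigma")


def check_lift_test_frame(df_lift_test: pd.DataFrame, channel_columns: list[str]) -> None:
    """Raise early if the frame lacks required columns or names unknown channels."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df_lift_test.columns]
    if missing:
        raise KeyError(f"df_lift_test is missing columns: {missing}")
    unknown = sorted(set(df_lift_test["channel"]) - set(channel_columns))
    if unknown:
        raise ValueError(f"df_lift_test has channels not in the model: {unknown}")


# Hypothetical lift tests, one row per experiment, in original (unscaled) units.
df_lift_test = pd.DataFrame(
    {
        "channel": ["tv", "search"],
        "x": [1000.0, 500.0],        # baseline spend
        "delta_x": [200.0, 100.0],   # change in spend
        "delta_y": [35.0, 12.0],     # observed lift in the target
        "sigma": [8.0, 4.0],         # uncertainty of delta_y
    }
)

check_lift_test_frame(df_lift_test, channel_columns=["tv", "search", "social"])
```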

Notes:

  • This method should be called after defining the main model structure but before starting the fitting process (e.g., pm.sample()).

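  • The calibration is only as reliable as the lift test data: the tests should measure the same channels and target outcome as the model, and sigma should reflect the actual uncertainty of each measured delta_y.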

  • The current implementation calculates model-implied lift based on the saturation function directly and does not explicitly account for adstock effects within the lift calculation itself. Ensure the lift test design and interpretation align with this assumption.