Lift Test Integration (lift_test.py)

This module provides functions that add empirical lift test data to a PyMC model’s likelihood, so that the model can be calibrated against experimental results. It also includes helper functions for validating, indexing, and scaling lift test data.

Core Functions

add_lift_measurements_to_likelihood_from_saturation

def add_lift_measurements_to_likelihood_from_saturation(
    df_lift_test: pd.DataFrame,
    saturation: SaturationTransformation,
    time_varying_var_name: Optional[str] = None,
    model: Optional[pm.Model] = None,
    dist: Type[pm.Distribution] = pm.Gamma,
    name: str = "lift_measurements",
    get_indices: Callable[[pd.DataFrame, pm.Model], Indices] = exact_row_indices,
    variable_indexer_factory: Callable[[pm.Model, Indices], VariableIndexer] = create_variable_indexer,
) -> None:

Adds lift test observations to the likelihood of a PyMC model based on a defined saturation transformation. This function connects empirical lift data (df_lift_test) to the model’s estimated saturation curve.

Parameters:

  • df_lift_test (pd.DataFrame): DataFrame containing the lift test results. Must include columns:

    • Coordinates matching the model’s dimensions (e.g., ‘channel’, ‘geo’).

    • x: Spend level before the lift test intervention.

    • delta_x: Change in spend during the lift test.

    • delta_y: Observed change in the target variable during the lift test.

    • sigma: Standard deviation or uncertainty associated with delta_y.

  • saturation (SaturationTransformation): An object conforming to the SaturationTransformation protocol (defined in abacus.core.transformers), providing the saturation function and its parameter mapping.

  • time_varying_var_name (Optional[str], optional): Name of a model variable that modulates the saturation effect over time (if applicable). Defaults to None.

  • model (Optional[pm.Model], optional): The PyMC model context. If None, uses the current context. Defaults to None.

  • dist (Type[pm.Distribution], optional): The PyMC distribution to use for the lift test likelihood term (comparing observed delta_y to model-estimated lift). Defaults to pm.Gamma.

  • name (str, optional): Name for the likelihood variable added to the model. Defaults to "lift_measurements".

  • get_indices (Callable, optional): Function to map DataFrame rows to model coordinate indices. Defaults to exact_row_indices.

  • variable_indexer_factory (Callable, optional): Function to create a variable indexer based on model and indices. Defaults to create_variable_indexer.
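
As a rough illustration of what the likelihood term compares, the sketch below assumes that the model-estimated lift is the change in the saturation curve between x and x + delta_x. The logistic_saturation function, its lam and beta parameters, and all example values are hypothetical and are not taken from abacus.

```python
import numpy as np
import pandas as pd

# Hypothetical lift test results; column names follow the list above.
df_lift_test = pd.DataFrame(
    {
        "channel": ["tv", "search"],
        "x": [1.0, 2.0],
        "delta_x": [0.5, 1.0],
        "delta_y": [0.10, 0.05],
        "sigma": [0.02, 0.01],
    }
)


def logistic_saturation(x, lam, beta):
    """Illustrative saturation curve (an assumption, not the abacus implementation)."""
    return beta * (1 - np.exp(-lam * x)) / (1 + np.exp(-lam * x))


def estimated_lift(x, delta_x, **params):
    """Change in the saturated response when spend moves from x to x + delta_x."""
    return logistic_saturation(x + delta_x, **params) - logistic_saturation(x, **params)


# Assumed per-row parameter values, for example one value per channel.
lam = np.array([1.0, 0.5])
beta = np.array([1.0, 1.0])

lift = estimated_lift(
    df_lift_test["x"].to_numpy(),
    df_lift_test["delta_x"].to_numpy(),
    lam=lam,
    beta=beta,
)
```

In the real function the saturation parameters are model variables rather than fixed arrays, so the estimated lift is a random quantity that the likelihood compares with delta_y, using sigma as the measurement uncertainty.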

add_saturation_observations

def add_saturation_observations(
    df_lift_test: pd.DataFrame,
    variable_mapping: VariableMapping,
    saturation_function: SaturationFunc,
    model: Optional[pm.Model] = None,
    dist: Type[pm.Distribution] = pm.Gamma,
    name: str = "lift_measurements",
    get_indices: Callable[[pd.DataFrame, pm.Model], Indices] = exact_row_indices,
    variable_indexer_factory: Callable[[pm.Model, Indices], VariableIndexer] = create_variable_indexer,
) -> None:

Lower-level function called by add_lift_measurements_to_likelihood_from_saturation. It directly adds the likelihood term comparing the model-estimated lift (derived from the saturation_function and its parameters indexed via variable_mapping) to the observed delta_y from the lift test data.

Parameters:

  • df_lift_test (pd.DataFrame): Lift test data (see above).

  • variable_mapping (VariableMapping): Dictionary mapping saturation function parameter names to model variable names.

  • saturation_function (SaturationFunc): The callable saturation function.

  • model, dist, name, get_indices, variable_indexer_factory: Same as in add_lift_measurements_to_likelihood_from_saturation.
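
One way to picture the role of variable_mapping: each saturation function parameter is looked up under its mapped model variable name and passed to the function as a keyword argument. In this sketch a plain dictionary of NumPy arrays stands in for the model, and michaelis_menten, the variable names, and the values are all hypothetical.

```python
import numpy as np


def michaelis_menten(x, alpha, lam):
    """Illustrative saturation function (hypothetical, for this sketch only)."""
    return alpha * x / (lam + x)


# Maps saturation function parameter names to (hypothetical) model variable names.
variable_mapping = {"alpha": "saturation_alpha", "lam": "saturation_lam"}

# Stand-in for model variables, one value per channel.
model_variables = {
    "saturation_alpha": np.array([2.0, 4.0]),
    "saturation_lam": np.array([1.0, 3.0]),
}

# Row i of the lift test belongs to channel channel_idx[i].
channel_idx = np.array([1, 0, 1])
x = np.array([1.0, 1.0, 3.0])
delta_x = np.array([2.0, 1.0, 3.0])

# Resolve each function parameter to its per-row values.
kwargs = {
    param: model_variables[var_name][channel_idx]
    for param, var_name in variable_mapping.items()
}

# Lift implied by the curve for each lift test row.
model_lift = michaelis_menten(x + delta_x, **kwargs) - michaelis_menten(x, **kwargs)
```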

Scaling Functions

scale_lift_measurements

def scale_lift_measurements(
    df_lift_test: pd.DataFrame,
    channel_col: str,
    channel_columns: List[Union[str, int]],
    channel_transform: Callable[[np.ndarray], np.ndarray],
    target_transform: Callable[[np.ndarray], np.ndarray],
) -> pd.DataFrame:

Scales the relevant columns (x, delta_x, delta_y, sigma) in the lift test DataFrame using the provided transformation functions (typically the fitted scalers from the main MMM preprocessing). This is necessary when the main MMM is fit on scaled data, so that the lift test values are on the same scale as the model’s inputs and target.

Parameters:

  • df_lift_test (pd.DataFrame): The original lift test DataFrame.

  • channel_col (str): The name of the column in df_lift_test that identifies the channel.

  • channel_columns (List[Union[str, int]]): List of all channel names used in the main model (for alignment during scaling).

  • channel_transform (Callable): The transformation function (e.g., scaler.transform) used for channel spend/features in the main model.

  • target_transform (Callable): The transformation function used for the target variable in the main model.

Returns:

  • pd.DataFrame: A new DataFrame with scaled lift test data.
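
The effect of scaling can be sketched with plain division by per-channel and target constants. This is a simplified stand-in that does not reproduce the signature of scale_lift_measurements; channel_max, target_max, and the data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical scaling constants standing in for fitted scalers.
channel_max = {"tv": 100.0, "search": 50.0}
target_max = 1000.0

df_lift_test = pd.DataFrame(
    {
        "channel": ["tv", "search"],
        "x": [50.0, 10.0],
        "delta_x": [10.0, 5.0],
        "delta_y": [100.0, 20.0],
        "sigma": [10.0, 5.0],
    }
)


def target_transform(values: np.ndarray) -> np.ndarray:
    """Max-abs style scaling of the target (an assumption for this sketch)."""
    return values / target_max


# Scale the channel columns by each row's own channel constant.
row_scale = df_lift_test["channel"].map(channel_max)

# assign returns a new DataFrame and leaves df_lift_test unchanged.
df_scaled = df_lift_test.assign(
    x=df_lift_test["x"] / row_scale,
    delta_x=df_lift_test["delta_x"] / row_scale,
    delta_y=target_transform(df_lift_test["delta_y"].to_numpy()),
    sigma=target_transform(df_lift_test["sigma"].to_numpy()),
)
```

A pure rescaling like this applies equally to levels (x) and differences (delta_x, delta_y, sigma). A transform that also shifts values would not map the difference columns correctly.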

scale_channel_lift_measurements

def scale_channel_lift_measurements(...) -> pd.DataFrame:

Helper function used by scale_lift_measurements to scale the channel-related columns (x, delta_x).

scale_target_for_lift_measurements

def scale_target_for_lift_measurements(...) -> pd.Series:

Helper function used by scale_lift_measurements to scale a target-related column (delta_y or sigma). It returns the scaled values as a pd.Series.

Helper Functions & Classes

exact_row_indices

def exact_row_indices(df: pd.DataFrame, model: pm.Model) -> Indices:

Maps rows in a DataFrame (df) containing coordinate values to the corresponding integer indices within a PyMC model’s coordinates. Raises UnalignedValuesError if coordinate values in df are not found in the model’s coordinates, or KeyError if a column of df does not correspond to a model coordinate.
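
The mapping from coordinate labels to integer positions can be sketched without PyMC. Here a plain dictionary stands in for the model’s coordinates, row_indices is a hypothetical name, ValueError stands in for UnalignedValuesError, and the checks shown are a simplified guess at the behaviour described above.

```python
import numpy as np
import pandas as pd

# Stand-in for a model's coordinates: dimension name -> ordered labels.
coords = {"channel": ["tv", "radio", "search"], "geo": ["north", "south"]}

df = pd.DataFrame({"channel": ["search", "tv"], "geo": ["south", "south"]})


def row_indices(df: pd.DataFrame, coords: dict) -> dict:
    """Map each coordinate column to integer positions in `coords`."""
    indices = {}
    for col in df.columns:
        if col not in coords:
            raise KeyError(f"{col!r} is not a coordinate of the model")
        lookup = {label: i for i, label in enumerate(coords[col])}
        missing = sorted(set(df[col]) - set(lookup))
        if missing:
            # The documented function raises UnalignedValuesError;
            # ValueError stands in for it in this sketch.
            raise ValueError(f"values {missing} not found in coordinate {col!r}")
        indices[col] = np.array([lookup[v] for v in df[col]])
    return indices


indices = row_indices(df, coords)
```

Each entry of indices holds one integer per lift test row, which is the form needed to pick the matching element out of a model variable defined over that dimension.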

create_variable_indexer

def create_variable_indexer(model: pm.Model, indices: Indices) -> VariableIndexer:

Creates and returns a function (VariableIndexer) that, when given a variable name, returns the variable indexed according to the provided indices. Used within add_saturation_observations.
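
The indexer is essentially a closure over the model and the row indices. The sketch below uses NumPy arrays and a dims dictionary as stand-ins for model variables; make_indexer and the variable names are hypothetical.

```python
import numpy as np

# Stand-ins for model variables and the dimensions each one is defined over.
variables = {
    "saturation_beta": np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]),  # (channel, geo)
    "saturation_lam": np.array([1.0, 2.0, 3.0]),  # (channel,)
}
variable_dims = {
    "saturation_beta": ("channel", "geo"),
    "saturation_lam": ("channel",),
}

# One integer index per lift test row and dimension.
indices = {"channel": np.array([2, 0]), "geo": np.array([1, 1])}


def make_indexer(variables, variable_dims, indices):
    """Return a function that picks one value per lift test row from a named variable."""

    def indexer(name):
        # Use only the index arrays for the dimensions this variable has.
        idx = tuple(indices[dim] for dim in variable_dims[name])
        return variables[name][idx]

    return indexer


indexer = make_indexer(variables, variable_dims, indices)
beta_per_row = indexer("saturation_beta")
lam_per_row = indexer("saturation_lam")
```

Selecting indices per variable dimension lets the same lift test rows index variables of different shapes, such as a per-channel parameter and a per-channel, per-geo parameter.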

create_time_varying_saturation

def create_time_varying_saturation(...) -> tuple[SaturationFunc, VariableMapping]:

Wraps a base SaturationTransformation to include an additional time-varying modulation effect, returning the modified saturation function and variable mapping.
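
A minimal sketch of such a wrapper, assuming the time-varying variable acts as a multiplicative factor on the saturation output. The names wrap_time_varying, linear_saturation, and media_multiplier, and the "time_varying" mapping key, are hypothetical.

```python
def wrap_time_varying(saturation_function, variable_mapping, time_varying_var_name):
    """Return a (function, mapping) pair with an extra multiplicative factor."""

    def function(x, time_varying, **kwargs):
        # Assumption: the modulation scales the saturated response.
        return time_varying * saturation_function(x, **kwargs)

    new_mapping = {**variable_mapping, "time_varying": time_varying_var_name}
    return function, new_mapping


def linear_saturation(x, beta):
    """Trivial stand-in for a saturation function."""
    return beta * x


function, mapping = wrap_time_varying(
    linear_saturation,
    {"beta": "saturation_beta"},
    time_varying_var_name="media_multiplier",
)

value = function(2.0, time_varying=1.5, beta=0.5)
```

Returning the pair in this form means the result can be passed on as the saturation_function and variable_mapping arguments of add_saturation_observations.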

assert_is_subset

def assert_is_subset(required: set[str], available: set[str]) -> None:

Raises MissingValueError if the available set does not contain all elements from the required set.
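
The check amounts to a set difference. In this sketch check_is_subset is a hypothetical name, and the error class is redefined locally so the example is self-contained.

```python
class MissingValueError(KeyError):
    """Stand-in mirroring the error class documented in this module."""


def check_is_subset(required: set, available: set) -> None:
    """Raise if any required value is absent from the available set."""
    missing = required - available
    if missing:
        raise MissingValueError(f"missing required values: {sorted(missing)}")


required_columns = {"x", "delta_x", "delta_y", "sigma"}

# Passes silently: every required column is present.
check_is_subset(required_columns, {"channel", "x", "delta_x", "delta_y", "sigma"})

# Fails: delta_y and sigma are absent.
try:
    check_is_subset(required_columns, {"channel", "x", "delta_x"})
except MissingValueError as err:
    message = str(err)
```

Because MissingValueError subclasses KeyError, callers that already handle KeyError will also catch it.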

assert_monotonic

def assert_monotonic(delta_x: pd.Series, delta_y: pd.Series) -> None:

Raises NonMonotonicError if the signs of delta_x and delta_y are inconsistent (i.e., if an increase in spend leads to a decrease in the target, or vice versa), which violates the monotonicity assumption of standard saturation functions.
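
The sign check can be sketched as follows. check_monotonic is a hypothetical name, the error class is redefined locally, and treating a zero change as consistent is an assumption of this sketch.

```python
import numpy as np
import pandas as pd


class NonMonotonicError(ValueError):
    """Stand-in mirroring the error class documented in this module."""


def check_monotonic(delta_x: pd.Series, delta_y: pd.Series) -> None:
    """Raise if any row has delta_x and delta_y with opposite signs."""
    same_sign = np.sign(delta_x.to_numpy()) * np.sign(delta_y.to_numpy()) >= 0
    if not same_sign.all():
        raise NonMonotonicError("delta_x and delta_y must move in the same direction")


# Consistent signs: spend up with target up, spend down with target down.
check_monotonic(pd.Series([0.5, -0.2]), pd.Series([0.1, -0.05]))

# Inconsistent signs: spend up but target down.
try:
    check_monotonic(pd.Series([0.5]), pd.Series([-0.1]))
    rejected = False
except NonMonotonicError:
    rejected = True
```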

Error Classes

  • UnalignedValuesError(Exception): Raised by exact_row_indices when coordinate values in the DataFrame don’t match the model’s coordinates.

  • MissingValueError(KeyError): Raised by assert_is_subset when required values are missing.

  • NonMonotonicError(ValueError): Raised by assert_monotonic when the lift test data shows a non-monotonic relationship between delta_x and delta_y.