Lift Test Integration (lift_test.py)
This module provides functions to incorporate empirical lift test data into the PyMC model’s likelihood, allowing for calibration of the model based on experimental results. It also includes helper functions for data validation, indexing, and scaling related to lift tests.
Core Functions
add_lift_measurements_to_likelihood_from_saturation
def add_lift_measurements_to_likelihood_from_saturation(
    df_lift_test: pd.DataFrame,
    saturation: SaturationTransformation,
    time_varying_var_name: Optional[str] = None,
    model: Optional[pm.Model] = None,
    dist: Type[pm.Distribution] = pm.Gamma,
    name: str = "lift_measurements",
    get_indices: Callable[[pd.DataFrame, pm.Model], Indices] = exact_row_indices,
    variable_indexer_factory: Callable[[pm.Model, Indices], VariableIndexer] = create_variable_indexer,
) -> None:
Adds lift test observations to the likelihood of a PyMC model based on a defined saturation transformation. This function connects empirical lift data (df_lift_test) to the model’s estimated saturation curve.
Parameters:
df_lift_test (pd.DataFrame): DataFrame containing the lift test results. Must include columns:
    Coordinates matching the model’s dimensions (e.g., ‘channel’, ‘geo’).
    x: Spend level before the lift test intervention.
    delta_x: Change in spend during the lift test.
    delta_y: Observed change in the target variable during the lift test.
    sigma: Standard deviation or uncertainty associated with delta_y.
saturation (SaturationTransformation): An object conforming to the SaturationTransformation protocol (defined in abacus.core.transformers), providing the saturation function and its parameter mapping.
time_varying_var_name (Optional[str], optional): Name of a model variable that modulates the saturation effect over time (if applicable). Defaults to None.
model (Optional[pm.Model], optional): The PyMC model context. If None, uses the current context. Defaults to None.
dist (Type[pm.Distribution], optional): The PyMC distribution to use for the lift test likelihood term (comparing observed delta_y to model-estimated lift). Defaults to pm.Gamma.
name (str, optional): Name for the likelihood variable added to the model. Defaults to "lift_measurements".
get_indices (Callable, optional): Function to map DataFrame rows to model coordinate indices. Defaults to exact_row_indices.
variable_indexer_factory (Callable, optional): Function to create a variable indexer based on model and indices. Defaults to create_variable_indexer.
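To make the expected input concrete, the sketch below builds a small lift test DataFrame with the columns listed above and computes the lift that a saturation curve would imply for each row. It assumes the model-estimated lift is the curve's response after the spend change minus its response before it. The michaelis_menten function, the channel names, and the alpha and lam values are hypothetical stand-ins; in the real function the curve and its parameters come from saturation and the PyMC model.

```python
import numpy as np
import pandas as pd

# Hypothetical lift test results: one row per experiment.
df_lift_test = pd.DataFrame(
    {
        "channel": ["tv", "search"],  # coordinate column matching a model dimension
        "x": [100.0, 50.0],           # spend level before the intervention
        "delta_x": [20.0, 10.0],      # change in spend during the test
        "delta_y": [8.0, 6.0],        # observed change in the target
        "sigma": [2.0, 1.5],          # uncertainty of delta_y
    }
)


def michaelis_menten(x, alpha, lam):
    """Illustrative saturation curve (an assumption, not the library's own)."""
    return alpha * x / (lam + x)


# Fixed illustrative parameter values, one per row (assumptions).
alpha = np.array([200.0, 120.0])
lam = np.array([300.0, 100.0])

x = df_lift_test["x"].to_numpy()
delta_x = df_lift_test["delta_x"].to_numpy()

# Lift implied by the curve: response after the change minus response before.
estimated_lift = michaelis_menten(x + delta_x, alpha, lam) - michaelis_menten(x, alpha, lam)
```

In the model, the parameters are random variables rather than fixed numbers, so the likelihood term pulls their posterior toward values whose implied lift agrees with delta_y.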
add_saturation_observations
def add_saturation_observations(
    df_lift_test: pd.DataFrame,
    variable_mapping: VariableMapping,
    saturation_function: SaturationFunc,
    model: Optional[pm.Model] = None,
    dist: Type[pm.Distribution] = pm.Gamma,
    name: str = "lift_measurements",
    get_indices: Callable[[pd.DataFrame, pm.Model], Indices] = exact_row_indices,
    variable_indexer_factory: Callable[[pm.Model, Indices], VariableIndexer] = create_variable_indexer,
) -> None:
Lower-level function called by add_lift_measurements_to_likelihood_from_saturation. It directly adds the likelihood term comparing the model-estimated lift (derived from the saturation_function and its parameters indexed via variable_mapping) to the observed delta_y from the lift test data.
Parameters:
df_lift_test (pd.DataFrame): Lift test data (see above).
variable_mapping (VariableMapping): Dictionary mapping saturation function parameter names to model variable names.
saturation_function (SaturationFunc): The callable saturation function.
model, dist, name, get_indices, variable_indexer_factory: Same as in add_lift_measurements_to_likelihood_from_saturation.
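The likelihood term can be illustrated outside PyMC. The sketch below assumes the default Gamma distribution is parameterized by its mean (the model-estimated lift) and its standard deviation (sigma), a parameterization that pm.Gamma supports through mu and sigma. Whether add_saturation_observations passes its values in exactly this way is an assumption, and gamma_logpdf_mu_sigma is a hypothetical helper written for this example.

```python
import math


def gamma_logpdf_mu_sigma(value, mu, sigma):
    """Log density of a Gamma distribution given its mean and standard deviation."""
    alpha = (mu / sigma) ** 2  # shape
    beta = mu / sigma**2       # rate
    return (
        alpha * math.log(beta)
        - math.lgamma(alpha)
        + (alpha - 1.0) * math.log(value)
        - beta * value
    )


# An observed lift of 8.0 with sigma 2.0, scored against two candidate model lifts.
close = gamma_logpdf_mu_sigma(value=8.0, mu=7.5, sigma=2.0)
far = gamma_logpdf_mu_sigma(value=8.0, mu=3.0, sigma=2.0)
```

A candidate lift near the observed delta_y scores a higher log density than one far from it, which is how the term calibrates the saturation parameters during sampling.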
Scaling Functions
scale_lift_measurements
def scale_lift_measurements(
    df_lift_test: pd.DataFrame,
    channel_col: str,
    channel_columns: List[Union[str, int]],
    channel_transform: Callable[[np.ndarray], np.ndarray],
    target_transform: Callable[[np.ndarray], np.ndarray],
) -> pd.DataFrame:
Scales the relevant columns (x, delta_x, delta_y, sigma) in the lift test DataFrame using provided transformation functions (typically fitted scalers from the main MMM preprocessing). This is necessary if the main MMM operates on scaled data.
Parameters:
df_lift_test (pd.DataFrame): The original lift test DataFrame.
channel_col (str): The name of the column in df_lift_test that identifies the channel.
channel_columns (List[Union[str, int]]): List of all channel names used in the main model (for alignment during scaling).
channel_transform (Callable): The transformation function (e.g., scaler.transform) used for channel spend/features in the main model.
target_transform (Callable): The transformation function used for the target variable in the main model.
Returns:
pd.DataFrame: A new DataFrame with scaled lift test data.
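The alignment role of channel_columns can be sketched as follows. The example assumes purely multiplicative scalers (division by a per-channel maximum), so that differences such as delta_x scale the same way as levels. The maxima, the channel names, and the scale_channel_column helper are all hypothetical; the module's actual implementation may differ.

```python
import numpy as np
import pandas as pd

channel_columns = ["tv", "search"]
# Hypothetical maxima standing in for fitted max-abs scalers.
channel_max = np.array([1000.0, 200.0])
target_max = 400.0


def channel_transform(values: np.ndarray) -> np.ndarray:
    """Expects shape (n_rows, n_channels), like a fitted scaler's transform."""
    return values / channel_max


def target_transform(values: np.ndarray) -> np.ndarray:
    """Expects shape (n_rows, 1)."""
    return values / target_max


df_lift_test = pd.DataFrame(
    {
        "channel": ["search", "tv"],
        "x": [50.0, 100.0],
        "delta_x": [10.0, 20.0],
        "delta_y": [6.0, 8.0],
        "sigma": [1.5, 2.0],
    }
)


def scale_channel_column(df: pd.DataFrame, col: str) -> np.ndarray:
    """Scale one channel-related column by placing each value in its channel's slot."""
    positions = [channel_columns.index(ch) for ch in df["channel"]]
    rows = np.arange(len(df))
    wide = np.zeros((len(df), len(channel_columns)))
    wide[rows, positions] = df[col].to_numpy()
    # Transform the wide array, then read each row's value back from its slot.
    return channel_transform(wide)[rows, positions]


scaled = df_lift_test.copy()  # leave the original DataFrame untouched
for col in ["x", "delta_x"]:
    scaled[col] = scale_channel_column(df_lift_test, col)
for col in ["delta_y", "sigma"]:
    scaled[col] = target_transform(df_lift_test[[col]].to_numpy()).ravel()
```

A scaler that also subtracts an offset would shift levels such as x but should not shift differences such as delta_x, so this sketch applies only to the multiplicative case.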
scale_channel_lift_measurements
def scale_channel_lift_measurements(...) -> pd.DataFrame:
Helper function used by scale_lift_measurements to scale the channel-related columns (x, delta_x).
scale_target_for_lift_measurements
def scale_target_for_lift_measurements(...) -> pd.Series:
Helper function used by scale_lift_measurements to scale the target-related columns (delta_y, sigma).
Helper Functions & Classes
exact_row_indices
def exact_row_indices(df: pd.DataFrame, model: pm.Model) -> Indices:
Maps rows in a DataFrame (df) containing coordinate values to the corresponding integer indices within a PyMC model’s coordinates. Raises UnalignedValuesError or KeyError if values or coordinates don’t match.
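The mapping from coordinate values to integer positions can be sketched without a PyMC model. Here a plain coords dictionary stands in for the model's coordinates, and the result is assumed to be a dictionary from dimension name to an integer array; the real Indices type is not shown in this document. The function name row_indices_sketch and the error class defined here are local stand-ins.

```python
import numpy as np
import pandas as pd


class UnalignedValuesError(Exception):
    """Local stand-in for the module's error class."""


def row_indices_sketch(df: pd.DataFrame, coords: dict) -> dict:
    """Map each coordinate column in df to integer positions in coords."""
    indices = {}
    for dim, labels in coords.items():
        if dim not in df.columns:
            continue  # this sketch only indexes dimensions present in df
        lookup = {label: i for i, label in enumerate(labels)}
        missing = [value for value in df[dim] if value not in lookup]
        if missing:
            raise UnalignedValuesError(f"{dim}: values not in coords: {missing}")
        indices[dim] = np.array([lookup[value] for value in df[dim]])
    return indices


coords = {"channel": ["tv", "search", "radio"], "geo": ["north", "south"]}
df = pd.DataFrame(
    {"channel": ["search", "tv", "search"], "geo": ["south", "south", "north"]}
)
indices = row_indices_sketch(df, coords)
```

Each array has one entry per lift test row, so it can be used directly to pick that row's parameter values out of a model variable.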
create_variable_indexer
def create_variable_indexer(model: pm.Model, indices: Indices) -> VariableIndexer:
Creates and returns a function (VariableIndexer) that, when given a variable name, returns the variable indexed according to the provided indices. Used within add_saturation_observations.
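A closure is one natural way to build such an indexer. In the sketch below, NumPy arrays stand in for model variables, and a dims dictionary records which dimensions each variable has. The variable names, the dims dictionary, and variable_indexer_sketch are hypothetical, and the real function works on PyMC model variables instead.

```python
import numpy as np


def variable_indexer_sketch(variables: dict, dims: dict, indices: dict):
    """Return a function that picks one value per lift test row from a named variable."""

    def indexer(name: str) -> np.ndarray:
        # Index each dimension of the variable with that dimension's row positions.
        idx = tuple(indices[dim] for dim in dims[name])
        return variables[name][idx]

    return indexer


# Stand-ins for model variables and their dimensions (assumptions).
variables = {
    "saturation_lam": np.array([300.0, 100.0, 80.0]),
    "saturation_beta": np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),
}
dims = {"saturation_lam": ("channel",), "saturation_beta": ("geo", "channel")}
indices = {"channel": np.array([1, 0, 1]), "geo": np.array([1, 1, 0])}

indexer = variable_indexer_sketch(variables, dims, indices)
lam_per_row = indexer("saturation_lam")
beta_per_row = indexer("saturation_beta")
```

Because the indices are captured once, the same indexer can be reused for every parameter named in the variable mapping.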
create_time_varying_saturation
def create_time_varying_saturation(...) -> tuple[SaturationFunc, VariableMapping]:
Wraps a base SaturationTransformation to include an additional time-varying modulation effect, returning the modified saturation function and variable mapping.
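The wrapping idea can be sketched in plain Python. The example assumes the modulation is a multiplicative factor applied to the saturation output, which this document does not confirm. The function with_time_varying_multiplier, its parameters, and the variable names are hypothetical and are not the signature of create_time_varying_saturation.

```python
import numpy as np


def with_time_varying_multiplier(saturation_function, variable_mapping, multiplier_name):
    """Wrap a saturation function so its output is scaled by an extra variable."""

    def wrapped(x, time_varying, **kwargs):
        return time_varying * saturation_function(x, **kwargs)

    # Extend a copy of the mapping so the extra argument resolves to a model variable.
    new_mapping = {**variable_mapping, "time_varying": multiplier_name}
    return wrapped, new_mapping


def michaelis_menten(x, alpha, lam):
    """Illustrative saturation curve (an assumption, not the library's own)."""
    return alpha * x / (lam + x)


base_mapping = {"alpha": "saturation_alpha", "lam": "saturation_lam"}
function, mapping = with_time_varying_multiplier(
    michaelis_menten, base_mapping, "hypothetical_time_multiplier"
)
value = function(np.array([100.0]), time_varying=np.array([1.2]), alpha=200.0, lam=300.0)
```

Returning the function together with its mapping mirrors the tuple in the return annotation above, so the pair could be handed on to a lower-level routine that expects both.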
assert_is_subset
def assert_is_subset(required: set[str], available: set[str]) -> None:
Raises MissingValueError if the available set does not contain all elements from the required set.
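A minimal sketch of this check, using a local stand-in for the error class and a hypothetical function name, could look like this:

```python
class MissingValueError(KeyError):
    """Local stand-in for the module's error class."""


def assert_is_subset_sketch(required: set, available: set) -> None:
    """Raise if any required element is absent from the available set."""
    missing = required - available
    if missing:
        raise MissingValueError(f"Missing required values: {sorted(missing)}")


required_columns = {"x", "delta_x", "delta_y", "sigma"}

# Passes silently: every required column is present.
assert_is_subset_sketch(required_columns, {"channel", "x", "delta_x", "delta_y", "sigma"})
```

Since the error subclasses KeyError, callers that already catch KeyError also catch it.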
assert_monotonic
def assert_monotonic(delta_x: pd.Series, delta_y: pd.Series) -> None:
Raises NonMonotonicError if the signs of delta_x and delta_y are inconsistent (i.e., if an increase in spend leads to a decrease in the target, or vice-versa), which violates assumptions for standard saturation functions.
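The sign check can be sketched as a test on the elementwise product of the two columns. Treating a zero change as consistent is a choice made for this sketch and may not match the module. The function name and the error class defined here are local stand-ins.

```python
import pandas as pd


class NonMonotonicError(ValueError):
    """Local stand-in for the module's error class."""


def assert_monotonic_sketch(delta_x: pd.Series, delta_y: pd.Series) -> None:
    """Raise if any row has spend and target changes with opposite signs."""
    # Compare by position so differing Series indexes cannot misalign rows.
    product = delta_x.to_numpy() * delta_y.to_numpy()
    if not (product >= 0).all():
        raise NonMonotonicError("delta_x and delta_y must have the same sign in every row.")


# Passes silently: a spend increase with a gain, and a spend cut with a loss.
assert_monotonic_sketch(pd.Series([20.0, -10.0]), pd.Series([8.0, -3.0]))
```

Running such a check before adding lift observations catches rows that a monotonically increasing saturation curve could never reproduce.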
Error Classes
UnalignedValuesError(Exception): Raised by exact_row_indices when coordinate values in the DataFrame don’t match the model’s coordinates.
MissingValueError(KeyError): Raised by assert_is_subset when required values are missing.
NonMonotonicError(ValueError): Raised by assert_monotonic when lift test data shows a non-monotonic relationship.