# Lift Test Integration (`lift_test.py`)

This module provides functions to incorporate empirical lift test data into the PyMC model's likelihood, allowing for calibration of the model based on experimental results. It also includes helper functions for data validation, indexing, and scaling related to lift tests.

## Core Functions

### `add_lift_measurements_to_likelihood_from_saturation`

```python
def add_lift_measurements_to_likelihood_from_saturation(
    df_lift_test: pd.DataFrame,
    saturation: SaturationTransformation,
    time_varying_var_name: Optional[str] = None,
    model: Optional[pm.Model] = None,
    dist: Type[pm.Distribution] = pm.Gamma,
    name: str = "lift_measurements",
    get_indices: Callable[[pd.DataFrame, pm.Model], Indices] = exact_row_indices,
    variable_indexer_factory: Callable[[pm.Model, Indices], VariableIndexer] = create_variable_indexer,
) -> None:
```

Adds lift test observations to the likelihood of a PyMC model based on a defined saturation transformation. This function connects empirical lift data (`df_lift_test`) to the model's estimated saturation curve.

**Parameters:**

- `df_lift_test` (`pd.DataFrame`): DataFrame containing the lift test results. Must include columns:
  - Coordinates matching the model's dimensions (e.g., 'channel', 'geo').
  - `x`: Spend level before the lift test intervention.
  - `delta_x`: Change in spend during the lift test.
  - `delta_y`: Observed change in the target variable during the lift test.
  - `sigma`: Standard deviation or uncertainty associated with `delta_y`.
- `saturation` (`SaturationTransformation`): An object conforming to the `SaturationTransformation` protocol (defined in `abacus.core.transformers`), providing the saturation function and its parameter mapping.
- `time_varying_var_name` (`Optional[str]`, optional): Name of a model variable that modulates the saturation effect over time (if applicable). Defaults to `None`.
- `model` (`Optional[pm.Model]`, optional): The PyMC model context. If `None`, uses the current context.
  Defaults to `None`.
- `dist` (`Type[pm.Distribution]`, optional): The PyMC distribution to use for the lift test likelihood term (comparing observed `delta_y` to model-estimated lift). Defaults to `pm.Gamma`.
- `name` (`str`, optional): Name for the likelihood variable added to the model. Defaults to `"lift_measurements"`.
- `get_indices` (`Callable`, optional): Function to map DataFrame rows to model coordinate indices. Defaults to `exact_row_indices`.
- `variable_indexer_factory` (`Callable`, optional): Function to create a variable indexer based on model and indices. Defaults to `create_variable_indexer`.

### `add_saturation_observations`

```python
def add_saturation_observations(
    df_lift_test: pd.DataFrame,
    variable_mapping: VariableMapping,
    saturation_function: SaturationFunc,
    model: Optional[pm.Model] = None,
    dist: Type[pm.Distribution] = pm.Gamma,
    name: str = "lift_measurements",
    get_indices: Callable[[pd.DataFrame, pm.Model], Indices] = exact_row_indices,
    variable_indexer_factory: Callable[[pm.Model, Indices], VariableIndexer] = create_variable_indexer,
) -> None:
```

Lower-level function called by `add_lift_measurements_to_likelihood_from_saturation`. It directly adds the likelihood term comparing the model-estimated lift (derived from the `saturation_function` and its parameters indexed via `variable_mapping`) to the observed `delta_y` from the lift test data.

**Parameters:**

- `df_lift_test` (`pd.DataFrame`): Lift test data (see above).
- `variable_mapping` (`VariableMapping`): Dictionary mapping saturation function parameter names to model variable names.
- `saturation_function` (`SaturationFunc`): The callable saturation function.
- `model`, `dist`, `name`, `get_indices`, `variable_indexer_factory`: Same as in `add_lift_measurements_to_likelihood_from_saturation`.
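
To make the likelihood term more concrete, the sketch below computes the kind of quantity that is compared against the observed `delta_y`. It assumes the model-estimated lift is the saturation curve evaluated after the spend change minus the curve evaluated before it. The Michaelis-Menten curve, its parameter values, and the helper name `estimated_lift` are illustrative assumptions and not part of this module; in the actual model the curve parameters are PyMC variables rather than fixed floats.

```python
def michaelis_menten(x: float, alpha: float, lam: float) -> float:
    """Hypothetical saturation curve: approaches alpha as spend x grows."""
    return alpha * x / (lam + x)


def estimated_lift(x: float, delta_x: float, alpha: float, lam: float) -> float:
    """Assumed lift definition: saturation after the change minus saturation before."""
    return michaelis_menten(x + delta_x, alpha, lam) - michaelis_menten(x, alpha, lam)


# One lift test row: spend of 100 raised by 100, with made-up curve parameters.
lift = estimated_lift(x=100.0, delta_x=100.0, alpha=10.0, lam=100.0)
print(round(lift, 4))
```

Under this reading, the same `delta_x` yields a smaller estimated lift at higher baseline spend `x`, which is how a lift test can inform the shape of the saturation curve.
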

## Scaling Functions

### `scale_lift_measurements`

```python
def scale_lift_measurements(
    df_lift_test: pd.DataFrame,
    channel_col: str,
    channel_columns: List[Union[str, int]],
    channel_transform: Callable[[np.ndarray], np.ndarray],
    target_transform: Callable[[np.ndarray], np.ndarray],
) -> pd.DataFrame:
```

Scales the relevant columns (`x`, `delta_x`, `delta_y`, `sigma`) in the lift test DataFrame using provided transformation functions (typically fitted scalers from the main MMM preprocessing). This is necessary if the main MMM operates on scaled data.

**Parameters:**

- `df_lift_test` (`pd.DataFrame`): The original lift test DataFrame.
- `channel_col` (`str`): The name of the column in `df_lift_test` that identifies the channel.
- `channel_columns` (`List[Union[str, int]]`): List of all channel names used in the main model (for alignment during scaling).
- `channel_transform` (`Callable`): The transformation function (e.g., `scaler.transform`) used for channel spend/features in the main model.
- `target_transform` (`Callable`): The transformation function used for the target variable in the main model.

**Returns:**

- `pd.DataFrame`: A new DataFrame with scaled lift test data.

### `scale_channel_lift_measurements`

```python
def scale_channel_lift_measurements(...) -> pd.DataFrame:
```

Helper function used by `scale_lift_measurements` to scale the channel-related columns (`x`, `delta_x`).

### `scale_target_for_lift_measurements`

```python
def scale_target_for_lift_measurements(...) -> pd.Series:
```

Helper function used by `scale_lift_measurements` to scale the target-related columns (`delta_y`, `sigma`).

## Helper Functions & Classes

### `exact_row_indices`

```python
def exact_row_indices(df: pd.DataFrame, model: pm.Model) -> Indices:
```

Maps rows in a DataFrame (`df`) containing coordinate values to the corresponding integer indices within a PyMC model's coordinates. Raises `UnalignedValuesError` or `KeyError` if values or coordinates don't match.
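
The coordinate lookup described for `exact_row_indices` can be illustrated with a standalone sketch. It assumes plain dictionaries in place of the DataFrame rows and the model's coordinates, and that every key in a row is a coordinate column. The name `lookup_row_indices` is hypothetical, and `ValueError` stands in for the module's `UnalignedValuesError`.

```python
from typing import Dict, List, Sequence


def lookup_row_indices(
    rows: Sequence[Dict[str, str]],
    coords: Dict[str, List[str]],
) -> Dict[str, List[int]]:
    """Map each row's coordinate values to integer positions in coords."""
    if not rows:
        return {}
    indices: Dict[str, List[int]] = {}
    for dim in rows[0]:
        if dim not in coords:
            raise KeyError(f"Dimension {dim!r} is not a model coordinate")
        # Position of every coordinate value along this dimension.
        position = {value: i for i, value in enumerate(coords[dim])}
        try:
            indices[dim] = [position[row[dim]] for row in rows]
        except KeyError as err:
            # Stand-in for UnalignedValuesError in this sketch.
            raise ValueError(f"Value {err.args[0]!r} not found in {dim!r}") from err
    return indices


# Made-up coordinates and two lift test rows.
coords = {"channel": ["tv", "radio", "search"], "geo": ["us", "uk"]}
rows = [{"channel": "radio", "geo": "uk"}, {"channel": "tv", "geo": "us"}]
indices = lookup_row_indices(rows, coords)
print(indices)
```

The resulting integer positions are what allow each lift test row to pick out the matching entries of a model variable.
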

### `create_variable_indexer`

```python
def create_variable_indexer(model: pm.Model, indices: Indices) -> VariableIndexer:
```

Creates and returns a function (`VariableIndexer`) that, when given a variable name, returns the variable indexed according to the provided `indices`. Used within `add_saturation_observations`.

### `create_time_varying_saturation`

```python
def create_time_varying_saturation(...) -> tuple[SaturationFunc, VariableMapping]:
```

Wraps a base `SaturationTransformation` to include an additional time-varying modulation effect, returning the modified saturation function and variable mapping.

### `assert_is_subset`

```python
def assert_is_subset(required: set[str], available: set[str]) -> None:
```

Raises `MissingValueError` if the `available` set does not contain all elements from the `required` set.

### `assert_monotonic`

```python
def assert_monotonic(delta_x: pd.Series, delta_y: pd.Series) -> None:
```

Raises `NonMonotonicError` if the signs of `delta_x` and `delta_y` are inconsistent (i.e., if an increase in spend leads to a decrease in the target, or vice versa), which violates the monotonicity assumption of standard saturation functions.

### Error Classes

- `UnalignedValuesError(Exception)`: Raised by `exact_row_indices` when coordinate values in the DataFrame don't match the model's coordinates.
- `MissingValueError(KeyError)`: Raised by `assert_is_subset` when required values are missing.
- `NonMonotonicError(ValueError)`: Raised by `assert_monotonic` when lift test data shows a non-monotonic relationship.
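
The sign check described for `assert_monotonic` can be sketched without pandas. The function name `check_same_sign` and the plain `ValueError` (standing in for `NonMonotonicError`) are assumptions for illustration. How the module treats zero-valued changes is not stated here, so this sketch only flags strictly opposite signs.

```python
from typing import Sequence


def check_same_sign(delta_x: Sequence[float], delta_y: Sequence[float]) -> None:
    """Raise ValueError when a spend change and an outcome change have opposite signs."""
    bad_rows = [
        i
        for i, (dx, dy) in enumerate(zip(delta_x, delta_y))
        if dx * dy < 0  # strictly opposite signs; zeros are allowed in this sketch
    ]
    if bad_rows:
        raise ValueError(f"Rows with opposite-sign delta_x and delta_y: {bad_rows}")


# Consistent signs: passes silently.
check_same_sign([10.0, -5.0], [0.8, -0.3])

# Second row increases spend but decreases the target, so the check fails.
try:
    check_same_sign([10.0, 5.0], [0.8, -0.3])
except ValueError as err:
    message = str(err)
    print(message)
```

Running a check like this before adding lift tests to the likelihood surfaces rows that a monotonically increasing saturation curve could not reproduce.
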