Abacus Core: Legacy Imports (_legacy_imports.py)

This module collects functions, classes, and enumerations that originally lived in the pymc-marketing library or its dependencies. They have been adapted and included directly in Abacus to remove the external dependency while preserving the functionality needed for marketing mix modelling: adstock effects, saturation, and lift test data handling.

Enumerations

ConvMode

class ConvMode(str, Enum):
    After = "After"
    Before = "Before"
    Overlap = "Overlap"

Defines the modes for applying 1D convolution, determining how boundaries are handled:

  • After: Trailing decay effect (typical adstock).

  • Before: Leading effect (“excitement” factor).

  • Overlap: Effect overlaps preceding and succeeding elements.
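
Because ConvMode subclasses str, a plain string such as "After" can be converted to the matching member, and members compare equal to their string values. The short sketch below repeats the class definition from above so that it runs on its own; the variable names are illustrative only.

```python
from enum import Enum


class ConvMode(str, Enum):
    After = "After"
    Before = "Before"
    Overlap = "Overlap"


# Looking a member up by its value returns the member itself.
mode = ConvMode("Before")
is_same_member = mode is ConvMode.Before

# Since the enum subclasses str, members compare equal to plain strings.
equals_string = ConvMode.After == "After"
```

An unknown value, for example ConvMode("Sideways"), raises ValueError.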


WeibullType

class WeibullType(str, Enum):
    PDF = "PDF"
    CDF = "CDF"

Specifies the type of Weibull distribution function to use for adstock calculation:

  • PDF: Probability Density Function.

  • CDF: Cumulative Distribution Function. The decay weights are based on the survival function, 1 - CDF, rather than on the CDF itself.

Adstock Functions

These functions apply carryover effects to time-series data, commonly used for modelling advertising impact over time.

batched_convolution

def batched_convolution(
    x,
    w,
    axis: int = 0,
    mode: ConvMode | str = ConvMode.After,
):

Applies a 1D convolution across multiple batch dimensions in a vectorized manner. This is the core function used by other adstock implementations.

Parameters:

  • x: The array to convolve.

  • w: The convolution weights (kernel). The last axis determines the number of steps (lag).

  • axis (int): The axis of x along which to apply the convolution.

  • mode (ConvMode | str): The convolution mode (After, Before, Overlap).

Returns:

  • The convolved array. Its shape matches that of x once the batch dimensions of x and w have been broadcast against each other.
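
To make the After mode concrete, here is a minimal pure-Python sketch of a causal 1D convolution on a single series. It is not the module's vectorized implementation: the helper name convolve_after is hypothetical, batching and the axis argument are left out, and the Before and Overlap modes are not covered.

```python
def convolve_after(x, w):
    """Causal convolution: y[t] = sum over lag of w[lag] * x[t - lag]."""
    out = []
    for t in range(len(x)):
        total = 0.0
        for lag, weight in enumerate(w):
            if t - lag >= 0:  # ignore lags that reach before the series starts
                total += weight * x[t - lag]
        out.append(total)
    return out


impulse = [1.0, 0.0, 0.0, 0.0]
weights = [1.0, 0.5, 0.25]
result = convolve_after(impulse, weights)
# result == [1.0, 0.5, 0.25, 0.0]
```

The output has the same length as the input, and a single impulse reproduces the kernel as a trailing effect.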


geometric_adstock

def geometric_adstock(
    x,
    alpha: float = 0.0,
    l_max: int = 12,
    normalize: bool = False,
    axis: int = 0,
    mode: ConvMode = ConvMode.After,
):

Applies a geometric adstock transformation, where the effect decays geometrically over time.

Parameters:

  • x: Input tensor.

  • alpha (float): Retention rate (0 to 1).

  • l_max (int): Maximum duration of carryover.

  • normalize (bool): Whether to normalize weights to sum to 1.

  • axis (int): Axis to apply convolution along.

  • mode (ConvMode): Convolution mode.

Returns:

  • Transformed tensor.
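
The following sketch illustrates geometric decay on a single series in plain Python. It assumes the lag weights are alpha**0, alpha**1, ..., alpha**(l_max - 1) and that the convolution runs in After mode; the helper names are hypothetical and batching is omitted.

```python
def geometric_weights(alpha, l_max, normalize=False):
    """Lag weights alpha**0, alpha**1, ..., alpha**(l_max - 1)."""
    weights = [alpha**lag for lag in range(l_max)]
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


def geometric_adstock_1d(x, alpha, l_max, normalize=False):
    """Carry each value of x forward over the following l_max - 1 periods."""
    weights = geometric_weights(alpha, l_max, normalize)
    return [
        sum(weights[lag] * x[t - lag] for lag in range(l_max) if t - lag >= 0)
        for t in range(len(x))
    ]


spend = [100.0, 0.0, 0.0, 0.0]
carried = geometric_adstock_1d(spend, alpha=0.5, l_max=3)
# carried == [100.0, 50.0, 25.0, 0.0]
```

With normalize=True the weights are divided by their sum, so the total carried-over effect of one unit of spend equals one unit.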


delayed_adstock

def delayed_adstock(
    x,
    alpha: float = 0.0,
    theta: int = 0,
    l_max: int = 12,
    normalize: bool = False,
    axis: int = 0,
    mode: ConvMode = ConvMode.After,
):

Applies a delayed adstock transformation, allowing the peak effect to be delayed.

Parameters:

  • x: Input tensor.

  • alpha (float): Retention rate (0 to 1).

  • theta (int): Delay of the peak effect (0 to l_max - 1).

  • l_max (int): Maximum duration of carryover.

  • normalize (bool): Whether to normalize weights to sum to 1.

  • axis (int): Axis to apply convolution along.

  • mode (ConvMode): Convolution mode.

Returns:

  • Transformed tensor.
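
As an illustration of how theta shifts the peak, the sketch below computes delayed adstock weights in plain Python. It assumes the weight at each lag is alpha ** ((lag - theta) ** 2), the form used for delayed adstock in the marketing mix modelling literature; the helper name is hypothetical.

```python
def delayed_weights(alpha, theta, l_max, normalize=False):
    """Lag weights that peak at lag == theta and decay on both sides."""
    weights = [alpha ** ((lag - theta) ** 2) for lag in range(l_max)]
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


weights = delayed_weights(alpha=0.5, theta=2, l_max=5)
# weights == [0.0625, 0.5, 1.0, 0.5, 0.0625]
peak_lag = weights.index(max(weights))
# peak_lag == 2
```

With theta = 0 the exponent is lag squared, so the weights decay from the first period, but faster than the geometric weights alpha**lag.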


weibull_adstock

def weibull_adstock(
    x,
    lam=1,
    k=1,
    l_max: int = 12,
    axis: int = 0,
    mode: ConvMode = ConvMode.After,
    type: WeibullType | str = WeibullType.PDF,
):

Applies an adstock transformation based on the Weibull distribution (either PDF or CDF).

Parameters:

  • x: Input tensor.

  • lam (float): Scale parameter (lambda > 0) of the Weibull distribution.

  • k (float): Shape parameter (k > 0) of the Weibull distribution.

  • l_max (int): Maximum duration of carryover.

  • axis (int): Axis to apply convolution along.

  • mode (ConvMode): Convolution mode.

  • type (WeibullType | str): Type of Weibull function (PDF or CDF).

Returns:

  • Transformed tensor.
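
The two Weibull variants start from different curves, which the sketch below evaluates in plain Python. It covers only the raw Weibull PDF and the survival function (1 - CDF) at each lag; counting lags from 1 is an assumption, and any rescaling or accumulation that the module applies to these values before convolving is not shown. The helper names are hypothetical.

```python
import math


def weibull_pdf(t, lam, k):
    """Weibull density with scale lam and shape k, for t > 0."""
    return (k / lam) * (t / lam) ** (k - 1) * math.exp(-((t / lam) ** k))


def weibull_survival(t, lam, k):
    """Survival function, i.e. 1 - CDF, for t > 0."""
    return math.exp(-((t / lam) ** k))


l_max = 6
lags = range(1, l_max + 1)  # assumption: lags counted from 1

pdf_curve = [weibull_pdf(t, lam=3.0, k=2.0) for t in lags]
survival_curve = [weibull_survival(t, lam=3.0, k=2.0) for t in lags]

# With k > 1 the density can rise before it falls, giving a delayed peak.
pdf_peak_index = pdf_curve.index(max(pdf_curve))
```

The survival curve always decreases with the lag, whereas the PDF curve can place its largest value at a later lag when k is greater than 1.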

Saturation Functions

These functions model the diminishing returns of an input variable, often used for advertising spend.

logistic_saturation

def logistic_saturation(x, lam: npt.NDArray[np.float64] | float = 0.5):

Applies a logistic saturation function: (1 - exp(-lam * x)) / (1 + exp(-lam * x)).

Parameters:

  • x: Input tensor.

  • lam (float or array-like): Saturation parameter.

Returns:

  • Transformed tensor.
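
A scalar sketch of the formula above, using only the standard library. The function name is hypothetical and works on plain floats rather than tensors. It also shows that the expression is algebraically equal to tanh(lam * x / 2).

```python
import math


def logistic_saturation_scalar(x, lam=0.5):
    """(1 - exp(-lam * x)) / (1 + exp(-lam * x)) for a single float x."""
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))


at_zero = logistic_saturation_scalar(0.0)             # no spend, no effect
at_large = logistic_saturation_scalar(100.0)          # approaches 1
midway = logistic_saturation_scalar(2.0, lam=0.5)     # equals tanh(0.5)
```

The output starts at 0 for x = 0 and approaches 1 as x grows; a larger lam makes the curve saturate sooner.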


tanh_saturation

def tanh_saturation(
    x: pt.TensorLike,
    b: pt.TensorLike = 0.5,
    c: pt.TensorLike = 0.5,
) -> pt.TensorVariable:

Applies a hyperbolic tangent (tanh) saturation function: b * tanh(x / (b * c)).

Parameters:

  • x: Input tensor.

  • b (pt.TensorLike): Saturation level (max effect).

  • c (pt.TensorLike): Controls the initial slope. The slope of the curve at x = 0 is 1 / c, so c can be read as the cost per acquisition at the start.

Returns:

  • Transformed tensor.
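
The roles of b and c can be checked numerically with a scalar sketch of the formula above. The function name is hypothetical and works on plain floats rather than tensors.

```python
import math


def tanh_saturation_scalar(x, b=0.5, c=0.5):
    """b * tanh(x / (b * c)) for a single float x."""
    return b * math.tanh(x / (b * c))


b, c = 0.5, 0.5

# For very large x the curve flattens out at the saturation level b.
plateau = tanh_saturation_scalar(1e9, b, c)

# Near x = 0 the curve is almost linear with slope 1 / c.
tiny = 1e-6
initial_slope = tanh_saturation_scalar(tiny, b, c) / tiny
```

Here plateau equals b and initial_slope is approximately 1 / c = 2.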


tanh_saturation_baselined

def tanh_saturation_baselined(
    x: pt.TensorLike,
    x0: pt.TensorLike,
    gain: pt.TensorLike = 0.5,
    r: pt.TensorLike = 0.5,
) -> pt.TensorVariable:

Applies a reparameterised tanh saturation function based on a baseline point x0.

Parameters:

  • x: Input tensor.

  • x0: Baseline input value.

  • gain: Return on Ad Spend (ROAS) at x0, defined as f(x0) / x0.

  • r: Overspend fraction at x0, defined as f(x0) / saturation_level.

Returns:

  • Transformed tensor.
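
The definitions of gain and r above determine the curve. Substituting b = gain * x0 / r and 1 / (b * c) = atanh(r) / x0 into b * tanh(x / (b * c)) gives the scalar sketch below. This is a derivation from the stated definitions, not a copy of the module's code, and the function name is hypothetical.

```python
import math


def tanh_saturation_baselined_scalar(x, x0, gain=0.5, r=0.5):
    """Tanh saturation expressed through the baseline point x0."""
    return gain * x0 * math.tanh(x * math.atanh(r) / x0) / r


x0, gain, r = 10.0, 0.5, 0.5

at_baseline = tanh_saturation_baselined_scalar(x0, x0, gain, r)
roas_at_baseline = at_baseline / x0        # recovers gain
saturation_level = gain * x0 / r           # plateau of the curve
overspend = at_baseline / saturation_level  # recovers r
```

Evaluating the curve at x0 returns gain * x0, so both definitions (gain and r) hold by construction.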


michaelis_menten

def michaelis_menten(
    x: float | np.ndarray | npt.NDArray[np.float64],
    alpha: float | np.ndarray | npt.NDArray[np.float64],
    lam: float | np.ndarray | npt.NDArray[np.float64],
) -> float | Any:

Applies the Michaelis-Menten saturation function: alpha * x / (lam + x).

Parameters:

  • x: Input value(s).

  • alpha: Maximum effect (saturation level).

  • lam: Michaelis constant (value of x at which effect is half of alpha).

Returns:

  • Transformed value(s).
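
A scalar sketch of the formula above, confirming the half-saturation property of lam. The function name is hypothetical and accepts plain floats only.

```python
def michaelis_menten_scalar(x, alpha, lam):
    """alpha * x / (lam + x) for a single float x."""
    return alpha * x / (lam + x)


alpha, lam = 10.0, 4.0

half_effect = michaelis_menten_scalar(lam, alpha, lam)
# half_effect == 5.0, i.e. alpha / 2 at x == lam

near_plateau = michaelis_menten_scalar(1e9, alpha, lam)
# approaches alpha as x grows
```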

Tanh Saturation Parameter Containers

TanhSaturationParameters

class TanhSaturationParameters(NamedTuple):
    b: pt.TensorLike
    c: pt.TensorLike

A NamedTuple that holds the standard parameters of the tanh_saturation function (b: saturation level, c: shape/cost). Its .baseline(x0) method converts them to TanhSaturationBaselinedParameters.


TanhSaturationBaselinedParameters

class TanhSaturationBaselinedParameters(NamedTuple):
    x0: pt.TensorLike
    gain: pt.TensorLike
    r: pt.TensorLike

A NamedTuple that holds the baselined parameters of the tanh_saturation_baselined function (x0: baseline point, gain: ROAS at x0, r: overspend fraction at x0). Its .debaseline() method converts back to the standard parameters, and .rebaseline(x1) moves the baseline point to x1.
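
The conversion between the two parameterisations follows from the definitions gain = f(x0) / x0 and r = f(x0) / b. The sketch below works it through with plain floats and hypothetical class and function names; it illustrates the algebra and is not the module's implementation.

```python
import math
from typing import NamedTuple


class TanhParams(NamedTuple):
    b: float  # saturation level
    c: float  # shape / initial cost


class TanhBaselinedParams(NamedTuple):
    x0: float    # baseline point
    gain: float  # f(x0) / x0
    r: float     # f(x0) / b


def to_baselined(p: TanhParams, x0: float) -> TanhBaselinedParams:
    y0 = p.b * math.tanh(x0 / (p.b * p.c))
    return TanhBaselinedParams(x0=x0, gain=y0 / x0, r=y0 / p.b)


def to_standard(q: TanhBaselinedParams) -> TanhParams:
    b = q.gain * q.x0 / q.r
    c = q.r / (q.gain * math.atanh(q.r))
    return TanhParams(b=b, c=c)


original = TanhParams(b=2.0, c=0.5)
baselined = to_baselined(original, x0=1.0)
recovered = to_standard(baselined)

# Changing the baseline point amounts to converting back and forth.
rebaselined = to_baselined(to_standard(baselined), x0=3.0)
```

Converting to the baselined form and back recovers b and c up to floating-point error, and the same curve can be described from any baseline point.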

Lift Test Scaling Functions

These functions are used to scale data from lift tests (experiments) so they can be incorporated into the model, typically alongside observational time-series data.

scale_channel_lift_measurements

def scale_channel_lift_measurements(
    df_lift_test: pd.DataFrame,
    channel_col: str,
    channel_columns: list[str],
    transform: Callable[[np.ndarray], np.ndarray],
) -> pd.DataFrame:

Scales the spend (x) and spend change (delta_x) columns of lift tests for specific channels, using a provided scaling function. This is normally the same transform that was fitted on the main channel data, so that lift tests and model inputs share one scale.

Parameters:

  • df_lift_test: DataFrame containing lift test results. Must include columns x, delta_x, and the specified channel_col.

  • channel_col: The name of the column identifying the channel for each lift test row.

  • channel_columns: A list of all channel names used in the model.

  • transform: The scaling function to apply (e.g., the transform method of a fitted sklearn.preprocessing.StandardScaler).

Returns:

  • A DataFrame with the same structure as the input, but with x and delta_x values scaled.


scale_target_for_lift_measurements

def scale_target_for_lift_measurements(
    target: pd.Series,
    transform: Callable[[np.ndarray], np.ndarray],
) -> pd.Series:

Scales a target-related Series (such as delta_y or sigma from lift tests) using a provided scaling function. This is normally the same transform that was fitted on the main target variable.

Parameters:

  • target: A pandas Series containing the values to scale (e.g., df_lift_test['delta_y']).

  • transform: The scaling function to apply.

Returns:

  • A pandas Series with the scaled values.


scale_lift_measurements

def scale_lift_measurements(
    df_lift_test: pd.DataFrame,
    channel_col: str,
    channel_columns: list[str | int],
    channel_transform: Callable[[np.ndarray], np.ndarray],
    target_transform: Callable[[np.ndarray], np.ndarray],
) -> pd.DataFrame:

Applies scaling to all relevant columns (x, delta_x, delta_y, sigma) in a lift test DataFrame using appropriate channel and target scalers.

Parameters:

  • df_lift_test: DataFrame containing lift test results.

  • channel_col: Name of the channel identifier column.

  • channel_columns: List of all channel names in the model.

  • channel_transform: Scaling function for channel spend (x, delta_x).

  • target_transform: Scaling function for target-related values (delta_y, sigma).

Returns:

  • A DataFrame with scaled lift test data, ready for model input.
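
The idea behind the scaling step can be shown without pandas. The sketch below is a conceptual illustration only: it uses plain dictionaries instead of a DataFrame, assumes each scaler simply divides by a positive constant (max-abs style), and uses a hypothetical "channel" key and hypothetical scale values.

```python
def scale_lift_rows(rows, channel_scale, target_scale):
    """Scale spend columns per channel and target columns by one factor."""
    scaled = []
    for row in rows:
        factor = channel_scale[row["channel"]]  # each channel has its own scale
        scaled.append(
            {
                **row,
                "x": row["x"] / factor,
                "delta_x": row["delta_x"] / factor,
                "delta_y": row["delta_y"] / target_scale,
                "sigma": row["sigma"] / target_scale,
            }
        )
    return scaled


lift_rows = [
    {"channel": "tv", "x": 200.0, "delta_x": 50.0, "delta_y": 30.0, "sigma": 6.0},
    {"channel": "search", "x": 40.0, "delta_x": 10.0, "delta_y": 12.0, "sigma": 3.0},
]
channel_scale = {"tv": 400.0, "search": 80.0}  # hypothetical per-channel maxima
target_scale = 60.0                            # hypothetical target maximum

scaled_rows = scale_lift_rows(lift_rows, channel_scale, target_scale)
```

After scaling, both rows have x = 0.5 and delta_x = 0.125, even though the raw spends differ, because each channel is divided by its own factor.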

Utility Functions

create_new_spend_data

def create_new_spend_data(
    spend: np.ndarray,
    adstock_max_lag: int,
    one_time: bool,
    spend_leading_up: np.ndarray | None = None,
) -> np.ndarray:

Prepares spend data for calculating out-of-sample response curves or ROI, potentially padding it based on adstock lag.

Parameters:

  • spend: Array of spend values for each channel.

  • adstock_max_lag: The maximum adstock lag used in the model.

  • one_time: Boolean indicating whether the spend is a one-time impulse (True) or continues in every period (False).

  • spend_leading_up: Optional array representing spend in periods before the main spend array (used for initial adstock state).

Returns:

  • A potentially padded array ready for adstock transformation.
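
One plausible layout for the padded array is sketched below with nested lists. It assumes the behaviour of the upstream pymc-marketing helper as best understood (adstock_max_lag rows of leading spend, then the spend itself, followed by either zeros or repeated spend); this has not been checked against the Abacus copy, and the function name is hypothetical.

```python
def build_spend_rows(spend, adstock_max_lag, one_time, spend_leading_up=None):
    """Return rows (time) by columns (channels) as nested lists."""
    n_channels = len(spend)
    if spend_leading_up is None:
        spend_leading_up = [0.0] * n_channels
    if len(spend_leading_up) != n_channels:
        raise ValueError("spend_leading_up must have one value per channel")

    # Periods before the spend starts, used as the initial adstock state.
    lead_in = [list(spend_leading_up) for _ in range(adstock_max_lag)]

    if one_time:
        # A single impulse followed by zeros so the carryover can play out.
        main = [list(spend)] + [[0.0] * n_channels for _ in range(adstock_max_lag)]
    else:
        # The same spend repeated in every period.
        main = [list(spend) for _ in range(adstock_max_lag + 1)]

    return lead_in + main


impulse = build_spend_rows([10.0, 20.0], adstock_max_lag=2, one_time=True)
continuous = build_spend_rows([10.0, 20.0], adstock_max_lag=2, one_time=False)
```

Under these assumptions the result always has 2 * adstock_max_lag + 1 rows and one column per channel.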


(The private helper function _swap_columns_and_last_index_level is used internally by the scaling functions for DataFrame manipulation and is not part of the public interface.)