Abacus Core: Legacy Imports (_legacy_imports.py)
This module serves as a repository for functions, classes, and enumerations that were originally part of the pymc-marketing library or its dependencies. They have been adapted and included directly within Abacus to remove the external dependency while preserving essential functionality for marketing mix modelling, particularly around adstock effects, saturation, and lift test data handling.
Enumerations
ConvMode
class ConvMode(str, Enum):
    After = "After"
    Before = "Before"
    Overlap = "Overlap"
Defines the modes for applying 1D convolution, determining how boundaries are handled:
After: Trailing decay effect (typical adstock).
Before: Leading effect (“excitement” factor).
Overlap: Effect overlaps preceding and succeeding elements.
WeibullType
class WeibullType(str, Enum):
    PDF = "PDF"
    CDF = "CDF"
Specifies the type of Weibull distribution function to use for adstock calculation:
PDF: Probability Density Function.
CDF: Cumulative Distribution Function (specifically, 1 - CDF is used for decay).
Adstock Functions
These functions apply carryover effects to time-series data, commonly used for modelling advertising impact over time.
batched_convolution
def batched_convolution(
    x,
    w,
    axis: int = 0,
    mode: ConvMode | str = ConvMode.After,
):
Applies a 1D convolution across multiple batch dimensions in a vectorized manner. This is the core function used by other adstock implementations.
Parameters:
x: The array to convolve.
w: The convolution weights (kernel). The last axis determines the number of steps (lag).
axis (int): The axis of x along which to apply the convolution.
mode (ConvMode | str): The convolution mode (After, Before, Overlap).
Returns:
The convolved array, with shape matching x (considering broadcasting with w).
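To make the After mode concrete, here is a minimal NumPy sketch of a causal 1D convolution on a single series. It is not the module's vectorized implementation: the helper name convolve_after is hypothetical, it handles only one axis with no batching, and the Before and Overlap modes are not covered.

```python
import numpy as np


def convolve_after(x, w):
    """Causal 1D convolution: y[t] = sum_i w[i] * x[t - i], zero-padded at the start.

    Hypothetical helper for illustration only (single series, "After" mode).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    y = np.zeros_like(x)
    # Lags beyond the series length cannot contribute, so truncate the kernel.
    for lag, weight in enumerate(w[: len(x)]):
        # Shift x forward by `lag` steps and accumulate the weighted copy.
        y[lag:] += weight * x[: len(x) - lag]
    return y


impulse = np.array([1.0, 0.0, 0.0, 0.0])
kernel = [1.0, 0.5, 0.25]
response = convolve_after(impulse, kernel)
print(response)
```

Feeding a unit impulse returns the kernel itself, which is a quick way to inspect any set of adstock weights.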
geometric_adstock
def geometric_adstock(
    x,
    alpha: float = 0.0,
    l_max: int = 12,
    normalize: bool = False,
    axis: int = 0,
    mode: ConvMode = ConvMode.After,
):
Applies a geometric adstock transformation, where the effect decays geometrically over time.
Parameters:
x: Input tensor.
alpha (float): Retention rate (0 to 1).
l_max (int): Maximum duration of carryover.
normalize (bool): Whether to normalize weights to sum to 1.
axis (int): Axis to apply convolution along.
mode (ConvMode): Convolution mode.
Returns:
Transformed tensor.
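As an illustration of geometric decay, the sketch below builds the weights alpha**0, alpha**1, ..., alpha**(l_max - 1) and applies them with a causal convolution. It is a plain NumPy approximation for a single 1D series, assuming the After mode; the function name geometric_adstock_np is hypothetical and the sketch does not use the module's tensor backend.

```python
import numpy as np


def geometric_adstock_np(x, alpha=0.0, l_max=12, normalize=False):
    """NumPy sketch of geometric adstock for one series (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Weight for lag i is alpha**i; with alpha = 0 only lag 0 survives.
    w = float(alpha) ** np.arange(l_max)
    if normalize:
        w = w / w.sum()
    # Keep the first len(x) outputs so the result aligns with the input.
    return np.convolve(x, w)[: len(x)]


spend = np.array([100.0, 0.0, 0.0, 0.0])
raw = geometric_adstock_np(spend, alpha=0.5, l_max=3)
scaled = geometric_adstock_np(spend, alpha=0.5, l_max=3, normalize=True)
print(raw, scaled)
```

With normalize=True the total carried-over effect of the impulse equals the original spend, which keeps the transformed series on the same scale as the input.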
delayed_adstock
def delayed_adstock(
    x,
    alpha: float = 0.0,
    theta: int = 0,
    l_max: int = 12,
    normalize: bool = False,
    axis: int = 0,
    mode: ConvMode = ConvMode.After,
):
Applies a delayed adstock transformation, allowing the peak effect to be delayed.
Parameters:
x: Input tensor.
alpha (float): Retention rate (0 to 1).
theta (int): Delay of the peak effect (0 to l_max - 1).
l_max (int): Maximum duration of carryover.
normalize (bool): Whether to normalize weights.
axis (int): Axis to apply convolution along.
mode (ConvMode): Convolution mode.
Returns:
Transformed tensor.
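The page does not state the weight formula, so the sketch below assumes a common delayed-adstock form in which the weight at lag i is alpha ** ((i - theta) ** 2). Treat both the formula and the helper name delayed_adstock_weights as assumptions for illustration, not as the module's definition.

```python
import numpy as np


def delayed_adstock_weights(alpha, theta, l_max):
    """Assumed delayed-adstock kernel: alpha ** ((lag - theta) ** 2)."""
    lags = np.arange(l_max)
    # The exponent is zero at lag == theta, so the weight peaks there.
    return float(alpha) ** ((lags - theta) ** 2)


weights = delayed_adstock_weights(alpha=0.5, theta=1, l_max=4)

# Apply the kernel causally to a single spend impulse.
spend = np.array([100.0, 0.0, 0.0, 0.0, 0.0])
delayed = np.convolve(spend, weights)[: len(spend)]
print(weights, delayed)
```

Under this assumed form, theta = 0 gives a kernel that decays from the first period onward, while larger theta values shift the peak later.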
weibull_adstock
def weibull_adstock(
    x,
    lam=1,
    k=1,
    l_max: int = 12,
    axis: int = 0,
    mode: ConvMode = ConvMode.After,
    type: WeibullType | str = WeibullType.PDF,
):
Applies an adstock transformation based on the Weibull distribution (either PDF or CDF).
Parameters:
x: Input tensor.
lam (float): Scale parameter (lambda > 0) of the Weibull distribution.
k (float): Shape parameter (k > 0) of the Weibull distribution.
l_max (int): Maximum duration of carryover.
axis (int): Axis to apply convolution along.
mode (ConvMode): Convolution mode.
type (WeibullType | str): Type of Weibull function (PDF or CDF).
Returns:
Transformed tensor.
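The two kernel shapes can be compared with the closed-form Weibull density and survival function (1 - CDF). The sketch below is only a rough illustration: the lag grid 1..l_max is an assumption, the helper name weibull_kernel is hypothetical, and any rescaling or accumulation the module applies to the weights before convolving is not reproduced here.

```python
import numpy as np


def weibull_kernel(lam=1.0, k=1.0, l_max=12, kind="PDF"):
    """Raw Weibull-shaped weights on an assumed lag grid t = 1..l_max."""
    t = np.arange(1, l_max + 1, dtype=float)
    survival = np.exp(-((t / lam) ** k))  # 1 - CDF of the Weibull distribution
    if kind == "CDF":
        return survival
    if kind == "PDF":
        # Weibull density: (k / lam) * (t / lam)**(k - 1) * exp(-(t / lam)**k)
        return (k / lam) * (t / lam) ** (k - 1) * survival
    raise ValueError(f"unknown kind: {kind!r}")


pdf_w = weibull_kernel(lam=2.0, k=1.0, l_max=5, kind="PDF")
cdf_w = weibull_kernel(lam=2.0, k=1.0, l_max=5, kind="CDF")
print(pdf_w, cdf_w)
```

The survival-based weights always decrease with lag, whereas the density can rise before falling when k > 1, which is what allows a delayed peak.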
Saturation Functions
These functions model the diminishing returns of an input variable, often used for advertising spend.
logistic_saturation
def logistic_saturation(x, lam: npt.NDArray[np.float64] | float = 0.5):
Applies a logistic saturation function: (1 - exp(-lam * x)) / (1 + exp(-lam * x)).
Parameters:
x: Input tensor.
lam (float or array-like): Saturation parameter.
Returns:
Transformed tensor.
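The formula above can be checked numerically with a small NumPy sketch. The function name logistic_saturation_np is hypothetical; the expression is the one given in this section, which is algebraically equal to tanh(lam * x / 2).

```python
import numpy as np


def logistic_saturation_np(x, lam=0.5):
    """NumPy version of (1 - exp(-lam * x)) / (1 + exp(-lam * x))."""
    x = np.asarray(x, dtype=float)
    decay = np.exp(-lam * x)
    return (1.0 - decay) / (1.0 + decay)


x = np.array([0.0, 1.0, 2.0, 5.0, 10.0])
y = logistic_saturation_np(x, lam=0.5)
print(y)
```

The output starts at 0 for zero input and approaches 1 as x grows; larger lam values reach saturation sooner.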
tanh_saturation
def tanh_saturation(
    x: pt.TensorLike,
    b: pt.TensorLike = 0.5,
    c: pt.TensorLike = 0.5,
) -> pt.TensorVariable:
Applies a hyperbolic tangent (tanh) saturation function: b * tanh(x / (b * c)).
Parameters:
x: Input tensor.
b (float): Saturation level (max effect).
c (float): Controls the shape (related to cost per acquisition at the start).
Returns:
Transformed tensor.
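A short NumPy sketch helps show what b and c control. Differentiating b * tanh(x / (b * c)) at x = 0 gives an initial slope of 1 / c, so c can be read as the initial cost per unit of response, while b is the ceiling. The function name tanh_saturation_np is hypothetical.

```python
import numpy as np


def tanh_saturation_np(x, b=0.5, c=0.5):
    """NumPy version of b * tanh(x / (b * c))."""
    x = np.asarray(x, dtype=float)
    return b * np.tanh(x / (b * c))


b, c = 2.0, 0.5
tiny = 1e-6
initial_slope = float(tanh_saturation_np(tiny, b, c)) / tiny  # close to 1 / c
ceiling = float(tanh_saturation_np(1e6, b, c))                # close to b
print(initial_slope, ceiling)
```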
tanh_saturation_baselined
def tanh_saturation_baselined(
    x: pt.TensorLike,
    x0: pt.TensorLike,
    gain: pt.TensorLike = 0.5,
    r: pt.TensorLike = 0.5,
) -> pt.TensorVariable:
Applies a reparameterised tanh saturation function based on a baseline point x0.
Parameters:
x: Input tensor.
x0: Baseline input value.
gain: Return on Ad Spend (ROAS) at x0, defined as f(x0) / x0.
r: Overspend fraction at x0, defined as f(x0) / saturation_level.
Returns:
Transformed tensor.
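The two definitions above pin down the curve. If f(x) = b * tanh(x / (b * c)), then f(x0) = gain * x0 and f(x0) / b = r give b = gain * x0 / r and x0 / (b * c) = arctanh(r), so f(x) = (gain * x0 / r) * tanh(x * arctanh(r) / x0). The sketch below implements this derived expression in NumPy; the function name is hypothetical and the module's own code may be arranged differently.

```python
import numpy as np


def tanh_saturation_baselined_np(x, x0, gain=0.5, r=0.5):
    """Derived form: (gain * x0 / r) * tanh(x * arctanh(r) / x0), for 0 < r < 1."""
    x = np.asarray(x, dtype=float)
    saturation_level = gain * x0 / r
    return saturation_level * np.tanh(x * np.arctanh(r) / x0)


x0, gain, r = 100.0, 2.0, 0.5
y0 = float(tanh_saturation_baselined_np(x0, x0, gain, r))
print(y0 / x0)                 # recovers gain
print(y0 / (gain * x0 / r))    # recovers r
```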
michaelis_menten
def michaelis_menten(
    x: float | np.ndarray | npt.NDArray[np.float64],
    alpha: float | np.ndarray | npt.NDArray[np.float64],
    lam: float | np.ndarray | npt.NDArray[np.float64],
) -> float | Any:
Applies the Michaelis-Menten saturation function: alpha * x / (lam + x).
Parameters:
x: Input value(s).
alpha: Maximum effect (saturation level).
lam: Michaelis constant (value of x at which effect is half of alpha).
Returns:
Transformed value(s).
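The half-saturation reading of lam can be verified directly from the formula. The following NumPy sketch uses a hypothetical function name and simply evaluates alpha * x / (lam + x).

```python
import numpy as np


def michaelis_menten_np(x, alpha, lam):
    """NumPy version of alpha * x / (lam + x)."""
    x = np.asarray(x, dtype=float)
    return alpha * x / (lam + x)


alpha, lam = 10.0, 4.0
at_half = float(michaelis_menten_np(lam, alpha, lam))    # x == lam gives alpha / 2
near_max = float(michaelis_menten_np(1e9, alpha, lam))   # approaches alpha
print(at_half, near_max)
```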
Tanh Saturation Parameter Containers
TanhSaturationParameters
class TanhSaturationParameters(NamedTuple):
    b: pt.TensorLike
    c: pt.TensorLike
A NamedTuple to hold the standard parameters (b: saturation, c: shape/cost) for the tanh_saturation function. Includes a method .baseline(x0) to convert to TanhSaturationBaselinedParameters.
TanhSaturationBaselinedParameters
class TanhSaturationBaselinedParameters(NamedTuple):
    x0: pt.TensorLike
    gain: pt.TensorLike
    r: pt.TensorLike
A NamedTuple to hold the baselined parameters (x0: baseline point, gain: ROAS at x0, r: overspend fraction at x0) for the tanh_saturation_baselined function. Includes methods .debaseline() to convert back to standard parameters and .rebaseline(x1) to change the baseline point.
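The conversions between the two parameterisations follow from the definitions of gain and r. Below is a simplified float-only sketch; the class names TanhParams and TanhBaselinedParams are hypothetical stand-ins, and the real containers operate on tensor-like values rather than plain floats.

```python
import math
from typing import NamedTuple


class TanhParams(NamedTuple):
    b: float  # saturation level
    c: float  # shape / initial cost

    def baseline(self, x0: float) -> "TanhBaselinedParams":
        y0 = self.b * math.tanh(x0 / (self.b * self.c))  # response at x0
        return TanhBaselinedParams(x0=x0, gain=y0 / x0, r=y0 / self.b)


class TanhBaselinedParams(NamedTuple):
    x0: float
    gain: float
    r: float

    def debaseline(self) -> TanhParams:
        # Invert gain = f(x0) / x0 and r = f(x0) / b.
        b = self.gain * self.x0 / self.r
        c = self.r / (self.gain * math.atanh(self.r))
        return TanhParams(b=b, c=c)

    def rebaseline(self, x1: float) -> "TanhBaselinedParams":
        # Go through the standard parameters, then re-express at x1.
        return self.debaseline().baseline(x1)


standard = TanhParams(b=2.0, c=0.8)
baselined = standard.baseline(1.5)
round_trip = baselined.debaseline()
moved = baselined.rebaseline(3.0)
print(baselined, round_trip, moved)
```

A round trip through baseline and debaseline should return the original b and c up to floating-point error, which makes a convenient sanity check.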
Lift Test Scaling Functions
These functions are used to scale data from lift tests (experiments) so they can be incorporated into the model, typically alongside observational time-series data.
scale_channel_lift_measurements
def scale_channel_lift_measurements(
    df_lift_test: pd.DataFrame,
    channel_col: str,
    channel_columns: list[str],
    transform: Callable[[np.ndarray], np.ndarray],
) -> pd.DataFrame:
Scales the spend (x) and spend change (delta_x) columns for lift tests related to specific channels, using a provided scaling function (likely the same scaler used for the main channel data).
Parameters:
df_lift_test: DataFrame containing lift test results. Must include columns x, delta_x, and the specified channel_col.
channel_col: The name of the column identifying the channel for each lift test row.
channel_columns: A list of all channel names used in the model.
transform: The scaling function (e.g., from a sklearn.preprocessing.StandardScaler) to apply.
Returns:
A DataFrame with the same structure as the input, but with x and delta_x values scaled.
scale_target_for_lift_measurements
def scale_target_for_lift_measurements(
    target: pd.Series,
    transform: Callable[[np.ndarray], np.ndarray],
) -> pd.Series:
Scales a target-related Series (like delta_y or sigma from lift tests) using a provided scaling function (likely the same scaler used for the main target variable).
Parameters:
target: A pandas Series containing the values to scale (e.g., df_lift_test['delta_y']).
transform: The scaling function to apply.
Returns:
A pandas Series with the scaled values.
scale_lift_measurements
def scale_lift_measurements(
    df_lift_test: pd.DataFrame,
    channel_col: str,
    channel_columns: list[str | int],
    channel_transform: Callable[[np.ndarray], np.ndarray],
    target_transform: Callable[[np.ndarray], np.ndarray],
) -> pd.DataFrame:
Applies scaling to all relevant columns (x, delta_x, delta_y, sigma) in a lift test DataFrame using appropriate channel and target scalers.
Parameters:
df_lift_test: DataFrame containing lift test results.
channel_col: Name of the channel identifier column.
channel_columns: List of all channel names in the model.
channel_transform: Scaling function for channel spend (x, delta_x).
target_transform: Scaling function for target-related values (delta_y, sigma).
Returns:
A DataFrame with scaled lift test data, ready for model input.
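The overall flow can be sketched with pandas and two simple divide-by-maximum scalers. Everything below is illustrative: the function scale_lift_sketch, the sample data, and the scalers are hypothetical, and the sketch assumes the channel transform expects a 2D array with one column per entry in channel_columns. It does not reproduce the module's internal DataFrame reshaping.

```python
import numpy as np
import pandas as pd

channel_columns = ["tv", "search"]
channel_max = np.array([1000.0, 200.0])  # hypothetical per-channel maxima
target_max = 5000.0                      # hypothetical target maximum


def channel_transform(a):
    return a / channel_max  # expects shape (n_rows, n_channels)


def target_transform(a):
    return a / target_max   # expects shape (n_rows, 1)


def scale_lift_sketch(df, channel_col, channel_columns, channel_transform, target_transform):
    unknown = set(df[channel_col]) - set(channel_columns)
    if unknown:
        raise ValueError(f"channels not in model: {sorted(unknown)}")
    out = df.copy()
    rows = np.arange(len(out))
    position = {name: i for i, name in enumerate(channel_columns)}
    cols = out[channel_col].map(position).to_numpy()
    for name in ["x", "delta_x"]:
        # Place each value in its channel's column, scale, then read it back.
        wide = np.zeros((len(out), len(channel_columns)))
        wide[rows, cols] = out[name].to_numpy()
        out[name] = channel_transform(wide)[rows, cols]
    for name in ["delta_y", "sigma"]:
        out[name] = target_transform(out[name].to_numpy().reshape(-1, 1)).ravel()
    return out


df_lift_test = pd.DataFrame(
    {
        "channel": ["tv", "search"],
        "x": [500.0, 100.0],
        "delta_x": [100.0, 50.0],
        "delta_y": [250.0, 100.0],
        "sigma": [50.0, 25.0],
    }
)
scaled = scale_lift_sketch(
    df_lift_test, "channel", channel_columns, channel_transform, target_transform
)
print(scaled)
```

Each row is scaled with the factor for its own channel, which is why the function needs both the channel identifier column and the full channel list.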
Utility Functions
create_new_spend_data
def create_new_spend_data(
    spend: np.ndarray,
    adstock_max_lag: int,
    one_time: bool,
    spend_leading_up: np.ndarray | None = None,
) -> np.ndarray:
Prepares spend data for calculating out-of-sample response curves or ROI, potentially padding it based on adstock lag.
Parameters:
spend: Array of spend values for each channel.
adstock_max_lag: The maximum adstock lag used in the model.
one_time: Boolean indicating if the spend is a one-time impulse or continuous.
spend_leading_up: Optional array representing spend in periods before the main spend array (used for initial adstock state).
Returns:
A potentially padded array ready for adstock transformation.
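The exact padding layout is not spelled out here, so the sketch below shows one plausible arrangement under stated assumptions: adstock_max_lag rows of leading-up spend, then the spend row followed by adstock_max_lag rows that either repeat it (continuous) or are zero (one-time), then adstock_max_lag trailing zero rows so the carryover can run out. The function name new_spend_sketch is hypothetical and the real function may differ.

```python
import numpy as np


def new_spend_sketch(spend, adstock_max_lag, one_time, spend_leading_up=None):
    """Assumed layout: [leading-up rows, spend rows, trailing zero rows]."""
    spend = np.asarray(spend, dtype=float)
    n_channels = len(spend)
    if spend_leading_up is None:
        spend_leading_up = np.zeros(n_channels)
    spend_leading_up = np.asarray(spend_leading_up, dtype=float)
    if len(spend_leading_up) != n_channels:
        raise ValueError("spend_leading_up must have one value per channel")

    lead = np.tile(spend_leading_up, (adstock_max_lag, 1))
    main = np.tile(spend, (adstock_max_lag + 1, 1))
    if one_time:
        main[1:] = 0.0  # keep only the first period as an impulse
    tail = np.zeros((adstock_max_lag, n_channels))
    return np.vstack([lead, main, tail])


impulse = new_spend_sketch([10.0, 20.0], adstock_max_lag=2, one_time=True)
continuous = new_spend_sketch([10.0, 20.0], adstock_max_lag=2, one_time=False)
print(impulse.shape, continuous.shape)
```

Under this layout each output row is a time period and each column a channel, so the array can be passed to an adstock function along axis 0.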
(Private helper function _swap_columns_and_last_index_level is used internally by scaling functions for DataFrame manipulation.)