Configuration Reference
This document provides a detailed reference for all parameters available in the Abacus MMM config.yaml file.
File Structure Overview
# Data handling
raw_data_granularity: weekly # or daily
train_test_ratio: 0.9 # optional, default 0.9
data_rows: # optional
  total: 156 # optional
  start_date: '2021-01-01' # optional
  end_date: '2023-12-31' # optional

# Column mapping
ignore_cols: # optional
  - unused_column_1
date_col: date # optional, default 'date'
target_col: sales # required
control_columns: # optional
  - competitor_spend
  - is_promo_week
media: # required, list of media channels
  - display_name: TV
    impressions_col: tv_impressions
    spend_col: tv_spend
  - display_name: Search
    impressions_col: search_clicks # Can be clicks, impressions, etc.
    spend_col: search_cost
  # ... more media channels

# Model/Sampler Parameters
tune: 2000 # optional, default 2000
draws: 2000 # optional, default 2000
chains: 4 # optional, default 4
adstock_max_lag: 4 # optional, default 4
yearly_seasonality: 3 # optional, number of Fourier modes (e.g., 3) or null
target_accept: 0.95 # optional, default depends on PyMC sampler (e.g., 0.8 or 0.95)
seed: 42 # optional

# Custom Priors (Optional Section)
custom_priors:
  intercept:
    dist: Normal # e.g., Normal, LogNormal
    kwargs: { mu: 0, sigma: 2 }
  beta_channel:
    dist: HalfNormal # e.g., HalfNormal, Normal
    kwargs: { sigma: 2 }
  alpha: # Adstock decay rate
    dist: Beta
    kwargs: { alpha: 1, beta: 3 }
  lam: # Saturation parameter
    dist: Gamma
    kwargs: { alpha: 3, beta: 1 }
  likelihood:
    dist: Normal # e.g., Normal, StudentT
    kwargs:
      sigma: # Error term distribution
        dist: HalfNormal
        kwargs: { sigma: 2 }
  gamma_control: # Control variable coefficients
    dist: Normal
    kwargs: { mu: 0, sigma: 2 }
  gamma_fourier: # Seasonality coefficients
    dist: Laplace
    kwargs: { mu: 0, b: 1 }
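To illustrate how the required and optional keys interact, the sketch below takes a parsed configuration (a plain dict, as a YAML loader would return), checks for the required keys, and fills in the defaults documented in this reference. The helper `apply_defaults` and both constants are hypothetical names for illustration only. Abacus's own loader may behave differently.

```python
# Hypothetical helper, not part of Abacus; shown only to illustrate the reference.
REQUIRED_KEYS = ("raw_data_granularity", "target_col", "media")

# Defaults as stated in this document.
DEFAULTS = {
    "train_test_ratio": 0.9,
    "date_col": "date",
    "tune": 2000,
    "draws": 2000,
    "chains": 4,
    "adstock_max_lag": 4,
}


def apply_defaults(cfg):
    """Return a copy of cfg with documented defaults filled in.

    Raises ValueError if any required key is missing.
    """
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise ValueError("missing required keys: " + ", ".join(missing))
    # Values supplied by the user take precedence over the defaults.
    return {**DEFAULTS, **cfg}


config = apply_defaults({
    "raw_data_granularity": "weekly",
    "target_col": "sales",
    "media": [{"display_name": "TV",
               "impressions_col": "tv_impressions",
               "spend_col": "tv_spend"}],
    "chains": 2,
})
```

Here `config["chains"]` keeps the user-supplied value, while keys such as `draws` and `date_col` fall back to their documented defaults.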
Parameter Details
Data Handling Options
raw_data_granularity
  Description: The time frequency of the input data rows.
  Type: string
  Allowed Values: "daily", "weekly"
  Required: Yes
train_test_ratio
  Description: Proportion of the data (ordered by date) to use for training. The rest is used for out-of-sample testing. Ignored if data_rows with start_date/end_date is used.
  Type: float
  Range: 0.5 to 1.0
  Default: 0.9
  Required: No
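A minimal sketch of a chronological split driven by train_test_ratio follows. It assumes the rows are already sorted by date and that the training size is rounded down. Both the rounding rule and the function name `split_rows` are assumptions, not confirmed Abacus behaviour.

```python
def split_rows(rows, train_test_ratio=0.9):
    """Split date-ordered rows into (train, test) without shuffling."""
    if not 0.5 <= train_test_ratio <= 1.0:
        raise ValueError("train_test_ratio must be between 0.5 and 1.0")
    # Assumption: the number of training rows is truncated to an integer.
    n_train = int(len(rows) * train_test_ratio)
    return rows[:n_train], rows[n_train:]


weeks = list(range(156))  # stand-in for 156 weekly observations
train, test = split_rows(weeks, 0.9)
```

The split is positional rather than random, so the test set is always the most recent slice of the data.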
data_rows
  Description: Defines the specific subset of data to use based on row count or dates. Overrides train_test_ratio if present.
  Type: object
  Required: No
  Properties:
    - total (Optional int): Total number of rows to use from the start of the dataset.
    - start_date (Optional string): Start date (inclusive, format YYYY-MM-DD).
    - end_date (Optional string): End date (inclusive, format YYYY-MM-DD).
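The sketch below shows one way the data_rows properties could be applied to a list of row dicts. This reference does not say how total combines with the date bounds, so the order used here (filter by the inclusive dates first, then keep the first total rows) is an assumption. The function name `select_rows` is hypothetical.

```python
from datetime import date


def select_rows(rows, data_rows, date_col="date"):
    """Apply inclusive start_date/end_date bounds, then an optional row cap."""
    start = data_rows.get("start_date")
    end = data_rows.get("end_date")
    start_d = date.fromisoformat(start) if start else None
    end_d = date.fromisoformat(end) if end else None

    selected = []
    for row in rows:
        current = date.fromisoformat(row[date_col])
        if start_d is not None and current < start_d:
            continue
        if end_d is not None and current > end_d:
            continue
        selected.append(row)

    # Assumption: total is applied after the date filter.
    total = data_rows.get("total")
    return selected if total is None else selected[:total]


rows = [{"date": d, "sales": i} for i, d in enumerate(
    ["2021-01-01", "2021-01-08", "2021-01-15", "2021-01-22"])]
subset = select_rows(
    rows, {"start_date": "2021-01-08", "end_date": "2021-01-22", "total": 2})
```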
Column Definitions
ignore_cols
  Description: A list of column names from the input CSV to exclude from any processing or modeling.
  Type: list of string
  Required: No
date_col
  Description: The name of the column containing dates. Dates must be parsable, ideally in YYYY-MM-DD format.
  Type: string
  Default: "date"
  Required: No (uses default if omitted)
target_col
  Description: The name of the column representing the target variable (e.g., sales, conversions).
  Type: string
  Required: Yes
control_columns
  Description: A list of column names representing control variables or external factors (e.g., competitor activity, promotions, economic indicators). These are included as linear regressors in the model. Ensure these columns are numeric (encode categorical variables beforehand).
  Type: list of string
  Required: No
media
  Description: A list defining each media channel to be included in the model.
  Type: list of object
  Required: Yes
  Object Properties (for each list item):
    - display_name (string, required): Name used for the channel in reports and plots.
    - impressions_col (string, required): Column name containing the volume metric for the channel (e.g., impressions, clicks, GRPs). Used in the adstock/saturation transformation.
    - spend_col (string, required): Column name containing the cost/spend data for the channel. Used for ROI calculations.
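Because the column-mapping keys all refer to columns in the input CSV, a quick pre-flight check can catch typos before sampling starts. The sketch below compares the column names referenced in a parsed config against a CSV header. `missing_columns` is a hypothetical helper, not an Abacus function.

```python
def missing_columns(cfg, header):
    """Return config-referenced column names that are absent from the CSV header."""
    wanted = [cfg.get("date_col", "date"), cfg["target_col"]]
    wanted += cfg.get("control_columns") or []
    for channel in cfg["media"]:
        # Each media entry names one volume column and one spend column.
        wanted += [channel["impressions_col"], channel["spend_col"]]
    available = set(header)
    return [name for name in wanted if name not in available]


cfg = {
    "target_col": "sales",
    "control_columns": ["competitor_spend"],
    "media": [
        {"display_name": "TV", "impressions_col": "tv_impressions", "spend_col": "tv_spend"},
        {"display_name": "Search", "impressions_col": "search_clicks", "spend_col": "search_cost"},
    ],
}
header = ["date", "sales", "competitor_spend", "tv_impressions", "tv_spend", "search_clicks"]
problems = missing_columns(cfg, header)
```

In this example `search_cost` is referenced by the Search channel but missing from the header, so it is the only name reported.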
Model/Sampler Parameters
tune
  Description: Number of tuning (burn-in) steps for the MCMC sampler.
  Type: int
  Default: 2000
  Required: No
draws
  Description: Number of sampling steps to perform after tuning for each chain.
  Type: int
  Default: 2000
  Required: No
chains
  Description: Number of independent MCMC chains to run. Running multiple chains (>=2) is essential for assessing convergence (e.g., using R-hat).
  Type: int
  Default: 4
  Required: No
adstock_max_lag
  Description: The maximum number of time periods (in units of raw_data_granularity) over which the effect of advertising spend can carry over (adstock).
  Type: int
  Range: 1 to 52 (Warning if > 26)
  Default: 4
  Required: No
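To make the role of the lag window concrete, here is a sketch of a geometric adstock transform in plain Python. This reference does not state which adstock form Abacus uses, so the geometric form, the absence of normalisation, and the convention that the window covers the current period plus max_lag - 1 earlier periods are all assumptions. `geometric_adstock` is a hypothetical name.

```python
def geometric_adstock(series, alpha, max_lag):
    """Carry over each period's value with weights alpha**k for k = 0..max_lag-1."""
    out = []
    for t in range(len(series)):
        total = 0.0
        for k in range(max_lag):
            if t - k < 0:
                break  # no data before the start of the series
            total += (alpha ** k) * series[t - k]
        out.append(total)
    return out


# A single burst of 100 units, decay rate 0.5, window of 4 periods.
carried = geometric_adstock([100.0, 0.0, 0.0, 0.0, 0.0, 0.0], alpha=0.5, max_lag=4)
```

Under these assumptions the burst decays over four periods and contributes nothing afterwards, which is why a larger adstock_max_lag matters mainly for channels with slow decay.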
yearly_seasonality
  Description: The number of Fourier modes to include for modelling yearly seasonality. Set to null or omit to disable this component.
  Type: int or null
  Range: Typically 1 to 10 if used.
  Required: No
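As an illustration of what a Fourier mode contributes, the sketch below builds the sine and cosine features for one observation, given its position within the year as a fraction between 0 and 1. The function name, the feature ordering, and the use of a year fraction as input are assumptions. The model may construct these terms differently.

```python
import math


def fourier_features(year_fraction, n_modes):
    """Return [sin_1, cos_1, ..., sin_n, cos_n] for a position within the year."""
    features = []
    for k in range(1, n_modes + 1):
        angle = 2.0 * math.pi * k * year_fraction
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


# yearly_seasonality: 3 gives 6 seasonal regressors per row.
row_features = fourier_features(0.25, 3)
```

Each mode adds two regressors, so the number of gamma_fourier coefficients grows as 2 × yearly_seasonality. Higher values capture sharper seasonal shapes at the cost of more parameters.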
target_accept
  Description: The target acceptance rate for adaptive MCMC samplers like NUTS. Controls the step size adaptation.
  Type: float
  Range: 0.6 to 0.99
  Default: Depends on the PyMC sampler (often 0.8 or 0.95).
  Required: No
seed
  Description: An integer seed for the random number generator used by the sampler, ensuring reproducibility.
  Type: int
  Required: No
Custom Priors (custom_priors)
This entire section is optional. If omitted, the default priors defined in the model class are used.
  Description: Allows overriding the default prior distributions for model parameters. Each key corresponds to a model parameter group.
  Type: object
  Required: No
  Properties (Parameter Groups):
    - intercept: Prior for the base intercept term.
    - beta_channel: Prior for the effectiveness coefficients of media channels.
    - alpha: Prior for the decay rate parameter in the adstock transformation (typically Beta distribution).
    - lam: Prior for the saturation parameter in the logistic saturation function (typically Gamma or HalfNormal).
    - likelihood: Defines the observation distribution and its parameters (e.g., sigma for Normal likelihood).
    - gamma_control: Prior for the coefficients of control variables (control_columns).
    - gamma_fourier: Prior for the coefficients of Fourier seasonality terms (only used if yearly_seasonality is set in the main parameters).
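To help with choosing a prior for lam, the sketch below evaluates a common logistic saturation form, (1 - exp(-lam * x)) / (1 + exp(-lam * x)). This reference names a logistic saturation function but does not give its formula, so the exact form is an assumption. The function name is hypothetical, and x is assumed to be a non-negative, already scaled media input.

```python
import math


def logistic_saturation(x, lam):
    """Map a non-negative input to [0, 1) with diminishing returns."""
    decay = math.exp(-lam * x)
    return (1.0 - decay) / (1.0 + decay)


# Larger lam saturates sooner: compare the response at the same input level.
slow = logistic_saturation(0.5, 1.0)
fast = logistic_saturation(0.5, 3.0)
```

Under this form, a prior that favours large lam values implies channels reach diminishing returns at low volumes, while small lam values keep the response close to linear over the observed range.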
  Structure (for each parameter group):
    - dist (string, required): Name of the PyMC distribution (e.g., "Normal", "HalfNormal", "Beta", "Gamma", "Laplace", "LogNormal").
    - kwargs (object, required): Dictionary of keyword arguments for the specified PyMC distribution (e.g., mu, sigma, alpha, beta, b). Refer to PyMC documentation for valid parameters for each distribution.
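The likelihood entry in the overview shows that a kwargs value can itself be a prior specification (sigma nested under likelihood). The sketch below walks a spec recursively, checks that dist and kwargs are present at each level, and lists the distribution names it finds. `iter_prior_specs` is a hypothetical helper. It does not check that the names or arguments are valid for PyMC.

```python
def iter_prior_specs(name, spec):
    """Yield (path, dist_name) for a prior spec and any specs nested in its kwargs."""
    for field in ("dist", "kwargs"):
        if field not in spec:
            raise ValueError(f"{name}: missing required field '{field}'")
    yield name, spec["dist"]
    for key, value in spec["kwargs"].items():
        # A dict-valued kwarg is treated as a nested prior specification.
        if isinstance(value, dict):
            yield from iter_prior_specs(f"{name}.{key}", value)


custom_priors = {
    "intercept": {"dist": "Normal", "kwargs": {"mu": 0, "sigma": 2}},
    "likelihood": {
        "dist": "Normal",
        "kwargs": {"sigma": {"dist": "HalfNormal", "kwargs": {"sigma": 2}}},
    },
}
found = [item for group, spec in custom_priors.items()
         for item in iter_prior_specs(group, spec)]
```

A check like this only confirms the structure. Whether a given keyword (for example b for Laplace) is accepted still depends on the PyMC distribution itself.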