Let’s now use the Meridian library with data. The first step is to install Meridian with either pip or poetry:
pip install google-meridian
or
poetry add google-meridian
We will then get the data and start defining columns which are of interest to us.
import pandas as pd

raw_df = pd.read_csv("https://raw.githubusercontent.com/sibylhe/mmm_stan/main/data.csv")
For the control variables, we will use all of the holiday variables in the dataset. Our KPI will be sales, and the time granularity will be weekly.
Next, we will select our media variables. Meridian distinguishes between media data and media spend:
- Media data (or “execution”): contains the exposure metric per channel and time span (such as impressions per time period). Media values must not be negative. When exposure metrics are not available, use the media spend values instead.
- Media spend: contains the media spending per channel and time span. The media data and media spend must have the same dimensions.
When should you use spend vs. execution?
It is usually recommended to use exposure metrics as direct inputs into the model, as they represent how media activity has been received by consumers. However, no one plans a budget using execution data. If you use MMM to optimize budget planning, my advice would be to use the data you control, i.e. spend.
Loading the data
In our use case, we will only use the spends from 5 channels: Newspaper, Radio, TV, Social Media and Online Display.
# 1. control variables
CONTROL_COLS = [col for col in raw_df.columns if 'hldy_' in col]

# 2. media variables
spends_mapping = {
"mdsp_nsp": "Newspaper",
"mdsp_audtr": "Radio",
"mdsp_vidtr": "TV",
"mdsp_so": "Social Media",
"mdsp_on": "Online Display",
}
MEDIA_COLS = list(spends_mapping.keys())
# 3. sales variables
SALES_COL = "sales"
# 4. Date column
DATE_COL = "wk_strt_dt"
data_df = raw_df[[DATE_COL, SALES_COL, *MEDIA_COLS, *CONTROL_COLS]].copy()
data_df[DATE_COL] = pd.to_datetime(data_df[DATE_COL])
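Before handing the data over to Meridian, it is worth checking the constraint mentioned above (media values must not be negative) and that the weekly index is complete. A minimal sanity check with plain pandas:
# Sanity checks before loading into Meridian:
# media spends must be non-negative and the weekly index should be gap-free.
assert (data_df[MEDIA_COLS] >= 0).all().all(), "Negative media values found"
assert data_df[DATE_COL].is_monotonic_increasing, "Dates are not sorted"
assert data_df[DATE_COL].diff().dropna().eq(pd.Timedelta(weeks=1)).all(), "Gaps in the weekly index"
print(f"{len(data_df)} weeks, {len(MEDIA_COLS)} media channels, {len(CONTROL_COLS)} controls")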
We will then map the columns to their data type so that Meridian can understand them. The CoordToColumns object will help us do that, and requires some mandatory information:
- time: the time column (usually a date, day or week)
- controls: the control variables
- kpi: the response we want the model to predict. In our case, the sales column (we will declare its type as revenue when loading the data).
- media: the media execution data (impressions, clicks, etc.), or the spends if we have no execution data. In our case, we will put the spends.
- media_spend: the media spends.
There are several other parameters which can be used, namely the geo parameter if we have several groups (geographies, for example), as well as population, reach and frequency. Details about these are out of scope here, but the documentation can be found here.
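For reference, a geo-level mapping could look like the sketch below. The field names (geo, population, reach, frequency, rf_spend) follow the Meridian documentation, but the column names are hypothetical placeholders, not columns from our dataset:
# Hedged sketch of a geo-level mapping. The column names ("dma",
# "population", "reach_yt", ...) are hypothetical placeholders.
geo_coord_to_columns = load.CoordToColumns(
    time=DATE_COL,
    geo="dma",                # hypothetical geography column
    population="population",  # hypothetical population-per-geo column
    controls=CONTROL_COLS,
    kpi=SALES_COL,
    media=MEDIA_COLS,
    media_spend=MEDIA_COLS,
    reach=["reach_yt"],       # hypothetical reach & frequency channels
    frequency=["freq_yt"],
    rf_spend=["spend_yt"],
)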
We can therefore create our column mappings:
from meridian.data import load

coord_to_columns = load.CoordToColumns(
time=DATE_COL,
controls=CONTROL_COLS,
kpi=SALES_COL,
media=MEDIA_COLS,
media_spend=MEDIA_COLS,
)
Next, we will use our dataframe and the column mappings to create a data object to be used by the model.
loader = load.DataFrameDataLoader(
df=data_df,
kpi_type='revenue',
coord_to_columns=coord_to_columns,
media_to_channel=spends_mapping,
media_spend_to_channel=spends_mapping
)
data = loader.load()
Exploring the data
Sales
import matplotlib.pyplot as plt

COLORS = ["#4C72B0", "#DD8452"]  # assumed plot palette, defined here so the snippet runs standalone

fig, ax = plt.subplots()
data_df.set_index(DATE_COL)[SALES_COL].plot(color=COLORS[1], ax=ax)
ax.set(title="Sales", xlabel="date", ylabel="sales")
fig.tight_layout()
There seems to be a nice seasonality, with peaks around Christmas. The trend is overall flat, with a level oscillating between 50 and 150M.
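A quick way to confirm the Christmas peaks is to average weekly sales by calendar month:
# Average weekly sales per calendar month to confirm the year-end peaks.
monthly_avg = (
    data_df.assign(month=data_df[DATE_COL].dt.month)
    .groupby("month")[SALES_COL]
    .mean()
)
print(monthly_avg.round(0))  # December should stand out if the peaks are real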
Media Spends
fig, ax = plt.subplots(5, figsize=(20, 30))

for axis, channel in zip(ax, MEDIA_COLS):
    data_df.set_index(DATE_COL)[channel].plot(ax=axis, color=COLORS[1])
    axis.legend(title="Channel", fontsize=12)
    axis.set(title=spends_mapping[channel], xlabel="Date", ylabel="Spend")

fig.tight_layout()
We observe a clearly decreasing trend for Newspaper, correlated with an increasing trend for Social Media. Spends also seem to increase at or just before Christmas.
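To quantify the Newspaper-to-Social shift, we can compare each channel’s share of the total spend year by year:
# Yearly share of spend per channel, to quantify the mix shift over time.
yearly = data_df.assign(year=data_df[DATE_COL].dt.year).groupby("year")[MEDIA_COLS].sum()
spend_share = yearly.div(yearly.sum(axis=1), axis=0).rename(columns=spends_mapping)
print(spend_share.round(2))  # Newspaper share should fall, Social Media should rise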
Specifying the Model
Building the model and choosing the right parameters can be quite complex, as there are a lot of options available. I will share my findings here, but feel free to explore by yourself.
The first part is to choose the priors for our media spends. We will use the PriorDistribution class, which allows us to define several variables. You can change the priors of almost any parameter of the model (mu, tau, gamma, beta, etc.), but for now we will only focus on the beta, which are the coefficients of our media variables. My recommendation is, if you are using spends only, to use beta_m. You can choose roi_m or mroi_m instead, but you will need to adapt the code to use a different prior.
import tensorflow_probability as tfp
from meridian import constants
from meridian.model import prior_distribution

prior = prior_distribution.PriorDistribution(
    beta_m=tfp.distributions.HalfNormal(
        0.2,
        name=constants.BETA_M,
    ),
    # If you want to use the ROI vision instead of the coefficients approach:
    # roi_m=tfp.distributions.HalfNormal(
    #     0.2,
    #     name=constants.ROI_M,
    # ),
)
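Before fitting, it is worth eyeballing the scale of that HalfNormal(0.2) prior by sampling from it directly:
# Sample from the HalfNormal(0.2) prior to check its scale before fitting.
import matplotlib.pyplot as plt

samples = tfp.distributions.HalfNormal(0.2).sample(10_000)
plt.hist(samples.numpy(), bins=50)
plt.title("beta_m prior: HalfNormal(0.2)")
plt.xlabel("coefficient value")
plt.show()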
When defining the model specifications, you will then be able to define:
- the priors (cf. above).
- max_lag: the maximum number of lag periods (≥ 0) to include in the Adstock calculation. I recommend choosing between 2 and 6.
- paid_media_prior_type: if you choose to model the beta_m, then choose 'coefficient'. Else, choose 'roi' or 'mroi'.
- knots: Meridian applies automatic seasonality adjustment through a time-varying intercept approach, controlled by the knots value. You can set it to 1 (constant intercept, no seasonality modelling) or to a given number, which must be lower than the length of the data. A low value could lead to a low baseline; a high value could lead to overfitting and a baseline eating everything. I recommend setting it to 10% of the number of data points.
It is also possible to define a train-test split to avoid overfitting via the holdout_id parameter. I won’t cover it here, but it is a best practice to have this split done for model selection.
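As a hedged sketch (the exact shape expected by ModelSpec depends on whether you have geos; check the docs), a time-based holdout mask for a national model could look like this:
import numpy as np

# Hedged sketch: mark the last 20% of weeks as holdout. For a national model,
# the mask is a boolean array over time (see the docs for the geo-level case).
n_times = len(data_df)
holdout_id = np.zeros(n_times, dtype=bool)
holdout_id[int(0.8 * n_times):] = True  # last 20% of weeks held out
# model_spec = spec.ModelSpec(..., holdout_id=holdout_id)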
In a nutshell:
from meridian.model import spec
from meridian.model import model

model_spec = spec.ModelSpec(
prior=prior,
max_lag=6,
knots=int(0.1*len(data_df)),
paid_media_prior_type='coefficient',
)
mmm = model.Meridian(input_data=data, model_spec=model_spec)
Running the model
Fitting the model can be slow if you have a large number of data points and variables. I recommend starting with 2 chains and leaving the default number of samples:
mmm.sample_prior(500)
mmm.sample_posterior(n_chains=2, n_adapt=500, n_burnin=500, n_keep=1000)
Model Diagnostics
Once the model is done running, we will perform a series of checks to ensure that we can use it confidently.
1. R-hat
R-hat values close to 1.0 indicate convergence. R-hat < 1.2 indicates approximate convergence and is a reasonable threshold for many problems.
A lack of convergence typically has one of two culprits. Either the model is poorly specified for the data, which can be an issue in the likelihood (model specification) or in the prior; or there is not enough burn-in, meaning n_adapt + n_burnin is not large enough.
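To make the statistic concrete, here is a toy numpy version of the (non-split) R-hat for a single parameter: it compares between-chain and within-chain variance.
import numpy as np

# Toy (non-split) R-hat for m chains of n draws of a single parameter.
def rhat(chains: np.ndarray) -> float:
    m, n = chains.shape                    # m chains, n draws each
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)        # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()  # within-chain variance
    var_est = (n - 1) / n * W + B / n      # pooled posterior variance estimate
    return float(np.sqrt(var_est / W))

chains = np.random.normal(size=(2, 1000))  # two well-mixed chains
print(rhat(chains))                        # ~1.0 when the chains agree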
from meridian.analysis import visualizer

model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.plot_rhat_boxplot()
We see that all R-hat values are below 1.02, which indicates no divergence or issue during training.
2. Model trace
The model trace contains the sampled values from the chains. A nice trace is when the two posterior distributions (as we have 2 chains) for a given parameter overlap nicely. In the diagram below, you can see that the blue and black lines on the left-hand side overlap nicely:
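If you want to draw the trace yourself, Meridian keeps its draws as an arviz InferenceData object (exposed as mmm.inference_data per the docs; treat the attribute and variable names below as assumptions):
import arviz as az

# Hedged sketch: plot the trace of the media coefficients from the
# arviz InferenceData object that Meridian stores on the model.
az.plot_trace(mmm.inference_data, var_names=["beta_m"], compact=False)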
3. Prior vs Posterior distributions
To know if our model has learned during fitting, we will compare prior vs. posterior distributions. If they perfectly overlap, this means that our model has not shifted its prior distributions, and therefore has probably not learned anything, or that the priors were misspecified. To make sure our model has learned, we would like to see a slight shift in distributions:
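The comparison chart can be produced with the diagnostics visualizer; the method name below is taken from the Meridian docs, so double-check it against your installed version:
# Prior vs. posterior comparison chart (method name as per the Meridian docs).
model_diagnostics.plot_prior_and_posterior_distribution()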
We clearly see that the priors and posteriors don’t overlap. For TV and Social Media, for example, the orange HalfNormal priors have shifted to the blue quasi-Normal posteriors.
4. R2 and Model Fit
Finally, we will use metrics to evaluate our model fit. You probably know about metrics like R2, MAPE, etc., so let’s have a look at those values:
model_diagnostics = visualizer.ModelDiagnostics(mmm)
model_diagnostics.predictive_accuracy_table()
Obviously, an R2 of 0.54 is not great at all. We could improve it by adding more knots to the baseline, adding more data to the model, or playing with the priors to try to capture more information.
Let’s now plot the model:
model_fit = visualizer.ModelFit(mmm)
model_fit.plot_model_fit()
Contributions of media to sales
Remember that one of the objectives of MMM is to provide you with media contributions vs. your sales. This is what we will look at with a waterfall diagram:
media_summary = visualizer.MediaSummary(mmm)
media_summary.plot_contribution_waterfall_chart()
What we usually expect is to have a baseline between 60 and 80%. Keep in mind that this value can be very sensitive and depends on the model specification and parameters. I encourage you to play with different knots values and priors and see the impact it can have on the model.
Spends vs Contributions
The spend versus contribution chart compares the spend and incremental revenue or KPI split between channels. The green bar highlights the return on investment (ROI) for each channel.
media_summary.plot_roi_bar_chart()
We see that the highest ROI comes from Social Media, followed by TV. But this is also where the uncertainty interval is the largest. MMM is not an exact answer: it gives you values AND the uncertainty associated with them. My opinion here is that the uncertainty intervals are very large. Maybe we should use more sampling steps or add more variables to the model.
Optimizing our budget
Remember that one of the objectives of the MMM is to propose an optimal allocation of spends to maximize revenue. This can be done first by looking at what we call response curves. Response curves describe the relationship between marketing spend and the resulting incremental revenue.
We can see there that:
- incremental revenue increases as spend increases
- for some touchpoints like Newspaper, growth is slower, meaning a 2x increase in spend will not translate into a 2x increase in incremental revenue.
The goal of the optimization is to take those curves and navigate them to find the best combination of values that maximizes our sales equation. We know that sales = f(media, control, baseline), and we are trying to find the media* values that maximize our function.
We can choose between several optimization problems, for example:
- How can I reach the same sales level with less budget?
- Given the same budget, what is the maximum revenue I can get? (A toy sketch of this scenario follows below.)
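To see what such an optimizer does under the hood, here is a toy version of the fixed-budget problem using scipy, with made-up saturating response curves; the channel names and curve parameters are illustrative only, not taken from the fitted model:
import numpy as np
from scipy.optimize import minimize

# Toy fixed-budget allocation over made-up saturating response curves
# (ceiling, half-saturation); not the fitted Meridian curves.
curves = {"TV": (120.0, 80.0), "Social Media": (100.0, 50.0), "Radio": (40.0, 30.0)}

def revenue(spend: np.ndarray) -> float:
    # Diminishing-returns response: ceiling * spend / (spend + half_saturation)
    return sum(c * s / (s + h) for s, (c, h) in zip(spend, curves.values()))

total_budget = 150.0
res = minimize(
    lambda s: -revenue(s),  # maximize revenue under a fixed total budget
    x0=np.full(len(curves), total_budget / len(curves)),
    bounds=[(0, total_budget)] * len(curves),
    constraints={"type": "eq", "fun": lambda s: s.sum() - total_budget},
)
print(dict(zip(curves, res.x.round(1))), "->", round(revenue(res.x), 1))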
Let’s use Meridian to optimize our budget and maximize sales (the second scenario). We will use the default parameters here, but it is possible to fine-tune the constraints on each channel to limit the search scope.
from meridian.analysis import optimizer

budget_optimizer = optimizer.BudgetOptimizer(mmm)
optimization_results = budget_optimizer.optimize()
# Plot the response curves before and after
optimization_results.plot_response_curves()
We can see that the optimizer recommends decreasing the spend on Newspaper and Online Display, and increasing the spend on Radio, Social Media and TV.
How does it translate in terms of revenue?
A 3% increase in revenue just by rebalancing our budget! Of course, this conclusion is a bit hasty. First, replaying the past is easy: you have no guarantee that your baseline sales (60%) would behave the same next year. Think of Covid. Second, our model does not account for interactions between channels. What we have used here is a simple additive model, but some approaches use a log-log multiplicative model to account for interactions between variables. Third, there is uncertainty in our response curves which is not handled by the optimizer, as it only takes the average response curve for each channel. Response curves with uncertainty look like the picture below, and optimizing under uncertainty becomes a lot more complex:
However, it still gives you an idea of where you are maybe over or under-spending.
MMM is a complex but powerful tool that can uncover insights from your marketing data, help you understand your marketing efficiency, and assist you in budget planning. The new methods relying on Bayesian inference provide nice features such as adstock and saturation modelling, incorporation of geographic-level data, uncertainty quantification and optimization capabilities. Happy coding.