Tuesday, 30 January 2018

Marketing Mix Modeling: Sales Driver Analysis, Media Selection Model and Advertisement Elasticity using R


In my previous video I gave an overview of Marketing Mix Modeling in R. Now I will show how to use Marketing Mix Modeling to choose the media that work best for the organization, i.e. the Media Selection Model.

What is Media Selection Model?

Media Selection Model is marketing terminology; a statistician would call it feature selection for a linear model. The model helps you find the combination of media that are statistically significant and effectively drive sales.

How is it different from Sales Driver Analysis?

Sales Driver Analysis gives you insight into what is driving your sales and how much each driver contributes. Media Selection helps you choose the best combination of media.

Here I will carry out a stepwise process for feature selection.

I am using a dataset from Kaggle; you can find it here: https://www.kaggle.com/sazid28/advertising.csv/data
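The post does not show how sales.data gets loaded, so here is a minimal sketch of that step. To keep it runnable without the download, it writes the six rows printed by head() in this post to a temporary file and reads them back. The file layout (an unnamed first column holding the row index) is my assumption; with the real download you would point csv_path at that file instead.

```r
# Stand-in for the downloaded file: only the six rows printed by head() in this post.
# Assumption: the first column of the file has no header name.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"",TV,radio,newspaper,sales',
  "1,230.1,37.8,69.2,22.1",
  "2,44.5,39.3,45.1,10.4",
  "3,17.2,45.9,69.3,9.3",
  "4,151.5,41.3,58.5,18.5",
  "5,180.8,10.8,58.4,12.9",
  "6,8.7,48.9,75.0,7.2"
), csv_path)

# read.csv() gives an unnamed column the name "X"
sales.data <- read.csv(csv_path)
str(sales.data)
```

If the index column comes in as X like this, it carries no information for the model and can be dropped.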

Now let’s see the dataset

head(sales.data)
##   X    TV radio newspaper sales
## 1 1 230.1  37.8      69.2  22.1
## 2 2  44.5  39.3      45.1  10.4
## 3 3  17.2  45.9      69.3   9.3
## 4 4 151.5  41.3      58.5  18.5
## 5 5 180.8  10.8      58.4  12.9
## 6 6   8.7  48.9      75.0   7.2

X is just a row index, so we will remove it and fit the linear models, adding one medium at a time.

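# Drop the first column (the row index X)
df <- sales.data[-1]

# Nested models: start with TV, then add radio, then newspaper
fit.lm1 <- lm(formula = sales ~ TV, data = df)
fit.lm2 <- update(fit.lm1, . ~ . + radio)
fit.lm3 <- update(fit.lm2, . ~ . + newspaper)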

Sales Driver Analysis

Let's look at the summary of the full linear model:

summary(fit.lm3)
## 
## Call:
## lm(formula = sales ~ TV + radio + newspaper, data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.8277 -0.8908  0.2418  1.1893  2.8292 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.938889   0.311908   9.422   <2e-16 ***
## TV           0.045765   0.001395  32.809   <2e-16 ***
## radio        0.188530   0.008611  21.893   <2e-16 ***
## newspaper   -0.001037   0.005871  -0.177     0.86    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.686 on 196 degrees of freedom
## Multiple R-squared:  0.8972, Adjusted R-squared:  0.8956 
## F-statistic: 570.3 on 3 and 196 DF,  p-value: < 2.2e-16

If you look at the complete model, you can see which variables are significant and which are not. TV and radio are highly significant (p < 2e-16) while newspaper is not (p = 0.86), which tells me that TV and radio are the media driving sales.
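Significance tells you which media matter, but not how much each one contributes. One common way to put a number on that is the elasticity at the mean, coefficient * mean(spend) / mean(sales). In a linear model with an intercept this is also the share of average sales attributed to that medium. The sketch below is only an illustration: demo is simulated data with the same column names as df and made-up coefficients, so swap in your own data frame to get real numbers.

```r
# Simulated stand-in for df (same column names, made-up coefficients)
set.seed(42)
n <- 200
demo <- data.frame(
  TV        = runif(n, 0, 300),
  radio     = runif(n, 0, 50),
  newspaper = runif(n, 0, 100)
)
demo$sales <- 3 + 0.045 * demo$TV + 0.19 * demo$radio + rnorm(n, sd = 1.7)

fit <- lm(sales ~ TV + radio + newspaper, data = demo)

media <- c("TV", "radio", "newspaper")
b     <- coef(fit)[media]          # estimated slope per medium
xbar  <- colMeans(demo[media])     # average spend per medium
ybar  <- mean(demo$sales)          # average sales

# Elasticity at the mean = share of average sales attributed to each medium
elasticity <- b * xbar / ybar

# Share of average sales left to the intercept (base sales)
base_share <- unname(coef(fit)["(Intercept)"]) / ybar

round(c(base = base_share, elasticity), 3)
```

Because an OLS fit with an intercept passes through the means, the base share and the three media shares add up to 1, which makes this a handy sanity check.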

Media Selection Model

Now let's fit the models stepwise, adding one variable at a time:

y=b0+b1x1+error

then

y=b0+b1x1+b2x2+error

and so on. Now see the combined summary of all the models

library(memisc)  # provides mtable()
mtable(fit.lm1,fit.lm2,fit.lm3)
## 
## Calls:
## fit.lm1: lm(formula = sales ~ TV, data = df)
## fit.lm2: lm(formula = sales ~ TV + radio, data = df)
## fit.lm3: lm(formula = sales ~ TV + radio + newspaper, data = df)
## 
## =========================================================
##                     fit.lm1      fit.lm2      fit.lm3    
## ---------------------------------------------------------
##   (Intercept)        7.033***     2.921***     2.939***  
##                     (0.458)      (0.294)      (0.312)    
##   TV                 0.048***     0.046***     0.046***  
##                     (0.003)      (0.001)      (0.001)    
##   radio                           0.188***     0.189***  
##                                  (0.008)      (0.009)    
##   newspaper                                   -0.001     
##                                               (0.006)    
## ---------------------------------------------------------
##   R-squared          0.612        0.897        0.897     
##   adj. R-squared     0.610        0.896        0.896     
##   sigma              3.259        1.681        1.686     
##   F                312.145      859.618      570.271     
##   p                  0.000        0.000        0.000     
##   Log-likelihood  -519.046     -386.197     -386.181     
##   Deviance        2102.531      556.914      556.825     
##   AIC             1044.091      780.394      782.362     
##   BIC             1053.986      793.587      798.854     
##   N                200          200          200         
## =========================================================

Normally we would compare the models on R^2, but the table shows that “fit.lm2” and “fit.lm3” have the same R^2 of 89.7%. In that case we can look for the lowest AIC and BIC. We can also look at Mallows' Cp if AIC and BIC do not give a clear answer. Here both AIC (780.394) and BIC (793.587) are lowest for fit.lm2, so we can choose it as the best model out of these combinations.
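If you want these criteria side by side without reading them off the table, something like the following should work. It is a sketch on simulated data (demo, with made-up coefficients), and mallows_cp is a small helper I wrote for illustration, not a function from a package. It uses the textbook definition Cp = RSS_p / s2_full - n + 2p, where p counts the coefficients including the intercept and s2_full is the residual variance of the largest model.

```r
# Simulated stand-in for df (same column names, made-up coefficients)
set.seed(42)
n <- 200
demo <- data.frame(
  TV        = runif(n, 0, 300),
  radio     = runif(n, 0, 50),
  newspaper = runif(n, 0, 100)
)
demo$sales <- 3 + 0.045 * demo$TV + 0.19 * demo$radio + rnorm(n, sd = 1.7)

m1 <- lm(sales ~ TV, data = demo)
m2 <- update(m1, . ~ . + radio)
m3 <- update(m2, . ~ . + newspaper)
models <- list(m1 = m1, m2 = m2, m3 = m3)

# Residual variance of the largest model, used as the reference for Cp
s2_full <- summary(m3)$sigma^2

# Hypothetical helper: Mallows' Cp of model m relative to the full model
mallows_cp <- function(m) {
  rss <- sum(residuals(m)^2)
  p   <- length(coef(m))          # coefficients including the intercept
  rss / s2_full - nobs(m) + 2 * p
}

comparison <- data.frame(
  adj_r2 = sapply(models, function(m) summary(m)$adj.r.squared),
  AIC    = sapply(models, AIC),
  BIC    = sapply(models, BIC),
  Cp     = sapply(models, mallows_cp)
)
print(round(comparison, 3))
```

For all three criteria lower is better. For the full model Cp always equals its number of coefficients, so the interesting rows are the smaller models: a Cp close to p suggests little is lost by dropping the extra variables.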

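You can also look at the intercept, which represents base sales with no media spend. It drops from 7.033 in fit.lm1 to 2.921 in fit.lm2, so more of the sales is being explained by the independent variables. Treat this as a rough indication rather than a formal selection criterion.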
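The manual comparison can also be automated with step() from base R, which adds variables one at a time and keeps a variable only if it lowers the criterion. The sketch below runs forward selection on simulated data (demo, with made-up coefficients), once with the default AIC penalty and once with the BIC penalty k = log(n). The choice of forward direction is mine; step() also supports "backward" and "both".

```r
# Simulated stand-in for df (same column names, made-up coefficients)
set.seed(42)
n <- 200
demo <- data.frame(
  TV        = runif(n, 0, 300),
  radio     = runif(n, 0, 50),
  newspaper = runif(n, 0, 100)
)
demo$sales <- 3 + 0.045 * demo$TV + 0.19 * demo$radio + rnorm(n, sd = 1.7)

# Start from the intercept-only model; the scope formula is the largest model allowed
null_model <- lm(sales ~ 1, data = demo)
full_scope <- sales ~ TV + radio + newspaper

# Forward selection by AIC (penalty k = 2 is the default)
sel_aic <- step(null_model, scope = full_scope, direction = "forward", trace = 0)

# Same search with the BIC penalty k = log(n)
sel_bic <- step(null_model, scope = full_scope, direction = "forward",
                k = log(nrow(demo)), trace = 0)

formula(sel_aic)
formula(sel_bic)
```

Set trace = 1 if you want to see each step of the search printed. BIC penalizes extra variables more heavily than AIC, so it tends to pick the smaller media mix when the two disagree.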

What is the challenge?

Looking at this, we can say the model explains 89.7% of the variance in sales and leaves 10.3% unexplained. It is up to the data scientist to decide whether that is good enough for the model to be deployed.
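To make the explained versus unexplained split concrete, here is a short sketch of where R^2 comes from: it is 1 minus the residual sum of squares divided by the total sum of squares. Again, demo is simulated data with made-up coefficients, so the numbers will not match the table in this post.

```r
# Simulated stand-in for df (same column names, made-up coefficients)
set.seed(42)
n <- 200
demo <- data.frame(
  TV        = runif(n, 0, 300),
  radio     = runif(n, 0, 50),
  newspaper = runif(n, 0, 100)
)
demo$sales <- 3 + 0.045 * demo$TV + 0.19 * demo$radio + rnorm(n, sd = 1.7)

fit <- lm(sales ~ TV + radio, data = demo)

rss <- sum(residuals(fit)^2)                     # variation the model misses
tss <- sum((demo$sales - mean(demo$sales))^2)    # total variation in sales

r2          <- 1 - rss / tss    # explained share
unexplained <- rss / tss        # the part left over

round(c(explained = r2, unexplained = unexplained), 3)
```

The explained share computed this way should agree with summary(fit)$r.squared.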

Conclusion

Along with MMM, we should use ROI analysis and adstock analysis to make better decisions.