My friend and colleague James Taylor asked me last week to comment on a question regarding statistics vs. predictive analytics. The bulk of my reply is on James' blog; my full reply is here, reworked from my initial response to clarify some points further.
I have always loved reading the green "Sage" books, such as Understanding Regression Assumptions (Quantitative Applications in the Social Sciences) or Missing Data (Quantitative Applications in the Social Sciences).
In data mining and predictive analytics, the data is king. The algorithms often induce the model itself from the data (decision trees do this), and even when they only fit coefficients (as neural networks do), it's the accuracy that matters rather than the coefficients. Often, in the data mining world, we don't have to explain precisely why individuals behave as they do, so long as we can explain generally how they will behave. Model interpretation usually means describing trends, such as the sensitivity of the predictions to each variable or the relative importance of the variables.
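To make that concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a recipe from any particular tool: the data is synthetic, the "tree" is a single-split decision stump, and the names fit_stump, accuracy and permutation_importance are made up for this example. The stump picks its own split variable and threshold from the data, is judged by accuracy, and is interpreted by how much accuracy drops when one variable is shuffled.

```python
import random

def fit_stump(rows, labels):
    """Search every (feature, threshold, left_label) split and keep the
    one with the highest training accuracy. The structure comes from the data."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            for left_label in (0, 1):
                preds = [left_label if r[j] <= t else 1 - left_label for r in rows]
                acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
                if best is None or acc > best[0]:
                    best = (acc, j, t, left_label)
    _, j, t, left_label = best
    return j, t, left_label

def predict(stump, row):
    j, t, left_label = stump
    return left_label if row[j] <= t else 1 - left_label

def accuracy(stump, rows, labels):
    return sum(predict(stump, r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(stump, rows, labels, feature, rng):
    """Drop in accuracy after shuffling one feature's column."""
    base = accuracy(stump, rows, labels)
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    shuffled = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
    return base - accuracy(stump, shuffled, labels)

rng = random.Random(0)
# Synthetic data (an assumption for illustration): the label depends only on
# feature 0, and feature 1 is pure noise.
rows = [[rng.random(), rng.random()] for _ in range(200)]
labels = [1 if r[0] > 0.5 else 0 for r in rows]

stump = fit_stump(rows, labels)
importances = [permutation_importance(stump, rows, labels, j, rng) for j in range(2)]

print("split feature:", stump[0], "threshold:", round(stump[1], 3))
print("accuracy:", accuracy(stump, rows, labels))
print("importances:", importances)
```

Nothing in this sketch estimates a coefficient or tests a hypothesis. The model is described by how well it predicts and by which variable it is sensitive to, which is the style of interpretation I mean above.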
I have always found David Hand's summaries of the two disciplines very useful, such as this one here; I found that he had a healthy respect for both disciplines.
Comments:
Thanks for the interesting post! Also, a nice article about statistics and data mining is this one: http://www.tdan.com/view-articles/5226