I just got back from last week's eMetrics conference in San Jose, CA (the latest, and my first), and was very impressed by its practical nature. It was also quite a different experience for me to be in a setting where I knew very few people. I was there to co-present "Behavioral Driven Marketing Attribution" with Angel Morales. Angel and I are co-founders of SmarterRemarketer, a new web analytics company, and the solution we described addresses just one of the nuts we are trying to crack in the industry.
This post, though, is about the overlap between web analytics and predictive analytics, of which there is very little right now. It really is a different world, and for many of the people I spoke with, the mere mention of "predictive analytics" drew one of those blank looks. In fairness, much of what was said to me drew the same look from me!
One such topic was that of "use cases", a term used over and over in talks, but one that I don't encounter in the data mining world. We describe "case studies", but a "use case" is a smaller and more specific example of something interesting or unusual in how individuals or groups of individuals interact with web sites (I hope I got that right). The key, though, is that this is a thread of usage. In data mining, it is more typical to build predictive models first and then, to understand why the models are the way they are, trace through some of the more interesting branches of a tree or unusual variable combinations, which is something similar to this "use case" idea.
First, what to commend... The analyses I saw were quite good: customer segmentation, A/B testing, web page layout, some attribution, etc. There was a great keynote by Joe Megibow of Expedia describing how Expedia's entire web presence has changed in the past year. One of my favorite bloggers, Kevin Hillstrom of MineThatData fame, gave a presentation praising the power of conditional probabilities (very nice!). Lastly, there was one more keynote by someone I had never heard of (not to my credit), but who is obviously a great communicator and is well known in the web analytics world: Avinash Kaushik. One idea I liked very much from his keynote was the long tail: the tail of the distribution of keywords that bring visitors to his website accounts for many times more visits than his top 10 keywords. In the data mining world, of course, this would push us to characterize these sparsely populated items differently so that they have more influence in any predictive models. Lots to think about.
But I digress. The lack of data mining and predictive analytics at this conference raises (at least for me) the question: why not? Web analysts are swimming in data and have important business questions that need answers, and clearly not all of those questions are being answered well enough. That will be the subject of my next post.