Thursday, July 24, 2014

Similarities and Differences Between Predictive Analytics and Business Intelligence

I’ve been reminded recently of the overlap between business intelligence and predictive analytics. Of course any reader of this blog (or at least the title of the blog) knows I live in the world of data mining (DM) and predictive analytics (PA), not the world of business intelligence (BI). In general, I don’t make comments about BI because I am an outsider looking in. Nevertheless, I view BI as a sibling to PA because we share so much in common: we use the same data, often use similar metrics and even sometimes use the same tools in our analyses.

I was interviewed by Victoria Garment of Software Advice on the topic of testing the accuracy of predictive models in January 2014 (I think I was first contacted about the interview in December 2013). What I didn’t know was that John Elder and Karl Rexer, two good friends and colleagues in this space, were interviewed as well. The resulting article, "3 Ways to Test the Accuracy of Your Predictive Models," posted on their Plotting Success blog, was well written and generated quite a bit of buzz on Twitter after it was posted.

Prior to the interview, I had no knowledge of Software Advice and after looking at their blog, I understand why: they are clearly a BI blog. But after reading maybe a dozen posts, it is clear that we are siblings, in particular sharing concepts and approaches in big data, data science, staffing and talent acquisition. I've enjoyed going back to the blog. 

The similarities of BI and PA are points I’ve tried to make in talks I’ve given at eMetrics and performance management conferences. After making suitable translations of terms, these two fields can understand each other well. Two sample differences in terminology are described here.

First, one rarely hears the term KPI at a PA conference, but will often hear it at BI conferences. If we use Google as an indicator of the popularity of the term KPI,
  • ' “predictive analytics” KPI' yielded a mere 103,000 hits on Google, whereas
  • ' “business intelligence” KPI' yielded 1,510,000 hits.
In PA, one is more likely to hear these ideas described as metrics, or even as features or derived variables that can be used as inputs to models or as a target variable.

As a second example, a “use case” is frequently presented at BI conferences to explain a reason for creating a particular KPI or analysis. “Use cases” are rarely described at PA conferences; in PA we say “case studies”. Back to Google, we find
  • ' “business intelligence” “use case” ' – 306,000 hits on Google
  • ' “predictive analytics” “use case” ' – 58,800 hits on Google
  • ' “predictive analytics” “case study” ' – 217,000 hits on Google

Interestingly, the top two links for “predictive analytics” “use case” from the search weren’t even predictive analytics use cases or case studies. The second link of the two actually described how predictive analytics is a use case for cloud computing.

The BI community, however, seems to embrace PA and even consider it part of BI (much to the chagrin of the PA community, I would think). According to the Wikipedia entry on BI, the following chart shows topics that are a part of BI:

Interestingly, DM, PA, and even Prescriptive Analytics are considered a part of BI. I must admit, at all the DM and PA conferences I’ve attended, I’ve never heard attendees describe themselves as BI practitioners. I have heard more cross-branding of BI and PA at other conferences that include BI-specific material, like Performance Management and Web Analytics conferences.

Contrast this with the PA Wikipedia page. This taxonomy of fields related to PA is typical. I would personally include dashed lines to Text Mining and maybe even Link Analysis or Social Networks as they are related though not directly under PA. Interestingly, statistics falls under PA here, I’m sure to the chagrin of statisticians! And, I would guess that at a statistics conference, the attendees would not refer to themselves as predictive modelers. But maybe they would consider themselves data scientists! Alas, that’s another topic altogether. But that is the way these kinds of lists go; they are difficult to perfect and usually generate discussion over where the dividing lines occur.

This tendency to include fields as part of “our own” is a trap most of us fall into: we tend to be myopic in our views of the fields of study. It frankly reminds me of a map I remember hanging in my house growing up in Natick, MA: “A Bostonian’s Idea of The United States of America”. Clearly, Cape Cod is far more important than Florida or even California!

Be that as it may, my final point is that BI and PA are important but complementary disciplines. BI is a much larger field and understandably so. PA is more of a specialty, but a specialty that is gaining visibility and recognition as an important skill set to have in any organization. Here’s to further collaboration in the future!


Anonymous said...

This reminds me of when I worked at my last job: the Predictive Analysts were categorized in the Business Analytics department and the BI analysts were under IT. I'm looking for a Master's program but I'm not sure whether I want to pursue a career in Statistics, Predictive Analytics, or Data Mining. Seeing that diagram made me realize I have more research to do. Awesome post!

Unknown said...

Mining of data, in general terms, can be described as retrieving useful information or knowledge for further analysis from various perspectives, and summarizing it into valuable information that can be used to increase revenue, cut costs, or gather competitive information on a business or product.

skschmi said...

This article I think is highlighting a challenge with cross-disciplinary collaboration in general -- the fact is, so much of what everybody does in the world in everyone's various fields of study overlaps quite a bit, but we all like to talk about it in our own way, with our own vocabulary. Learning to overcome this divide is very powerful, because I believe that some of the most creative and innovative developments occur when multiple fields of thought begin to effectively cross-pollinate.

So, to point out the similar ideas in Business Intelligence and Predictive Analytics, as you have done, I hope will help bridge this divide in these areas.

It can be difficult to do, though, since sometimes Field A "uses" concepts from Field B but the people in Field B might somewhat begrudge that fact, considering the application of Field A to be "below" them somehow. The surface level understanding that Field A's people have of Field B might contribute to this, or perhaps the other way around, also.

So here's to changing that attitude -- Let's be willing to admit and accept that even the "middle-school level" version of the topic we are studying can be revolutionary to the "graduate level" research in a different field. A greater communication and awareness of developments in each other's fields can only be a good thing in the end.

Dean Abbott said...

Completely agree with you that each field builds up its own vocabulary and short-hand descriptions for the concepts important to the field. And, generally, there is very little incentive to build up a broader vocabulary because most of the time, practitioners in their own discipline don't communicate to other fields or even have the need to.

The biggest advantage of this kind of cross-disciplinary communication is the cross-pollination of ideas that is so valuable in developing better approaches to solving problems. To cross-pollinate, however, we first have to cross-communicate!

So lifting up the glass....agree! Here's to changing the attitude! :)