I’ve been reminded recently of the overlap
between business intelligence and predictive analytics. Of course any reader of
this blog (or at least the title of the blog) knows I live in the world of data
mining (DM) and predictive analytics (PA), not the world of business
intelligence (BI). In general, I don’t make comments about BI because I am an
outsider looking in. Nevertheless, I view BI as a sibling to PA because we
share so much in common: we use the same data, often use similar metrics and
even sometimes use the same tools in our analyses.
I was interviewed by Victoria Garment of
Software Advice on the topic of testing accuracy of predictive models in
January, 2014 (I think I was first contacted about the interview in December,
2013). What I didn’t know was that John Elder and Karl Rexer, two good friends
and colleagues in this space, were interviewed as well. The resulting article, "3 Ways to Test the Accuracy of Your Predictive Models", posted on their Plotting Success blog, was well written and generated quite a bit of buzz on Twitter.
Prior to the interview, I had no knowledge of
Software Advice and after looking at their blog, I understand why: they are
clearly a BI blog. But after reading maybe a dozen posts, it is clear that we
are siblings, in particular sharing concepts and approaches in big data, data
science, staffing and talent acquisition. I've enjoyed going back to the blog.
The similarities of BI and PA are points I’ve tried to make in
talks I’ve given at eMetrics and performance management conferences. After
making suitable translations of terms, these two fields can understand each
other well. Two sample differences in terminology are described here.
First, one rarely hears the term KPI at a PA
conference, but will often hear it at BI conferences. If we use Google as an
indicator of popularity of the term KPI,
- ' “predictive analytics” KPI' yielded a mere 103,000 hits on Google, whereas
- ' “business intelligence” KPI' yielded 1,510,000 hits.
In PA, one is more likely to hear these ideas described as metrics, or even as features or derived variables that can be used as inputs to models or as a target variable.
As a second example, a “use case” is frequently
presented in BI conferences to explain a reason for creating a particular KPI or analysis. “Use
Cases” are rarely described in PA conferences; in PA we say “case studies”.
Back to Google, we find
- ' “business intelligence” “use case” ' – 306,000 hits on Google
- ' “predictive analytics” “use case” ' – 58,800 hits on Google
- ' “predictive analytics” “case study” ' – 217,000 hits on Google
Interestingly, the top two links for
“predictive analytics” “use case” from the search weren’t even predictive
analytics use cases or case studies. The second link of the two actually described how
predictive analytics is a use case for cloud computing.
The BI community, however, seems to embrace PA
and even consider it part of BI (much to the chagrin of the PA community, I
would think). According to the Wikipedia entry on BI, the following chart shows
topics that are a part of BI:
Interestingly, DM, PA, and even Prescriptive
Analytics are considered a part of BI. I must admit, at all the DM and PA
conferences I’ve attended, I’ve never heard attendees describe themselves as BI
practitioners. I have heard more cross-branding of BI and PA at other
conferences that include BI-specific material, like Performance Management and
Web Analytics conferences.
Contrast this with the PA Wikipedia page. This
taxonomy of fields related to PA is typical. I would personally include dashed
lines to Text Mining and maybe even Link Analysis or Social Networks as they are related
though not directly under PA. Interestingly, statistics falls under PA here,
I’m sure to the chagrin of statisticians! And, I would guess that at a
statistics conference, the attendees would not refer to themselves as
predictive modelers. But maybe they would consider themselves data scientists! Alas, that’s another topic
altogether. But that is the way these kinds of lists go; they are difficult to perfect and usually generate discussion over where the dividing lines occur.
This tendency to include fields as part of “our
own” is a trap most of us fall into: we tend to be myopic in our views of the
fields of study. It frankly reminds me of a map I remember hanging in my house
growing up in Natick, MA: “A Bostonian’s Idea of The United States of America”. Clearly, Cape Cod is far more important than Florida or even California!
Be that as it may, my final point is that BI and PA
are important but complementary disciplines. BI is a much larger field and
understandably so. PA is more of a specialty, but a specialty that is gaining
visibility and recognition as an important skill set to have in any
organization. Here’s to further collaboration in the future!
This reminds me of my last job, where the Predictive Analysts were categorized in the Business Analytics department and the BI analysts were under IT. I'm looking for a Masters program but I'm not sure whether I want to pursue a career in Statistics, Predictive Analytics, or Data Mining. Seeing that diagram made me realize I have more research to do. Awesome post!
In general terms, data mining can be described as retrieving useful information or knowledge from data, analyzing it from various perspectives, and summarizing it into valuable information that can be used to increase revenue, cut costs, or gather competitive information on a business or product.
This article I think is highlighting a challenge with cross-disciplinary collaboration in general -- the fact is, so much of what everybody does in the world in everyone's various fields of study overlaps quite a bit, but we all like to talk about it in our own way, with our own vocabulary. Learning to overcome this divide is very powerful, because I believe that some of the most creative and innovative developments occur when multiple fields of thought begin to effectively cross-pollinate.
So pointing out the similar ideas in Business Intelligence and Predictive Analytics, as you have done, will I hope help bridge this divide in these areas.
It can be difficult to do, though, since sometimes Field A "uses" concepts from Field B but the people in Field B might somewhat begrudge that fact, considering the application of Field A to be "below" them somehow. The surface level understanding that Field A's people have of Field B might contribute to this, or perhaps the other way around, also.
So here's to changing that attitude -- Let's be willing to admit and accept that even the "middle-school level" version of the topic we are studying can be revolutionary to the "graduate level" research in a different field. A greater communication and awareness of developments in each other's fields can only be a good thing in the end.
Completely agree with you that each field builds up its own vocabulary and short-hand descriptions for the concepts important to the field. And, generally, there is very little incentive to build up a broader vocabulary because most of the time, practitioners in their own discipline don't communicate to other fields or even have the need to.
The biggest advantage of this kind of cross-disciplinary communication is the cross-pollination of ideas that is so valuable in developing better approaches to solving problems. To cross-pollinate, however, we first have to cross-communicate!
So lifting up the glass....agree! Here's to changing the attitude! :)