Conventional wisdom says that predictive modelers
need to have an academic background in statistics, mathematics, computer
science, or engineering. A degree in one of these fields is best, but without a
degree, one should at a minimum have taken statistics or mathematics
courses. Historically, one could not get a degree in predictive analytics, data
mining, or machine learning.
This has changed, however, and dozens of
universities now offer master’s degrees in predictive analytics. Additionally,
there are many variants of analytics degrees, including master’s degrees in
data mining, marketing analytics, business analytics, or machine learning. Some
programs even include a practicum so that students can learn to apply textbook
science to real-world problems.
One reason real-world experience is so critical
for predictive modeling is that the science has tremendous limitations. Most
real-world problems have data problems never encountered in the textbooks. The
ways in which data can go wrong are seemingly endless; building the same
customer acquisition models even within the same domain requires different
approaches to data preparation, missing value imputation, feature creation, and
even modeling methods.
However, the principles of how one can solve
data problems are not endless; the experience of building models for several
years will prepare modelers to at least be able to identify when potential
problems may arise.
Surveys of top-notch predictive modelers reveal a
mixed story, however. While many have a science, statistics, or mathematics
background, many do not. Many have backgrounds in social science or humanities.
How can this be?
Consider a retail example. The retailer Target was
building predictive models to identify likely purchase behavior and to incentivize
future behavior with relevant offers. Andrew Pole, a Senior Manager of
Media and Database Marketing, described how the company went about building
systems of predictive models at the Predictive Analytics World Conference in
2010. Pole described the importance of a combination of domain knowledge,
knowledge of predictive modeling, and most of all, a forensic mindset in
successful modeling of what he calls a “guest portrait.”
The company developed a model to predict whether a female
customer was pregnant. The analysts noticed patterns of purchase behavior that Pole
called “nesting” behavior. For example, women were purchasing cribs on average
90 days before the due date. Pole also observed that some products were
purchased at regular intervals prior to a woman’s due date. The company also
observed that if they were able to acquire these women as purchasers of other
products during the time before the birth of their baby, Target was able to
significantly increase customer value; these women would continue to
purchase from Target after the baby was born based on their purchase behavior
before.
The key descriptive terms are “observed” and
“noticed.” This means the models were not built as black boxes. The
analysts asked, “does this make sense?” and leveraged insights gained from the
patterns found in the data to produce better predictive models. It undoubtedly
was iterative; as they “noticed” patterns, they were prompted to consider
other patterns they had not explicitly considered before (and that maybe had not
even occurred to them before). This forensic mindset of analysts, noticing
interesting patterns and making connections between those patterns and how the
models could be used, is critical to successful modeling. It is rare that
predictive models can be fully defined before a project begins, or that modelers can anticipate
all of the most important patterns the model will find. So we shouldn’t be
surprised that we will be surprised, or, put another way, we should expect
to be surprised.
This kind of mindset is not learned in a university
program; it is part of the personality of the individual. Good predictive
modelers need to have a forensic mindset and intellectual curiosity, whether or
not they understand the mathematics well enough to derive the equations for linear
regression.
(This post first appeared in the Predictive Analytics Times)