Monday, April 25, 2016

Tracking Model Performance Over Time

Context

Most introductory data mining texts include substantial coverage of model testing. They usually explain various methods of assessing true model performance (holdout testing, k-fold cross-validation, etc.), perhaps with some important variants, such as stratification of the testing samples.

Generally, all of this exposition is aimed at in-time analysis. Model development data may span multiple time periods, but the testing is more or less blind to this: all periods are treated as fair game and mixed together. This is fine for model development. Once predictive models are deployed, however, it is desirable to continue testing to track model performance over time. Models which degrade over time need to be adjusted or replaced.


Subtleties of Testing Over Time

Nearly all production model evaluation is performed with new out-of-time data. As new periods of observed outcomes become available, they are used to calculate running performance measures. As far as it goes, focusing on the actual performance metric makes sense. In my experience, though, some clients become distracted by movement in the independent variables, or in the predicted or actual outcome distributions, viewed in isolation. It is important to understand the dynamic of these changes to fully understand model performance over time.

For the sake of a thought experiment, consider a very simple problem with one independent variable and one target variable, both real numbers. Historically, the distribution of each of these variables has been confined to specific ranges. A predictive model has been constructed as a linear regression which attempts to anticipate the target variable, using only the input of the single independent variable (and a constant). Assume that errors observed in the development data have been small and otherwise unremarkable (they are distributed normally, their magnitude is relatively constant across the range of the independent variable, there is no obvious pattern to them, and so forth).

Once this model is deployed, it is executed on all future cases drawn from the relevant statistical universe, and predictions are saved for further analysis. Likewise, actual outcomes are recorded as they become available. At the conclusion of each future time period, model performance within that period is examined.
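The period-by-period bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated, the "true" relationship (y = 2 + 3x plus unit-variance noise), the period labels, and the helper names fit_line and rmse_by_period are all invented for the example, and RMSE is just one of many performance measures that could be tracked.

```python
import math
import random
from collections import defaultdict

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse_by_period(records, intercept, slope):
    """records: iterable of (period, x, actual). Returns {period: RMSE}."""
    squared_errors = defaultdict(list)
    for period, x, actual in records:
        prediction = intercept + slope * x
        squared_errors[period].append((actual - prediction) ** 2)
    return {p: math.sqrt(sum(v) / len(v)) for p, v in sorted(squared_errors.items())}

rng = random.Random(0)

# Hypothetical development data: y = 2 + 3x + noise (assumed for illustration).
dev_x = [rng.uniform(0, 10) for _ in range(500)]
dev_y = [2 + 3 * x + rng.gauss(0, 1) for x in dev_x]
intercept, slope = fit_line(dev_x, dev_y)

# Simulated post-deployment cases, logged with the period in which they occurred.
records = []
for period in ("2016-05", "2016-06", "2016-07"):
    for _ in range(300):
        x = rng.uniform(0, 10)
        records.append((period, x, 2 + 3 * x + rng.gauss(0, 1)))

period_rmse = rmse_by_period(records, intercept, slope)
for period, value in period_rmse.items():
    print(period, round(value, 3))
```

Because nothing changes in this simulation, the per-period error should hover near the noise level seen in development; a sustained rise would be the signal to investigate.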

Consider the simplest change to a well-developed model: the distribution of the independent variable remains the same, but the actual outcomes begin to depart from the regression line. Any number of changes could be taking place in the output distribution, but the predicted distribution (the regression line) cannot move since it is entirely defined by the independent variable, which in this case is stable. By definition, model performance is degrading. This circumstance is easy to diagnose: the dynamic linking the target and independent variables is changing, hence a new model is necessary to restore performance.
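A small simulation may make this case concrete. It is a sketch, not a recipe: the original relationship (y = 2 + 3x), the changed one (y = 2 + 2x), and all names are assumptions made up for the example. To hold the input distribution exactly fixed, the same simulated inputs are scored against two alternative sets of outcomes, so the predictions are identical and only the actuals differ.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse(actuals, predictions):
    """Root mean squared error between two equal-length sequences."""
    total = sum((a - p) ** 2 for a, p in zip(actuals, predictions))
    return math.sqrt(total / len(actuals))

rng = random.Random(1)

# Hypothetical development data: y = 2 + 3x + noise (assumed for illustration).
dev_x = [rng.uniform(0, 10) for _ in range(500)]
dev_y = [2 + 3 * x + rng.gauss(0, 1) for x in dev_x]
intercept, slope = fit_line(dev_x, dev_y)

# New cases drawn from the same input range as development.
new_x = [rng.uniform(0, 10) for _ in range(500)]
predictions = [intercept + slope * x for x in new_x]

# Outcomes if the old relationship still holds...
stable_actuals = [2 + 3 * x + rng.gauss(0, 1) for x in new_x]
# ...and outcomes if the relationship itself has changed (slope 3 -> 2).
drifted_actuals = [2 + 2 * x + rng.gauss(0, 1) for x in new_x]

rmse_stable = rmse(stable_actuals, predictions)
rmse_drifted = rmse(drifted_actuals, predictions)
print("RMSE, relationship unchanged:", round(rmse_stable, 3))
print("RMSE, relationship changed:  ", round(rmse_drifted, 3))
```

The predictions are the same list in both comparisons, so any difference in error comes entirely from the outcomes leaving the line.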

What happens, though, when the independent variable begins to migrate? There are two possible effects (in reality, some combination of these extremes is likely): 1. The distribution of actual outcomes shifts to appropriately match the change ("the dots march along the regression line"), or 2. The distribution of actual outcomes does not shift to match the change. In the first case, the model continues to correctly identify the relationship between the target and the independent variable, and model performance will more or less endure. In the second case, reality begins to wander from the model and performance deteriorates. Notice that, in the second case, the actual outcome distribution may or may not change noticeably; either way, the model no longer correctly anticipates reality and needs to be updated.
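The two extremes can be simulated side by side. Again, this is a hedged sketch with invented numbers: inputs are assumed to migrate from the range 0 to 10 up to the range 5 to 15, and in the second case the outcomes are generated as if the inputs had never moved, which is only one of many ways outcomes might fail to follow the line.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse(actuals, predictions):
    """Root mean squared error between two equal-length sequences."""
    total = sum((a - p) ** 2 for a, p in zip(actuals, predictions))
    return math.sqrt(total / len(actuals))

rng = random.Random(2)

# Hypothetical development data: x in [0, 10], y = 2 + 3x + noise.
dev_x = [rng.uniform(0, 10) for _ in range(500)]
dev_y = [2 + 3 * x + rng.gauss(0, 1) for x in dev_x]
intercept, slope = fit_line(dev_x, dev_y)

# The independent variable migrates upward by 5 units (assumed shift).
shift = 5
new_x = [rng.uniform(0 + shift, 10 + shift) for _ in range(500)]
predictions = [intercept + slope * x for x in new_x]

# Case 1: outcomes follow the inputs along the original line.
follow_actuals = [2 + 3 * x + rng.gauss(0, 1) for x in new_x]
# Case 2: outcomes stay where they were, as if the inputs had not moved.
stay_actuals = [2 + 3 * (x - shift) + rng.gauss(0, 1) for x in new_x]

rmse_follow = rmse(follow_actuals, predictions)
rmse_stay = rmse(stay_actuals, predictions)
mean_dev_y = sum(dev_y) / len(dev_y)
mean_stay_y = sum(stay_actuals) / len(stay_actuals)

print("Case 1 RMSE (outcomes follow the line):", round(rmse_follow, 3))
print("Case 2 RMSE (outcomes do not follow):  ", round(rmse_stay, 3))
print("Mean outcome, development vs. case 2:  ",
      round(mean_dev_y, 2), round(mean_stay_y, 2))
```

In this setup the case 2 outcome distribution looks much like the development outcomes, yet the error is large, which is why watching the outcome distribution alone would miss the problem.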


Conclusion

The example used here was deliberately chosen to be simple, for illustration's sake. Qualitatively, though, the same basic behaviors are exhibited by much more complex models. Models featuring multiple independent variables or employing complex transformations (neural networks, decision trees, etc.) obey the same fundamental dynamic. Given the sensitivity of nonlinear models to each of their independent variables, a migration in even one of them may provoke the changes described above. Consideration of the components of this interplay in isolation only serves to confuse: changes over time can only be understood as part of the larger whole.
