Monday, April 25, 2016

Tracking Model Performance Over Time


Most introductory data mining texts include substantial coverage of model testing. Various methods of assessing true model performance (holdout testing, k-fold cross validation, etc.) are usually explained, perhaps with some important variants, such as stratification of the testing samples.

Generally, all of this exposition is aimed at in-time analysis. Model development data may span multiple time periods, but the testing is more or less blind to this: all periods are treated as fair game and mixed together. This is fine for model development. Once predictive models are deployed, however, it is desirable to continue testing to track model performance over time. Models which degrade over time need to be adjusted or replaced.

Subtleties of Testing Over Time

Nearly all production model evaluation is performed with new out-of-time data. As new periods of observed outcomes become available, they are used to calculate running performance measures. As far as it goes, focusing on the actual performance metric makes sense. In my experience, though, some clients become distracted by movement in the independent variables, or in the predicted or actual outcome distributions, in isolation. It is important to understand the dynamic of these changes to fully understand model performance over time.

For the sake of a thought experiment, consider a very simple problem with one independent variable and one target variable, both real numbers. Historically, the distribution of each of these variables has been confined to specific ranges. A predictive model has been constructed as a linear regression which attempts to anticipate the target variable, using only the input of the single independent variable (and a constant). Assume that errors observed in the development data have been small and otherwise unremarkable (they are distributed normally, their magnitude is relatively constant across the range of the independent variable, there is no obvious pattern to them, and so forth).

Once this model is deployed, it is executed on all future cases drawn from the relevant statistical universe, and predictions are saved for further analysis. Likewise, actual outcomes are recorded as they become available. At the conclusion of each future time period, model performance within that period is examined.
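As a rough illustration of this setup, here is a minimal sketch in plain Python. Everything specific in it is an assumption made for the example, not something taken from a real project: the "true" relationship y = 2 + 3x, the historical range of 0 to 10, the noise level, the period labels, and the choice of RMSE as the performance measure.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmse(model, xs, ys):
    """Root mean squared error of the fitted line on one batch of cases."""
    intercept, slope = model
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))

def make_period(rng, n, x_low, x_high, intercept, slope, noise_sd):
    """Simulate one period: uniform x, linear y plus normal noise."""
    xs = [rng.uniform(x_low, x_high) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

rng = random.Random(0)

# Development data (hypothetical): y = 2 + 3x, x confined to [0, 10].
dev_x, dev_y = make_period(rng, 500, 0.0, 10.0, 2.0, 3.0, 1.0)
model = fit_line(dev_x, dev_y)

# After deployment: score each new period separately as outcomes arrive.
period_rmse = {}
for period in ["2016-05", "2016-06", "2016-07"]:  # hypothetical labels
    xs, ys = make_period(rng, 200, 0.0, 10.0, 2.0, 3.0, 1.0)
    period_rmse[period] = rmse(model, xs, ys)
    print(period, round(period_rmse[period], 3))
```

Because nothing changes between development and deployment in this simulation, each period's RMSE should sit close to the noise level. That is the baseline against which later movement would be judged.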

Consider the simplest change to a well-developed model: the distribution of the independent variable remains the same, but the actual outcomes begin to depart from the regression line. Any number of changes could be taking place in the output distribution, but the predicted distribution (the regression line) cannot move since it is entirely defined by the independent variable, which in this case is stable. By definition, model performance is degrading. This circumstance is easy to diagnose: the dynamic linking the target and independent variables is changing, hence a new model is necessary to restore performance.
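This scenario can be simulated in a few lines. The sketch below assumes a deployed model of y = 2 + 3x and a true slope that creeps upward each period while the independent variable keeps its historical range; all of these numbers are hypothetical and chosen only to make the effect visible.

```python
import math
import random

rng = random.Random(1)

def predict(x):
    # Deployed model, assumed to have been fitted earlier as y = 2 + 3x.
    return 2.0 + 3.0 * x

def simulate(n, x_low, x_high, intercept, slope, noise_sd):
    """One period of cases: uniform x, linear y plus normal noise."""
    xs = [rng.uniform(x_low, x_high) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def rmse(xs, ys):
    squared = [(y - predict(x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))

# The independent variable stays in [0, 10]; only the true slope drifts.
results = []
for period, true_slope in enumerate([3.0, 3.3, 3.6, 3.9], start=1):
    xs, ys = simulate(1000, 0.0, 10.0, 2.0, true_slope, 1.0)
    mean_pred = sum(predict(x) for x in xs) / len(xs)
    error = rmse(xs, ys)
    results.append((period, mean_pred, error))
    print(f"period {period}: mean prediction {mean_pred:.2f}, RMSE {error:.2f}")
```

The mean prediction barely moves from period to period, because the inputs are stable, while the error grows steadily. Watching the predicted distribution alone would suggest nothing is wrong.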

What happens, though, when the independent variable begins to migrate? There are two possible effects (in reality, some combination of these extremes is likely): 1. the distribution of actual outcomes shifts to appropriately match the change ("the dots march along the regression line"), or 2. the distribution of actual outcomes does not shift to match the change. In the first case, the model continues to correctly identify the relationship between the target and the independent variable, and model performance will more or less endure. In the second case, reality begins to wander from the model and performance deteriorates. Notice that, in the second case, the actual outcome distribution may or may not change noticeably; either way, the model no longer correctly anticipates reality and needs to be updated.
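The two cases can be contrasted with a small simulation. This is only a sketch under stated assumptions: the deployed model is taken to be y = 2 + 3x, the independent variable is assumed to migrate from the range 0 to 10 up to the range 5 to 15, and in the second case the true outcomes are assumed (hypothetically) to level off beyond the historical range.

```python
import math
import random

rng = random.Random(2)

def predict(x):
    # Deployed model, assumed fitted on historical x in [0, 10] as y = 2 + 3x.
    return 2.0 + 3.0 * x

def truth_follows_line(x):
    # Case 1: the historical relationship also holds in the new range.
    return 2.0 + 3.0 * x

def truth_saturates(x):
    # Case 2 (hypothetical): outcomes level off beyond the historical range.
    return 2.0 + 3.0 * min(x, 10.0)

def period_rmse(truth, x_low, x_high, n=1000, noise_sd=1.0):
    """Simulate one period under the given true relationship and score it."""
    xs = [rng.uniform(x_low, x_high) for _ in range(n)]
    ys = [truth(x) + rng.gauss(0.0, noise_sd) for x in xs]
    squared = [(y - predict(x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / n)

# The independent variable migrates from [0, 10] to [5, 15] in both cases.
rmse_case_1 = period_rmse(truth_follows_line, 5.0, 15.0)
rmse_case_2 = period_rmse(truth_saturates, 5.0, 15.0)
print(f"case 1 (outcomes follow the line): RMSE {rmse_case_1:.2f}")
print(f"case 2 (outcomes do not follow):   RMSE {rmse_case_2:.2f}")
```

The input migration is identical in both runs, yet only the second degrades the model. The shift in the independent variable is therefore not, by itself, evidence of a problem; the error measure is what distinguishes the two cases.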


The example used here was deliberately chosen to be simple, for illustration's sake. Qualitatively, though, the same basic behaviors are exhibited by much more complex models. Models featuring multiple independent variables or employing complex transformations (neural networks, decision trees, etc.) obey the same fundamental dynamic. Given the sensitivity of nonlinear models to each of their independent variables, a migration in even one of them may provoke the changes described above. Consideration of the components of this interplay in isolation only serves to confuse: changes over time can only be understood as part of the larger whole.


Anonymous said...

I think it makes perfect sense. Model training should be an ongoing process in a rapidly changing world. Just like how humans learn, we learn new things every day and that's how we progress. What would it be like if we decided to learn all the things we need for maybe 22 years and then, once we graduate from college, refused to learn anything new and always made decisions based on the things we learned in those first 22 years? We'd probably fail miserably in life!
However, it's a non-trivial task for models to continue learning over time. Training excessively could make the model overfit and increase its variance. I think a good approach could be that once we detect a performance drop and a change of variables in the real world, we could create a new model which takes the old model's parameters and learned attributes into account.
