Wednesday, December 28, 2011

Models Behaving Badly

I just read a fascinating book review in the Wall Street Journal, "Physics Envy," of the book Models Behaving Badly. The author of the book, Emanuel Derman (former head of Quantitative Analysis at Goldman Sachs), argues that financial models involve human beings and are therefore inherently brittle: as human behavior changes, the models fail. As Derman puts it, "in physics you're playing against God, and He doesn't change His laws very often. In finance, you're playing against God's creatures."

I'll agree with Derman that whenever human beings are in the loop, the data suffer: people change their minds based on information that is not available to the models.

I also agree that human behavioral modeling is not the same as physical modeling. We can use the latter to provide motivation and even mathematics for the former, but we should not take this too far. A simple example: purchase decisions sometimes depend not only on a person's propensity to purchase, but also on whether they had an argument that morning or just watched a great movie. There is an emotional component that the data cannot reflect. People therefore behave in ways that on the surface are contradictory, seemingly "random," which is why response rates of 1% can be "good."

However, I bristle a bit at the emphasis on the physics analogy. In closed systems, models can explain everything. But once one opens up the world, even physical models are imperfect because they often do not incorporate all the information available. For example, missile guidance is based on pure physics: move a control surface on a wing and one changes the trajectory of the missile. There are equations of motion that describe exactly where the missile will go. There is no mystery here.

However, all operational missile guidance systems are "closed loop": the guidance command sequence is not completely scheduled in advance but is updated throughout the flight. Why? To compensate for unexpected effects of the guidance commands, often due to ballistic winds, thermal gradients, or other effects on the physical system. It is the closed-loop corrections that make missile guidance work. The exact same principle applies to your car's cruise control, chasing down a fly ball in baseball, or even just walking down the street.
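To make the closed-loop idea concrete, here is a toy proportional-feedback loop in Python in the spirit of cruise control. It is purely my own illustration (the gain, disturbance, and speeds are made-up numbers, not from the book or the review): an open-loop controller would pick a throttle setting once and hope, while the closed loop keeps measuring the error and correcting for disturbances it never modeled.

```python
import random

def cruise_control(target=65.0, steps=200, gain=0.3):
    """Toy closed-loop control: each cycle, measure the error between the
    target speed and the current speed, apply a proportional correction,
    then let an unmodeled disturbance (a hill, a gust) push back."""
    speed = 55.0
    for _ in range(steps):
        error = target - speed              # measure where we actually are
        speed += gain * error               # closed-loop correction
        speed += random.uniform(-1.0, 1.0)  # the world the model didn't anticipate
    return speed

print(cruise_control())  # hovers near the target despite the disturbances
```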

For a predictive model to be useful long-term, it needs updating to correct for changes in the population it is applied to, whether the model is for customer acquisition, churn, fraud detection, or anything else. The "closed loop" typical in data mining is called "model updating" and is critical for long-term modeling success.

The question then becomes this: can the models be updated quickly enough to compensate for changes in the population? If a missile can only be updated at 10Hz (10x / sec.) but uncertainties affect the trajectory significantly in milliseconds, the closed-loop actions may be insufficient to compensate. If your predictive models can only be updated monthly, but your customer behavior changes significantly on a weekly basis, your models will be perpetually behind. Measuring the effectiveness of model predictions is therefore critical in determining the frequency of model updating necessary in your organization.
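As a rough sketch of what that measurement might look like (my own illustration: the logistic regression, the 2-percentage-point threshold, and the random data are placeholders, not anything prescribed by the book or the review), one can compare the response rate a model expects on a recent batch of customers with the rate actually observed, and trigger a rebuild when the gap grows too large:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def needs_update(model, X_recent, y_recent, threshold=0.02):
    """Compare the response rate the model expects on a recent batch with
    the rate actually observed; a gap beyond `threshold` (an arbitrary
    placeholder) suggests the population has drifted and the model should
    be rebuilt on fresher data."""
    expected = model.predict_proba(X_recent)[:, 1].mean()
    observed = y_recent.mean()
    return abs(expected - observed) > threshold

# Toy usage: fit on "old" behavior, then check against a "recent" batch.
rng = np.random.default_rng(0)
X_old, y_old = rng.normal(size=(1000, 5)), rng.integers(0, 2, 1000)
model = LogisticRegression().fit(X_old, y_old)

X_recent, y_recent = rng.normal(size=(200, 5)), rng.integers(0, 2, 200)
if needs_update(model, X_recent, y_recent):
    model = LogisticRegression().fit(X_recent, y_recent)  # close the loop
```

In practice one would track something closer to lift or response rate by decile, but the principle is the same: observe, compare, and rebuild when the population has drifted.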

To be fair, since I have not yet read the book, I have no quibble with the arguments it actually makes. The thoughts here are based solely on the book review and some ideas it prompted in my mind. I'd welcome comments from anyone who has already read the book.

The book can be found on Amazon here.

UPDATE: Aaron Lai wrote an article for CFA Magazine on the same topic, also quoting Derman. I commend the article to all (note: this is a PDF file download).

2 comments:

Hed said...

Your missile example misses the point. Missile guidance is very complex. There are many examples that are simpler. Physical models for pendulums, roller coasters, and electromagnets are not closed loop. Your example of customer acquisition would fall on this spectrum far beyond missile guidance. There are too few customers to model this on a continuum. A truly accurate model predicting customer responses would have to take into account the neurons in the brains of the individual customers. And yet, such models are designed and used by financial analysts with limited knowledge of mathematics, statistics, or neuroscience. The resulting models for human behavior are a simulacrum of the physical models they emulate. Bottom line: missile guidance systems, as complex as they are, often hit the target. The model you describe for customer behavior, no matter how frequently it is updated, would not come close.

Dean Abbott said...

Hed: thanks for your comment. There is a lot to respond to, so I'll make this a rather lengthy reply. I hope it is clarifying (even if we still disagree!)

Yes, missile guidance is complex. My point is that with missile guidance there is a well-defined set of physical equations to describe behavior (the equations of motion). Yet there is still uncertainty, even for models based on physics (ballistic winds being one example).

In the past, customer behavioral modeling has been based on the result of customer behavior rather than its cause. You are right that we don't know the causal part, which leads to much of the uncertainty here. But there is also information that, while not causal, is more reliable, including how much disposable income a person has (which is why we include income, home value, etc.), whether they have purchased this kind of thing in the past, and whether they are interested in the kinds of products being sold.

What I was trying to tie in here was that this level of uncertainty about what a customer will do is very much like the ballistic winds in a missile guidance problem, only more significant. Any attempt to use physical models directly to model customer behavior should therefore be closed loop, because if there is uncertainty in missile guidance, there is certainly *more* uncertainty in customer behavioral models. So I think in the end we really agree on the core principle (though you clearly disagree with my analogy!)

Accuracy is of course greater for missile guidance (precisely because the uncertainties are much smaller than in behavioral modeling), but how much accuracy is useful certainly depends on the application. A behavioral model can increase response rates from 1% to 2% and be wildly successful, whereas a missile could never be deployed with such abysmal accuracy. But updating models for behavioral modeling, which means (1) run the model, (2) apply the scores, (3) contact the customers selected by the model, (4) observe the results, giving us new target variable values, and then (5) rebuild the models, will undoubtedly improve accuracy if customer behavior changes from what was seen in the past. The missile guidance analogy is this: closed-loop missile guidance will improve accuracy if the environment changes from what was expected when the guidance laws were developed.