Wednesday, October 17, 2007

Statistics: Why Do So Many Hate It?

In "Why is Statistics So Scary?", the Sep-26-2007 posting to the Math Stats And Data Mining Web log, the author wonders why so many people exhibit negative reactions to statistics.

I've had occasion to wonder about the same thing. I make my living largely from statistics, and have frequently received unfavorable reactions when I explain my work to others. Invariably, such respondents admit the great usefulness of statistics, so that is not the source of this negativity. I am certain that individual natural aptitude for this sort of work varies, but I do not believe that this accounts for the majority of negative feelings towards statistics.

Having received formal education in what I call "traditional" or "classical" statistics, and having since assisted others studying statistics in the same context, I suggest that one major impediment for many people is the total reliance by classical statisticians on a large set of very narrowly focused techniques. While they serve admirably in many situations, it is worth noting the disadvantages of classical statistical techniques:

1. Because they are so highly specialized, there are many of these techniques to remember.

2. It is also necessary to remember the appropriate applications of these techniques.

3. Broadly, classical statistics involves many assumptions. Violation of said assumptions may invalidate the results of these techniques.

Classical techniques were developed largely during a time without the benefit of rapid, inexpensive computation, which is very different from the environment we enjoy today.

The above were major motivations for me to embrace newer analytical methods (data mining, bootstrapping, etc.) in my professional life. Admittedly, newer methods have disadvantages of their own (not the least of which is their hunger for data), but it's been my experience that newer methods tend to be easier to understand, more broadly applicable and, consequently, simpler to apply.
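To illustrate how little machinery a method like bootstrapping needs, here is a minimal sketch of a percentile bootstrap confidence interval in Python. The function name bootstrap_ci, the sample values, and the default settings are all hypothetical, chosen only for illustration; this is one simple variant of the bootstrap, not a definitive implementation.

```python
import random
import statistics


def bootstrap_ci(data, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic.

    data        -- list of observations (hypothetical example data below)
    stat        -- any function mapping a list of numbers to a single number
    n_resamples -- number of bootstrap resamples (an illustrative default)
    alpha       -- 1 - confidence level (0.05 gives a nominal 95% interval)
    seed        -- fixed seed so the sketch is reproducible
    """
    rng = random.Random(seed)
    n = len(data)
    # Resample the data with replacement and recompute the statistic each time.
    estimates = sorted(
        stat(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Read the interval endpoints off the sorted bootstrap estimates.
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical sample, used only to show the call pattern.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]

mean_ci = bootstrap_ci(sample, statistics.mean)
median_ci = bootstrap_ci(sample, statistics.median)
print("mean CI:  ", mean_ci)
print("median CI:", median_ci)
```

The same function handles the mean, the median, or any other statistic passed in as stat, which is the sense in which one general tool can stand in for several problem-specific procedures. The usual caveats still apply: the sample must be reasonably representative, and very small samples give unreliable intervals.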

I think the broader educational question is: Would students be better served by one or more years of torture, imperfectly or incorrectly learning myriad methods which will soon be forgotten, or the provision of a few widely useful tools and an elemental-level of understanding?


Anonymous said...

Here are a few more disadvantages I've come across when training others in statistics.

4. Statistical concepts or assumptions can be counter-intuitive or difficult to grasp.
5. Statistics books rarely give a broad overview or layman's description of a statistical technique before going into the detail.
6. How a statistic works can often be conveyed much better through visual interaction rather than a statistics book (e.g.
7. Many statistics books assume readers can follow the logic of algebraic transformations and do not provide fully worked examples that help students to understand the logic.
8. Some statisticians use technical jargon that makes their descriptions sound much more difficult than they really are (though all professions are guilty of this!). Throughout history many professions have kept their knowledge hard to access to protect their monopoly on the power of their knowledge (e.g., guilds, The Magic Circle, etc.).
9. Not all software explains statistics or the diagnostics in the output very well (SPSS is quite good though).

I agree with your "few widely useful tools and an elemental-level of understanding" proposal. My ideal training resource would contain all statistics and data mining concepts with a basic description, underlying algebra, fully worked examples where possible and a visualisation. It would also provide basic and advanced information on each concept allowing you to learn or teach to the required depth.

I know this is a lot to ask! This is the closest on the web I've found to my ideal: but in practice I use a variety of resources to meet these needs.

Will Dwinnell said...

Let me be clear about this: I am not suggesting "dumbing down" statistics education. My point is that I'd rather explain bootstrapping to a novice than the 17 problem-specific classical procedures it replaces.

Anonymous said...

Will, hope I didn't give the wrong idea.

I guess your main point is that data mining techniques can give us quick and robust answers without having to choose from many classical statistics for which it is highly desirable to understand the method and assumptions well?

I was offering further explanations, from my experience of teaching, as to why many hate statistics. Often there is a frustration with teaching materials that don't cover clear basic explanations all the way up to the advanced. I didn't want to suggest dumbing down - instead I would like statistics to be more accessible.

All the best.


Will Dwinnell said...

No, I didn't take your comments that way, but after re-reading my post, I decided to make clear what you mention: that I think it's preferable to move to more convenient tools, rather than "dumb down" statistics education.

Thanks very much for your contributions!

daniel john said...

Pretty good post. I just stumbled upon your blog and wanted to say that I have really enjoyed reading your blog posts. Any way I'll be subscribing to your feed and I hope you post again soon.
