Saturday, April 21, 2007

Data Mining Methods Poll

The latest KDNuggets poll on data mining methods has some interesting results. Decision Trees came out on top, followed by Clustering and Regression.

A couple of observations...
1) Ensembles (Bagging and Boosting) went up. The sample size is too small to support any inferences, but this will be interesting to track over time.
2) SVMs and Neural Networks are at about the same level, though SVM usage dropped from 2006. I do wonder if SVMs will surpass neural networks as the "complex way to model accurately", but the jury is still out on this.

1 comment:

Will Dwinnell said...

Such polls, though very interesting, tend to measure the proportion of data miners who use this method or that. As an alternative, it would be interesting to explore:

- The proportion of fielded predictive models constructed using specific data mining techniques

- The proportion of money spent on fielded predictive models, by data mining technique

- The proportion of money (assets, revenue, profit, etc.) under the control or guidance of fielded predictive models, by data mining technique