Wednesday, December 27, 2006

Two Book Recommendations

In my data mining courses, there are two books I always recommend to attendees who are new to data mining. The first is Data Preparation for Data Mining by Dorian Pyle. I like this book because data preparation is usually the most time-consuming step in the data mining process, and it is the only book I know of that is written entirely for the purpose of data preparation (the second hit in the Amazon list I linked is a book on data prep for SAS, but that one is SAS-specific).

The second book, which I recommend for the analyst who is not a statistician, is Data Mining: Practical Machine Learning Tools and Techniques by Witten and Frank. They do a great job of describing algorithms and techniques in data mining in an intuitive way; there are few equations and derivations to cloud the issues for non-mathematicians. The biggest critique I have is that there is no description of neural networks, one of the key algorithms in data mining software packages. But that doesn't dampen my enthusiasm for the book. (If you would like a good, free description of neural networks, go to the SAS Neural Network FAQ.)


