Thursday, September 25, 2008

KDD 2008

It's hard to believe that KDD 2008 was the first KDD I've attended in seven years. It was striking how much has changed in that time, and that was one of the primary reasons I attended this past year--to see for myself if the reports I've heard are true. Sure enough, they are.

These reports, primarily from colleagues in industry, were that KDD didn't have anything they could "take home and use". Many of these folks are analysts who are decidedly not academic, so I thought I had a sense for what they meant.

I found their reports hit the mark. Seven years ago I was able to find (1) significant numbers of industry personnel at the conference and (2) many talks that were accessible enough for non-academics to understand. This time around, few of the industry practitioners I met were not PhDs. That's not to say there weren't interesting talks. Two that I didn't see in person but read later were the Elkan paper on learning from positive and unlabelled examples and the Grossman paper on Data Clouds. Thought-provoking both. The lunch talk by Trevor Hastie on regularization was very interesting, but it was geared toward those who can digest his textbook (which is among the finest data mining / statistical learning texts out there).

Social networking was such a dominant theme of the conference that it deserves a separate post.

Lastly, the decline in participation by the business community was nowhere more evident than in the vendors' room. Only a few data mining software vendors were there, which indicates to me that the conference isn't viewed as a place to increase sales: if I remember correctly, only Microsoft, Oracle, StatSoft, Salford Systems, and SAS were present. A quick look at the KDnuggets software survey shows who wasn't there.

So it seems that KDD has wandered from a business/academic mix to a more academic conference, which is, of course, the prerogative of the organizers. I'm still searching for a great conference for the data mining practitioner who has the level of understanding of data mining to read and absorb a book like the Witten/Frank machine learning book but desires a more practical approach to the subject.


Tim Manns said...

I'd definitely like to hear more of your thoughts about the social network analysis mentioned at KDD 2008.

I’ve been doing some basic work examining all our mobile customer base communications and I expect big things from it :)
See my blog for some info (I can’t discuss too much for competitive advantage).

I’ve just bought the two books you recommended. I’m hoping they have some good practical examples/discussion!


Tim Manns

Anonymous said...

I've never been to KDD; however, I was hoping to attend next year in Paris. Industry focus and practical techniques I can use in my data mining work are really what I'm interested in...

Anonymous said...

Conferences are inevitably taken over by academics. Their score functions are improved by going; they have to write papers to thrive. Industrial practitioners, on the other hand, often have to jump extra hurdles to present. ("What are you going to say that will erase some of our competitive advantage?")

I'd like to fight this tide. I'd very much like the next KDD (in Paris) to be more useful to industry. Both of us general chairs are from industry and signed on to the job with that in mind. But we could use your suggestions, Dean, and friends. What would help make it more valuable to you?
-John Elder

Dean Abbott said...


Thanks for jumping in. I think as a result of this and other comments I have received, it's time to accumulate some information. A post to follow...