Thursday, January 11, 2007

Will the term "Data Mining" survive?

I used to argue that data mining as a field would survive because it is tied so closely to the bottom line: CFOs and stakeholders are involved with data mining applications, and therefore the field would avoid the hype that crippled neural networks, AI, and earlier pattern-recognition-like technologies. Those technologies achieved a buzzword status that unfortunately outpaced their successful practical applications.

However, it appears that the term data mining is being tied more and more to the process of collecting data from multiple sources (and the subsequent analysis of that data), such as here and here and here. I try to argue with critics that the real problem is not the algorithms but the combining of the data sets to begin with. Once the data is joined, there is a possible privacy concern whether you use data mining, OLAP, or just simple Excel reports. Data mining per se has little to do with this; it can only be used to describe what data is there.

However, the balance may be tipping. Data mining (whether related to government programs or internet cookies) has become the term associated with all that is bad about combining personal information sources, so much so that its days, I think, are numbered. Maybe it's time to move on to the next term or phrase, and then the next phrase, and so on, and so on, and so on...

3 comments:

  1. Data mining might have a geeky connotation associated with it, and might be associated with the bad things people sometimes do with data. I've found that one can be more effective by re-branding the term as something that is culturally appropriate within the organization one works for. Same work --- different name for it.

  2. I fully agree with both points: data mining can have a geeky connotation, and one should use whichever label best communicates what is being done with it.

    Economists often associate data mining with "data dredging", and statisticians often view data mining as nothing more than a fishing expedition by analysts who don't know better! If the results are emphasized rather than the label, most customers I work with are happier.

  3. It seems that the evil/oppressive connotations of data mining are localised to America. The popular notion that "data mining == intrusive surveillance" has yet to take hold in other parts of the Anglosphere.

    I'd suggest that outside of the US, the phrase - where it is understood at all - is just as likely to be applied to scientific and commercial applications, e.g. SETI or drug discovery.
