Sunday, May 6, 2007

DM is a process!

CRISP-DM: The Six Phases of Data Mining
From this diagram, we can see that DM is a complex, iterative, and often costly process. You can't just purchase some data mining software, install it, sit back, and watch it solve all your problems. It's impossible! Data mining is not magic.
Without skilled human supervision, blind use of data mining software will only provide you with the wrong answer to the wrong question applied to the wrong type of data. The wrong analysis is worse than no analysis, since it leads to policy recommendations that will probably turn out to be expensive failures.

3 comments:

Will Dwinnell said...

I know some data miners who subscribe specifically to CRISP, and others who do not. For my part, I think it is more important that certain issues be addressed than any particular methodology (CRISP, etc.) be applied.

I certainly agree that data mining is a process. Clear definition of the goals of data mining, rigorous verification of its results and ongoing monitoring of model performance are critical. It is surprising how often these basic items are ignored in practice in industry.

Wang Ding said...

Thanks for your comments. As you said, clear definition of the goals of data mining, rigorous verification of its results, and ongoing monitoring of model performance are critical to the success of DM.

Sandro Saitta said...

I have just found your new blog. It is nice to have some fresh air in the data mining blog community. Some discussion about the question of "automated data mining" can be found on dataminingblog as well, if you want opinions that differ from the one in your post.