An Introduction to Data Mining
Kurt Thearling

Outline
— Overview of data mining
— What is data mining?
— Predictive models and data scoring
— Real-world issues
— Gentle discussion of the core algorithms and processes
— Commercial data mining software applications
— Who are the players?
— Review the leading data mining applications
— Presentation & Understanding
— Data visualization: More than eye candy
— Build trust in analytic results
Resources
— Good overview book:
— Data Mining Techniques by Michael Berry and Gordon Linoff
— Web:
— My web site (recommended books, useful links, white papers, …)
>
— Knowledge Discovery Nuggets
>
— DataMine Mailing List
— majordomo@
— send message “subscribe datamine-l”
A Problem...
— You are a marketing manager for a company
— Problem: Churn is too high
> Turnover (after the six-month introductory period ends) is 40%
— Customers receive incentives (average cost: $160) when an account is opened
— Giving new incentives to everyone who might leave is very expensive (as well as wasteful)
— Bringing back a customer after they leave is both difficult and costly
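
To make the cost of blanket incentives concrete, here is a minimal arithmetic sketch. The 40% churn rate and the $160 average incentive cost come from the slide. The cohort size of 1,000 customers, and the idea that a retention incentive would cost about the same as the account-opening incentive, are assumptions made only for illustration.

```python
# Figures taken from the slide
CHURN_RATE = 0.40        # turnover after the introductory period
INCENTIVE_COST = 160.0   # average cost of the account-opening incentive, in dollars

# Assumption for illustration only (not from the slide)
N_CUSTOMERS = 1000       # hypothetical cohort size


def blanket_incentive_cost(n_customers, churn_rate, cost_per_offer):
    """Return (total cost, cost spent on customers who would have stayed anyway).

    Assumes every customer gets one offer costing `cost_per_offer`.
    """
    total = n_customers * cost_per_offer
    stayers = n_customers * (1.0 - churn_rate)
    wasted = stayers * cost_per_offer
    return total, wasted


total, wasted = blanket_incentive_cost(N_CUSTOMERS, CHURN_RATE, INCENTIVE_COST)
print(f"Total cost of offering everyone an incentive: ${total:,.0f}")
print(f"Spent on customers who would have stayed:     ${wasted:,.0f}")
```

Under these assumptions, more than half of the spend goes to customers who were never going to leave, which is the waste the slide refers to.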
… A Solution
— One month before the introductory period ends, predict which customers will leave
— If you want to keep a customer who is predicted to churn, offer them something based on their predicted value
> The ones that are not predicted to churn need no attention
— If you don’t want to keep the customer, do nothing
— How can you predict future behavior?
— Tarot Cards
— Magic 8 Ball
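
The targeting rule on this slide can be sketched as a small decision function. This is only an illustration of the logic, not the author's method: the `Customer` fields, the `choose_action` function, and the threshold, minimum value, and offer fraction are all hypothetical, and the churn probability is assumed to come from some predictive model that is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    churn_probability: float  # output of some predictive model (hypothetical)
    predicted_value: float    # predicted future value in dollars (hypothetical)


def choose_action(customer, churn_threshold=0.5, min_value=200.0, offer_fraction=0.25):
    """Decide what to do with one customer a month before the intro period ends.

    All three parameters are illustrative assumptions, not values from the slides.
    """
    if customer.churn_probability < churn_threshold:
        return ("no_action", 0.0)   # not predicted to churn: needs no attention
    if customer.predicted_value < min_value:
        return ("no_action", 0.0)   # predicted to churn, but not worth keeping
    # Predicted to churn and worth keeping: size the offer by predicted value
    return ("offer", round(customer.predicted_value * offer_fraction, 2))


customers = [
    Customer("A", churn_probability=0.10, predicted_value=900.0),
    Customer("B", churn_probability=0.80, predicted_value=1200.0),
    Customer("C", churn_probability=0.75, predicted_value=50.0),
]
decisions = {c.customer_id: choose_action(c) for c in customers}
for cid, (action, amount) in decisions.items():
    print(cid, action, amount)
```

Only customer B receives an offer here: A is not predicted to churn, and C is predicted to churn but falls below the assumed minimum value.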
The Big Picture
— Lots of hype & misinformation about data mining out there
— Data mining is part of a much larger process
— 10% of 10% of 10% of 10%
— Accuracy is not always the most important measure of data mining
— The data itself is critical
— Algorithms aren’t as important as some people think
— If you can’t understand the patterns discovered with data
mining,