Document: 数据挖掘4.ppt (Data Mining 4), format: PPT, size: 751 KB, 17 pages
Uploaded by: wz_198613, 2018/7/28

Chapter 4
Primitives for Data Mining
7/28/2018
Why data mining primitives?
Can we expect a data mining system to autonomously mine all of the valuable knowledge?
such a system may generate an overwhelmingly large set of patterns
most of the mined patterns may be irrelevant to the analysis task
many of the mined patterns, although related to the analysis task, may be difficult to understand, or may lack validity, novelty, or utility
A data mining query language that incorporates necessary primitives can help users flexibly interact with the data mining system
What defines a data mining task?
What is the data set that you want to mine?
What kind of knowledge do you want to mine?
What background knowledge could be useful?
Which measurements can be used to estimate the interestingness of patterns?
How should the discovered patterns be presented?
to develop or use a data mining query language,
you must know what defines a data mining task
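The five questions above can be read as the fields of a task specification. The following is a minimal sketch under that reading, not a real query language: the class name `MiningTask`, its field names, and the example values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MiningTask:
    """Hypothetical container for the five primitives of a data mining task."""

    task_relevant_data: str  # which data set to mine
    knowledge_type: str  # e.g. "association", "classification"
    background_knowledge: list = field(default_factory=list)  # e.g. concept hierarchies
    interestingness_measures: dict = field(default_factory=dict)  # measure name -> threshold
    presentation: str = "table"  # how to display the discovered patterns

    def missing_primitives(self):
        """Return the names of primitives the user left empty."""
        return [name for name, value in vars(self).items() if not value]


# Illustrative values only; nothing here comes from a real system.
task = MiningTask(
    task_relevant_data="sales records for 2018",
    knowledge_type="association",
    interestingness_measures={"support": 0.05, "confidence": 0.7},
)
print(task.missing_primitives())
```

Here the user has not supplied any background knowledge, so that is the only primitive reported as missing; a query language would typically fall back to a default in such a case.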
Task-relevant data
this is the database portion to be investigated
database or data warehouse name
database tables or data warehouse cube
conditions for data selection
relevant attributes or dimensions
data grouping criteria
the data collection process results in a new data relation, called the initial data relation
the initial data relation may or may not correspond to a physical relation in the database
the portion of the database to be mined is called a minable view
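As a rough illustration of how these elements combine, the sketch below uses Python's built-in `sqlite3` module to define a view over a toy table. The table `sales`, its columns, the rows, and the view name `minable_view` are assumptions made up for this example; the view shows that the initial data relation need not be a physical table.

```python
import sqlite3

# In-memory database standing in for the database or data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, item TEXT, year INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("Vancouver", "TV", 2018, 900.0),
        ("Vancouver", "TV", 2018, 1100.0),
        ("Toronto", "PC", 2018, 1500.0),
        ("Toronto", "PC", 2017, 1400.0),
    ],
)

# The view combines relevant attributes (city, item, amount),
# a selection condition (year = 2018), and grouping criteria (city, item).
conn.execute(
    """
    CREATE VIEW minable_view AS
    SELECT city, item, SUM(amount) AS total_amount
    FROM sales
    WHERE year = 2018
    GROUP BY city, item
    """
)

# Querying the view yields the initial data relation handed to the miner.
initial_relation = conn.execute(
    "SELECT city, item, total_amount FROM minable_view ORDER BY city"
).fetchall()
print(initial_relation)
```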
Types of knowledge to be mined
Characterization
Discrimination
Association
Classification/prediction
Clustering
Outlier analysis
……
in addition to specifying the type of knowledge to be mined, the user can be more specific and provide pattern templates that all discovered patterns must match
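One way to picture a pattern template is as a filter on the attributes a rule may mention. The sketch below assumes a simple representation in which a rule is a pair of dictionaries (antecedent, consequent) and a template lists the attribute names allowed on each side; the function `matches_template`, the template, and the candidate rules are hypothetical.

```python
def matches_template(rule, template):
    """Return True if the rule uses exactly the attributes the template names."""
    antecedent, consequent = rule
    return (
        set(antecedent) == set(template["antecedent"])
        and set(consequent) == set(template["consequent"])
    )


# Template: rules of the form (age, income) => buys
template = {"antecedent": ["age", "income"], "consequent": ["buys"]}

# Made-up candidate rules, as (antecedent, consequent) pairs.
candidate_rules = [
    ({"age": "30-39", "income": "high"}, {"buys": "laptop"}),
    ({"city": "Vancouver"}, {"buys": "TV"}),
    ({"age": "20-29", "income": "low"}, {"credit_rating": "fair"}),
]

# Only rules whose attributes match the template are kept.
kept = [rule for rule in candidate_rules if matches_template(rule, template)]
print(kept)
```

Only the first candidate survives: the second uses a different antecedent attribute and the third a different consequent attribute.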
Background knowledge (I)
Four types of concept hierarchies:
schema hierarchy
a total or partial order among attributes in the database schema
e.g. city < province_or_state < country
set-grouping hierarchy
a total or pa