Document: 《数据仓库与数据挖掘》第8章.ppt (Data Warehouse and Data Mining, Chapter 8)
Format: PPT, 148 pages

Uploaded by: 中国课件站, 2011/9/6

Chapter 6: Association Rule Mining
Association rule mining
Algorithms for scalable mining of (single-dimensional Boolean) association rules in transactional databases
Mining various kinds of association/correlation rules
Constraint-based association mining
Sequential pattern mining
Applications/extensions of frequent pattern mining
Summary
2017/11/10
1
Data Mining: Concepts and Techniques
What Is Association Mining?
Association rule mining:
Finding frequent patterns, associations, correlations, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
Frequent pattern: pattern (set of items, sequence, etc.) that occurs frequently in a database [AIS93]
Motivation: finding regularities in data
What products were often purchased together? — Beer and diapers?!
What are the subsequent purchases after buying a PC?
What kinds of DNA are sensitive to this new drug?
Can we automatically classify web documents?
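Basic Concepts of Association Rule Mining
Market basket analysis: the example that motivated association rule mining
Question: "Which groups or sets of items are customers likely to buy together in a single shopping trip?"
Market basket analysis: Let the universe be the set of all items sold by the store (the full item set). The items bought in one shopping trip (a transaction) form a subset of that item set. If each item is represented by a Boolean variable indicating whether the item is present, each basket can be represented as a Boolean vector. Analyzing these vectors yields purchase patterns that reflect items frequently associated or bought together. These patterns can be described by association rules.
[Example] The association rule between buying a computer and buying financial management software can be written as:
computer ⇒ financial_management_software [support=2%, confidence=60%]
Here support and confidence are the two measures attached to the rule.
The rule means that, of all transactions analyzed, 2% contain both a computer and financial management software, and that 60% of the customers who bought a computer also bought financial management software.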
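The two measures can be computed directly from a list of transactions. The following is a minimal sketch, not the deck's own method: the toy transactions, the item name printer, and the helper names to_boolean_vector, support, and confidence are assumptions made for illustration, so the resulting numbers differ from the 2% and 60% in the example above.

```python
from typing import FrozenSet, List

# Hypothetical toy data: each transaction is the set of items in one basket.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset({"computer", "financial_management_software"}),
    frozenset({"computer", "printer"}),
    frozenset({"computer", "financial_management_software", "printer"}),
    frozenset({"printer"}),
    frozenset({"computer"}),
]

# The full item set (the "universe"), in a fixed order.
ITEMS = sorted(set().union(*TRANSACTIONS))


def to_boolean_vector(basket: FrozenSet[str]) -> List[bool]:
    """Encode one basket as a Boolean vector over ITEMS."""
    return [item in basket for item in ITEMS]


def count(itemset: FrozenSet[str]) -> int:
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def support(itemset: FrozenSet[str]) -> float:
    """Fraction of all transactions that contain itemset."""
    return count(itemset) / len(TRANSACTIONS)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str]) -> float:
    """Among transactions containing antecedent, the fraction that also contain consequent."""
    base = count(antecedent)
    if base == 0:
        return 0.0  # rule never applies; report 0 rather than divide by zero
    return count(antecedent | consequent) / base


A = frozenset({"computer"})
B = frozenset({"financial_management_software"})
rule_support = support(A | B)       # 2 of 5 baskets contain both items
rule_confidence = confidence(A, B)  # 2 of the 4 computer baskets also contain the software
print(f"support={rule_support:.0%}, confidence={rule_confidence:.0%}")
```

Note that support is measured against all transactions, while confidence is measured only against the transactions that contain the left-hand side of the rule.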
Why Is Frequent Pattern or Association Mining an Essential Task in Data Mining?
Foundation for many essential data mining tasks
Association, correlation, causality
Sequential patterns, temporal or cyclic association, partial periodicity, spatial and multimedia association
Associative classification, cluster analysis, iceberg cube, fascicles (semantic data compression)
Broad applications
Basket data analysis, cross-marketing, catalog design, sales campaign analysis
Web log (click stream) analysis, DNA sequence analysis, etc.