Chapter 5
Mining Association Rules in Large Databases
*
Chapter 5: Mining Association Rules in Large Databases
Association rule mining
Mining single-dimensional Boolean association rules from transactional databases
Mining multilevel association rules from transactional databases
Mining multidimensional association rules from relational databases and data warehouses
Summary
*
Association Rule
Given a set of transactions, find rules that will predict the occurrence of an item based on the occurrences of other items in the transaction
Market-Basket transactions
Example of Association Rules
{Milk} ⇒ {Bread}, {Beer} ⇒ {Diaper}
Implication means co-occurrence, not causality!
*
Association Rule Mining
buys(x, "computer") ⇒ buys(x, "software") [5%, 70%]
Association rules are interesting if they satisfy both a minimum support threshold and a minimum confidence threshold
Rules that satisfy both thresholds are called strong association rules
Rule form:
A ⇒ B [support, confidence]
support(A ⇒ B) = P(A ∪ B), confidence(A ⇒ B) = P(B|A)
Here P(A ∪ B) is the probability that a transaction contains every item of both A and B
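In terms of transaction counts, the two measures can be written as below. This is a sketch that assumes the probabilities are estimated by relative frequencies over the set of transactions D; the symbol count(X), introduced here for illustration, is the number of transactions in D that contain every item of X.

```latex
\begin{align*}
\mathrm{support}(A \Rightarrow B) &= P(A \cup B) = \frac{\mathrm{count}(A \cup B)}{|D|} \\
\mathrm{confidence}(A \Rightarrow B) &= P(B \mid A) = \frac{\mathrm{count}(A \cup B)}{\mathrm{count}(A)} = \frac{\mathrm{support}(A \Rightarrow B)}{\mathrm{support}(A)}
\end{align*}
```

Support is symmetric in A and B, while confidence is not: swapping the two sides changes the denominator from count(A) to count(B).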
[Figure: Venn diagram of customers who buy a computer, customers who buy software, and customers who buy both]
*
Rule Measures: Support and Confidence
Find all the rules X & Y ⇒ Z with minimum confidence and support
support, s, probability that a transaction contains {X, Y, Z}
confidence, c, conditional probability that a transaction having {X, Y} also contains Z
With minimum support 50% and minimum confidence 50%, we have
A ⇒ C (50%, %)
C ⇒ A (50%, 100%)
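The two measures can be computed directly from a list of transactions. The following is a minimal sketch, not the slide's own data: the `transactions` list is hypothetical and invented for illustration, and the function names `support` and `confidence` are assumptions of this example.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical market-basket data, invented for illustration only.
transactions: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer"}),
    frozenset({"milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper"}),
]


def support(itemset: Iterable[str], db: List[FrozenSet[str]]) -> float:
    """Fraction of transactions in db that contain every item of itemset."""
    items = frozenset(itemset)
    hits = sum(1 for t in db if items <= t)
    return hits / len(db)


def confidence(lhs: Iterable[str], rhs: Iterable[str],
               db: List[FrozenSet[str]]) -> float:
    """Estimate P(rhs | lhs) as support(lhs and rhs together) / support(lhs)."""
    lhs_support = support(lhs, db)
    if lhs_support == 0:
        raise ValueError("left-hand side never occurs; confidence is undefined")
    both = frozenset(lhs) | frozenset(rhs)
    return support(both, db) / lhs_support


# The same pair of items gives one support value but two confidence values.
print(support({"diaper", "beer"}, transactions))
print(confidence({"diaper"}, {"beer"}, transactions))
print(confidence({"beer"}, {"diaper"}, transactions))
```

On this toy data both directions of the rule share the same support, but their confidences differ because the two left-hand sides occur in different numbers of transactions.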
*
Association Rules Mined from Transactional Databases
Given: I = {i1, i2, …, im}, a set of items
D (task-relevant data): a set of database transactions
Each transaction T ⊆ I
Each transaction is associated with an identifier TID
A, B: sets of items, A ⊂ I, B ⊂ I
An association rule is an implication of the form A ⇒ B (where A ∩ B = ∅), which holds in the transaction set D with support and confidence
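The definition above can be turned into a brute-force search for strong rules. The sketch below is for illustration only and is not an efficient mining algorithm: it enumerates every itemset, splits it into a non-empty left side A and right side B with A ∩ B = ∅, and keeps the rules that meet both thresholds. The function name `strong_rules` and the toy `transactions` are assumptions of this example.

```python
from itertools import combinations
from typing import FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str], float, float]

# Hypothetical transaction set D, invented for illustration only.
transactions: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer"}),
    frozenset({"milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper"}),
]


def strong_rules(db: List[FrozenSet[str]], min_support: float,
                 min_confidence: float) -> List[Rule]:
    """Return (A, B, support, confidence) for every rule A => B with
    A and B non-empty and disjoint that meets both thresholds.
    The search is exponential in the number of items."""
    n = len(db)
    items = sorted(set().union(*db))  # the item set I

    def count(itemset: FrozenSet[str]) -> int:
        # Number of transactions T with itemset contained in T.
        return sum(1 for t in db if itemset <= t)

    rules: List[Rule] = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            whole = frozenset(combo)  # plays the role of A ∪ B
            whole_count = count(whole)
            if whole_count == 0 or whole_count / n < min_support:
                continue
            # Split the itemset into every non-empty A and B = whole - A,
            # which guarantees A ∩ B = ∅.
            for k in range(1, size):
                for lhs_items in combinations(combo, k):
                    lhs = frozenset(lhs_items)
                    rhs = whole - lhs
                    conf = whole_count / count(lhs)
                    if conf >= min_confidence:
                        rules.append((lhs, rhs, whole_count / n, conf))
    return rules


for lhs, rhs, s, c in strong_rules(transactions, 0.5, 0.7):
    print(sorted(lhs), "=>", sorted(rhs), f"[{s:.0%}, {c:.0%}]")
```

Checking support on the whole itemset before splitting it avoids computing confidence for rules that could never be strong, since every rule derived from the same itemset has the same support.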
*
Definit