Document name: Artificial Intelligence and Data Mining Teaching PPT Courseware (《人工智能与数据挖掘教学ppt课件 》.ppt)

AI&DM
Chapter 3 Basic Data Mining Techniques
3.1 Decision Trees
(For classification)
Algorithm for Decision Tree Building
Basic algorithm (a greedy algorithm)
Tree is constructed in a top-down recursive divide-and-conquer manner
At start, all the training examples are at the root
Attributes are categorical (if continuous-valued, they are discretized in advance)
Examples are partitioned recursively based on selected attributes
Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain)
Conditions for stopping partitioning
All samples for a given node belong to the same class
There are no remaining attributes for further partitioning – majority voting is employed for classifying the leaf
There are no samples left
The pre-set accuracy has been reached
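
The greedy, top-down procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the course's own implementation: the names `entropy`, `information_gain` and `build_tree` and the toy data are hypothetical, attributes are assumed categorical, and information gain is used as the selection heuristic. Only the first two stopping conditions are modelled. Branches are created only for attribute values that actually occur, so the "no samples left" case cannot arise here, and the pre-set accuracy check is omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Expected information (in bits) needed to classify an example."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by partitioning on attribute `attr`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s)
                    for s in subsets.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, attributes):
    """Greedy top-down induction; returns a class label or a nested dict."""
    # Stop: all samples at this node belong to the same class.
    if len(set(labels)) == 1:
        return labels[0]
    # Stop: no attributes remain, so the leaf is labelled by majority vote.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: choose the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    node = {"attribute": best, "branches": {}}
    # Partition the examples on the chosen attribute and recurse.
    for value in sorted({row[best] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        node["branches"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining)
    return node


# Hypothetical toy data, not taken from the slides.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "play", "play", "stay"]
tree = build_tree(rows, labels, ["outlook", "windy"])
print(tree)
```

On this toy data `outlook` has the higher gain, so it becomes the root test. The `sunny` branch is pure and becomes a leaf, while the `rainy` branch is split again on `windy`.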
Information Gain (ID3/C4.5)
Select the attribute with the highest information gain
Assume there are two classes, P and N
Let the set of examples S contain p elements of class P and n elements of class N
The amount of information needed to decide whether an arbitrary example in S belongs to P or N is defined as

$$I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$$
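
A minimal Python sketch of this quantity, assuming the standard two-class entropy formula with base-2 logarithms. The function name `info` is hypothetical, and the counts 9 and 5 are the ones used in the worked example later in these slides.

```python
import math


def info(p, n):
    """I(p, n): bits needed to decide whether an example is in P or N."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:  # a zero count contributes nothing (0 * log 0 is taken as 0)
            fraction = count / total
            result -= fraction * math.log2(fraction)
    return result


print(round(info(9, 5), 3))  # 9 examples of class P, 5 of class N
```

A pure set, such as `info(4, 0)`, needs no further information, while an evenly split set, such as `info(1, 1)`, needs exactly one bit.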
Information Gain in Decision Tree Building
Assume that using attribute A, a set S will be partitioned into sets {S1, S2, …, Sv}
If Si contains pi examples of P and ni examples of N, the entropy, or the expected information needed to classify objects in all subsets Si, is

$$E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p + n}\, I(p_i, n_i)$$

The encoding information that would be gained by branching on A is

$$\mathrm{Gain}(A) = I(p, n) - E(A)$$
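
The partition-then-weight computation can be sketched in Python as below. This is an illustration only, assuming the gain is the overall information minus the size-weighted information of the subsets. The function names `info` and `gain` and the two toy example lists are hypothetical.

```python
import math
from collections import Counter, defaultdict


def info(counts):
    """Expected information (bits) for a collection of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def gain(examples):
    """Information gained by branching on A, given (value_of_A, label) pairs."""
    overall = Counter(label for _, label in examples)
    by_value = defaultdict(Counter)  # value of A -> class counts in subset S_i
    for value, label in examples:
        by_value[value][label] += 1
    # Expected information: each subset S_i weighted by its share of S.
    expected = sum(
        sum(c.values()) / len(examples) * info(c.values())
        for c in by_value.values()
    )
    return info(overall.values()) - expected


# Hypothetical toy data: A separates the classes perfectly in the first
# list and not at all in the second.
perfect = [("a", "P"), ("a", "P"), ("b", "N"), ("b", "N")]
useless = [("a", "P"), ("a", "N"), ("b", "P"), ("b", "N")]
print(gain(perfect), gain(useless))
```

An attribute that yields pure subsets recovers all of the information in S, while one whose subsets keep the original class mix gains nothing.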
Attribute Selection by Information Gain Computation
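Class P: buys_computer = "yes"
Class N: buys_computer = "no"
I(p, n) = I(9, 5) = 0.940
Compute the entropy for age: E(age) = 0.69
Hence Gain(age) = I(p, n) - E(age) = 0.940 - 0.69 = 0.25
Similarly, the gain is computed for each of the remaining attributes.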
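
The arithmetic can be checked with a short Python sketch. The values p = 9 and n = 5 come from the text. The per-group counts for age do not appear in the text, so the `age_counts` list below is an assumption, chosen to be consistent with p = 9, n = 5 and an entropy for age of about 0.69. The name `info` is hypothetical.

```python
import math


def info(p, n):
    """I(p, n) in bits; a zero count contributes nothing."""
    total = p + n
    return -sum(c / total * math.log2(c / total) for c in (p, n) if c)


p, n = 9, 5  # from the text: I(9, 5)

# ASSUMPTION: (p_i, n_i) for each age group. These counts are not given in
# the text; they are chosen to agree with p = 9, n = 5 and E(age) of about 0.69.
age_counts = [(2, 3), (4, 0), (3, 2)]

# Expected information after splitting on age, weighted by subset size.
e_age = sum((pi + ni) / (p + n) * info(pi, ni) for pi, ni in age_counts)
gain_age = info(p, n) - e_age
print(f"I(p, n) = {info(p, n):.3f}  E(age) = {e_age:.3f}  Gain(age) = {gain_age:.3f}")
```

Under these assumed counts the results agree with the rounded figures 0.940, 0.69 and 0.25 given above.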
Uploaded by: mkjafow, 8/7/2022. File size: 232 KB