DB Seminar Series: HARP: A Hierarchical Algorithm with Automatic Relevant Attribute Selection for Projected Clustering
Presented by:
Kevin Yip
20 September 2002
0
Short Summary
Our own work (unpublished), supervised by Dr. Cheung and Dr. Ng
Problem: to cluster datasets of very high dimensionality
Assumption: clusters are formed in subspaces
1
Short Summary
Previous approaches: either have special restrictions on the dataset or target clusters, or cannot determine the dimensionality of the clusters automatically
Our approach: not restricted by these limitations
2
Presentation Outline
Clustering
Projected clustering
Previous approaches to projected clustering
Our approach: HARP
Concepts
Implementation
Experiments
Future work and conclusions
3
Clustering
Goal: given a dataset D with N records and d attributes (dimensions), partition the records into k disjoint clusters such that
Intra-cluster similarity is maximized
Inter-cluster similarity is minimized
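The two criteria above can be checked numerically. The following is a minimal sketch, assuming Euclidean distance as the (inverse) similarity measure; the function names and the toy dataset are illustrative and do not come from the slides. A good partition should give a small mean intra-cluster distance and a large mean inter-cluster distance.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two records of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_pair_distance(pairs):
    """Average distance over a non-empty list of record pairs."""
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)

def intra_inter(records, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance)."""
    indexed = list(zip(records, labels))
    same = [(a, b) for (a, la), (b, lb) in combinations(indexed, 2) if la == lb]
    diff = [(a, b) for (a, la), (b, lb) in combinations(indexed, 2) if la != lb]
    return mean_pair_distance(same), mean_pair_distance(diff)

# Toy dataset (assumed): N = 4 records, d = 2 attributes, k = 2 clusters.
records = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = intra_inter(records, [0, 0, 1, 1])  # nearby records share a cluster
bad = intra_inter(records, [0, 1, 0, 1])   # distant records share a cluster
print(good, bad)
```

Under the first labelling the intra-cluster distance is smaller than the inter-cluster distance; under the second labelling the relation is reversed.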
4
Clustering
How to measure similarity?
Distance-based: Manhattan distance, Euclidean distance, etc.
Correlation-based: cosine correlation, Pearson correlation, etc.
Link-based (common neighbors)
Pattern-based
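As an illustration of the measures listed above, the sketch below implements the two distance-based and the two correlation-based measures in their standard textbook forms, plus a common-neighbor count for the link-based case. The function names are assumptions made for this example; the pattern-based measure is omitted because the slides do not define it.

```python
import math

def manhattan(a, b):
    """Sum of absolute attribute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Square root of the sum of squared attribute differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pearson(a, b):
    """Pearson correlation: cosine of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine([x - mean_a for x in a], [y - mean_b for y in b])

def common_neighbors(neighbors_a, neighbors_b):
    """Link-based similarity: number of neighbors two records share."""
    return len(set(neighbors_a) & set(neighbors_b))

print(manhattan((0, 0), (3, 4)), euclidean((0, 0), (3, 4)))
print(cosine((1, 2, 3), (11, 12, 13)), pearson((1, 2, 3), (11, 12, 13)))
```

The last line shows one practical difference: shifting every attribute by a constant changes the cosine correlation but leaves the Pearson correlation unchanged.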
5
Clustering
Common types of clustering algorithms:
Partitional: selects some representative points for each cluster, assigns all other points to their closest clusters, and then re-determines the new representative points
Hierarchical (agglomerative): repeatedly determines the two most similar clusters, and merges them
6
Clustering
Partitional clustering:
Dataset
Representatives
Assignment
Replacement
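The representatives, assignment and replacement steps can be sketched as a loop. This is a minimal k-means-style sketch, assuming Euclidean distance and the cluster mean as the new representative; the slides do not prescribe either choice, and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def assign(records, reps):
    """Assignment: label each record with the index of its closest representative."""
    return [min(range(len(reps)), key=lambda j: euclidean(r, reps[j]))
            for r in records]

def replace(records, labels, reps):
    """Replacement: move each representative to the mean of its members."""
    new_reps = []
    for j, old in enumerate(reps):
        members = [r for r, label in zip(records, labels) if label == j]
        if not members:
            new_reps.append(old)  # keep the old representative of an empty cluster
        else:
            new_reps.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return new_reps

def partitional(records, reps, max_iter=100):
    """Alternate assignment and replacement until the representatives stop moving."""
    reps = [tuple(r) for r in reps]
    labels = assign(records, reps)
    for _ in range(max_iter):
        new_reps = replace(records, labels, reps)
        if new_reps == reps:
            break
        reps = new_reps
        labels = assign(records, reps)
    return labels, reps

records = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, reps = partitional(records, reps=[(0.0, 0.0), (10.0, 10.0)])
print(labels, reps)
```

The initial representatives are passed in explicitly so the example is deterministic; in practice they are often chosen at random.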
7
Clustering
Hierarchical clustering:
Dataset
Similarity calculation
Best merge determination
Merging
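The similarity calculation, best merge determination and merging steps can be sketched as follows. This is a minimal agglomerative sketch, assuming average-link Euclidean distance between clusters and a target number of clusters k as the stopping rule; both are assumptions for illustration, not the criteria HARP itself uses.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cluster_distance(c1, c2):
    """Similarity calculation: average distance over all cross-cluster pairs."""
    return sum(euclidean(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

def agglomerative(records, k):
    """Start with one cluster per record and merge until k clusters remain."""
    clusters = [[r] for r in records]
    while len(clusters) > k:
        # Best merge determination: the closest pair of clusters (i < j).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]))
        # Merging: fold cluster j into cluster i, then drop cluster j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

records = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
clusters = agglomerative(records, k=2)
print(clusters)
```

Recomputing every pairwise cluster distance in each round keeps the sketch short but is expensive; practical implementations cache the distances and update only those involving the merged cluster.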
8
Projected Clustering
Assumption (general case): each cluster is formed in a subspace
Source of figures: ORCLUS (SIGMOD 2000)
Assumption (special case): each cluster has a set of relevant attributes
Goal: determine the records and relevant attributes of each cluster (to “
