Data Mining (Cognition-Based Knowledge Discovery Techniques for Complex Data Objects)
Zhang Dezheng (张德政)
Contact:
bigbank@
zdzchina@
——62334547
Cognition-Based Knowledge Discovery in Databases (DM(KDD) for Complex Data Objects)
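2 Basic Concepts of Knowledge Discovery
Data, information, knowledge
Definition of DM(KDD)
Objects of DM(KDD)
Functions of DM(KDD)
Techniques and methods of DM(KDD)
Data, Information, Knowledge
Facts: the objective reflection of human thought and social activity.
Data: facts that have been digitized, encoded, and serialized.
Information: the mapping of data onto an information medium.
Knowledge: the result of processing, absorbing, extracting, and evaluating information.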
We often see data as a string of bits, or numbers and symbols, or “objects” which we collect daily.
Information is data reduced to the minimum necessary to characterize the data.
Knowledge is integrated information, including facts and their relations, which have been perceived, discovered, or learned as our “mental pictures”.
The Relationship Between Data, Information, and Knowledge
[Figure labels: MIS (Management Information System), DSS (Decision Support Systems), Value]
...
10, M, 0, 10, 10, 0, 0, 0, SUBACUTE, 37, 2, 1, 0,15,-,-, 6000, 2, 0, abnormal, abnormal,-, 2852, 2148, 712, 97, 49, F,-,multiple,,2137, negative, n, n, ABSCESS,VIRUS
12, M, 0, 5, 5, 0, 0, 0, ACUTE, , 2, 1, 0,15, -,-, 10700,4,0,normal, abnormal, +, 1080, 680, 400, 71, 59, F,-,ABPC+CZX,, 70, negative, n, n, n, BACTERIA, BACTERIA
15, M, 0, 3, 2, 3, 0, 0, ACUTE, , 3, 1, 0,15, -, -, 6000, 0,0, normal, abnormal, +, 1124, 622, 502, 47, 63, F, -,FMOX+AMK, , 48, negative, n, n, n, BACTE(E), BACTERIA
16, M, 0, 32, 32, 0, 0, 0, SUBACUTE, 38, 2, 0, 0, 15, -, +, 12600, 4, 0,abnormal, abnormal, +, 41, 39, 2, 44, 57, F, -, ABPC+CZX, ?, ? ,negative, ?, n, n, ABSCESS, VIRUS
...
Medical Data by Dr. Tsumoto, Tokyo Med. & Dent. Univ., 38 attributes
Legend: numerical attribute, categorical attribute, missing values, class labels
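As a rough illustration of the attribute kinds named above, the following sketch splits one comma-separated record and tags each field as numerical, categorical, or missing. The function names `classify_field` and `classify_record` are hypothetical, and the missing-value markers (an empty field or `?`) are an assumption read off the sample rows; the source does not state how missing values are encoded.

```python
def classify_field(raw):
    """Tag one raw field as 'missing', 'numerical', or 'categorical'."""
    value = raw.strip()
    # Assumption: an empty field or '?' marks a missing value.
    if value in ("", "?"):
        return "missing"
    try:
        float(value)  # anything that parses as a number counts as numerical
        return "numerical"
    except ValueError:
        return "categorical"


def classify_record(line):
    """Split a comma-separated record and tag every field."""
    return [(field.strip(), classify_field(field)) for field in line.split(",")]


# First twelve fields of one of the sample records shown above.
sample = "12, M, 0, 5, 5, 0, 0, 0, ACUTE, , 2, 1"
kinds = classify_record(sample)
for value, kind in kinds:
    print(repr(value), kind)
```

Under this assumption, symbols such as `-` and `+` are treated as categorical values rather than as missing entries.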
IF cell_poly <= 220 AND Risk = n THEN Prediction = VIRUS [87,5%]
[confidence, predictive accuracy]
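To make the bracketed rule quality figure concrete, here is a minimal sketch of how the confidence of an IF-THEN rule could be computed: among the records that satisfy the IF part, the fraction whose class label matches the THEN part. The attribute names `cell_poly` and `Risk` come from the rule above; the helper `rule_confidence`, the key `Diagnosis`, and the toy records are invented for illustration and are not taken from the medical dataset.

```python
def rule_confidence(records, antecedent, label_key, label_value):
    """Return (covered, confidence) for a rule IF antecedent THEN label.

    covered    = number of records satisfying the antecedent
    confidence = share of covered records whose label equals label_value
    """
    covered = [r for r in records if antecedent(r)]
    if not covered:
        return 0, None  # the rule fires on no record, so confidence is undefined
    correct = sum(1 for r in covered if r[label_key] == label_value)
    return len(covered), correct / len(covered)


# Invented toy records (NOT from the source data).
toy_records = [
    {"cell_poly": 100, "Risk": "n", "Diagnosis": "VIRUS"},
    {"cell_poly": 150, "Risk": "n", "Diagnosis": "BACTERIA"},
    {"cell_poly": 200, "Risk": "n", "Diagnosis": "VIRUS"},
    {"cell_poly": 210, "Risk": "n", "Diagnosis": "VIRUS"},
    {"cell_poly": 500, "Risk": "n", "Diagnosis": "BACTERIA"},
    {"cell_poly": 120, "Risk": "p", "Diagnosis": "VIRUS"},
]

# IF cell_poly <= 220 AND Risk = n THEN Prediction = VIRUS
covered, confidence = rule_confidence(
    toy_records,
    lambda r: r["cell_poly"] <= 220 and r["Risk"] == "n",
    "Diagnosis",
    "VIRUS",
)
print(covered, confidence)
```

On these toy records the rule covers four records, three of which are labeled VIRUS. Predictive accuracy would be estimated the same way, but on records held out from rule learning.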
Data and Knowledge (Rules)
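Traditional Chinese Medicine (TCM) Clinical Data: Structured Data Collection
TCM Clinical Data: Unstructured Data Collection
TCM Clinical Data
TCM Clinical Data: Full-Text Database
TCM Clinical Data: Structured Database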