CS345A: Data Mining on the Web
Course Introduction
Issues in Data Mining
Bonferroni’s Principle

Stanford University - Big Data Mining - Introduction 1

Course Staff
Instructors:
Anand Rajaraman
Jeff Ullman
Reach us as cs345a-win0809-******@

Requirements
Homework (Gradiance and other) 20%, …
Final Exam 40%

Possible Projects
Many past projects have dealt with collaborative filtering (advice based on what similar people do). … “machine-learning” …

ML-Replacement Projects
ML generally requires a large “training set”: …
Open Directory works for page topics, …?

ML-Replacement (2)
Many problems require thought rather than ML:
Tell important pages from unimportant (PageRank).
Tell real news from publicity (how?).
Distinguish positive from negative product reviews (how?).
Etc., …

Team Projects
Working in pairs OK, but …

What is Data Mining?
Discovery of