2020/3/1
[…] Computation and Data Generalization

What is concept description?
- […] comparison of the data
- the simplest kind of descriptive data mining
- sometimes called class description, when the concept to be described refers to a class of objects
- Characterization: […]
- Comparison (discrimination): […] comparing two or more collections of data

Data generalization
- both characterization and discrimination are based on data generalization and summarization
- Data generalization: a process which abstracts a large set of task-relevant data in a database from a relatively low conceptual level to higher conceptual levels
- Data generalization approaches:
  - data cube approach
  - attribute-oriented induction approach

Data cube approach
- The data for analysis are stored in a multidimensional database, or data cube
- generalization and specialization can be performed on a data cube by roll-up and drill-down
- this is not an approach for concept description, only for data generalization
- Limitations:
  - […] the types of dimensions to simple nonnumeric data and of measures to simple aggregated numeric values
  - concept hierarchies can be automatically generated from numeric data to form numeric dimensions; however, […] commercial systems cannot tell which dimensions should be used and what levels the generalization should reach

Is OLAP enough?
- OLAP
  - restricted to certain kinds of attributes and measure types
  - user-[…]
- […] complex data types of the attributes and their aggregations
- a more automated process

Attribute-oriented induction (AOI)
- […], KDD Workshop at IJCAI-89
- in its initial proposal, AOI is a relational database query-oriented, generalization-based, […]
- […] computation can also be used
- can be used for both characterization and discrimination
- general idea:
  - collect the task-relevant data
  - perform generalization by attribute removal or attribute generalization
  - apply aggregation by merging identical, […] accumulating their respective counts
  - interactive presentation with users

Sketch of AOI
- Data focusing: the specification of task-relevant data, whose result is the initial relation
- Data generalization
  - attribute removal: if there is a large set of distinct values for an attribute, but either (1) there is no generalization operator on the attribute, or (2) its higher level concepts are expressed in term