Clustering in Concept Extraction
Outline
Main Approach in Concept Extraction
Problems
Clustering Methods and LSI
Ideas and Our Works
Experimental Results
Clustering Methods and LSI
Discrete Methods
Linear approaches
  PCA
  K-Means
  K-Medians
  K-Centers
  LSH
Non-Linear approaches
  KPCA
  Embeddings
Artificial Intelligence Based Approaches
PCA stands for Principal Component Analysis; here it denotes a collection of methods that use eigenvector and eigenvalue properties for clustering.
SVD is therefore one of the main approaches in the PCA collection.
Recently, it was proved that K-Means and the other members of its family can be listed in the PCA family.
The PCA family consists of linear approaches and cannot cluster data whose dependence is nonlinear.
The PCA family is suitable for data with a Gaussian distribution.
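As a minimal sketch of how SVD serves as a PCA method, the following NumPy snippet centers a small data matrix, takes its SVD, and projects the points onto the leading direction. The toy data and the function name `pca_svd` are assumptions made for illustration and do not come from the slides.

```python
import numpy as np

def pca_svd(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)                        # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                 # principal directions (rows)
    # Squared singular values / (n - 1) are the covariance eigenvalues.
    explained_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    return Xc @ components.T, components, explained_var

rng = np.random.default_rng(0)
# Toy data (assumption): points stretched along the direction (1, 1).
t = rng.normal(size=200)
X = np.column_stack([t, t]) + 0.05 * rng.normal(size=(200, 2))
scores, components, explained_var = pca_svd(X, n_components=1)
print(components[0], explained_var[0])
```

The leading direction should come out close to (1, 1)/sqrt(2), up to sign, because that is where the toy data has most of its variance.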
An example of nonlinear dependence.
K-Means, however, has a computational complexity of O(nm), which is better than that of SVD, O(n^3 m).
LSH is a member of the linear methods and has good computational complexity.
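To make the cost of K-Means concrete, here is a minimal sketch of Lloyd's algorithm in NumPy. Each iteration compares n points of dimension m with k centers, so it costs O(n * m * k), which is O(nm) for a fixed k. The helper names, the random initialization, and the toy data are assumptions for illustration, not the setup used in the slides.

```python
import numpy as np

def nearest_center(X, centers):
    """Index of the closest center for every point (squared Euclidean)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)  # shape (n, k)
    return d2.argmin(axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm with random initial centers taken from X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_center(X, centers)
        # Move each center to the mean of its points; keep it if the cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return nearest_center(X, centers), centers

rng = np.random.default_rng(1)
# Toy data (assumption): two well-separated Gaussian blobs in 2-D.
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(5.0, 0.3, size=(50, 2))])
labels, centers = kmeans(X, k=2)
print(centers)
```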
KPCA (Kernel PCA) is a collection of methods for nonlinear clustering.
There are two groups in KPCA:
Kernel functions
Kernel tricks: in this family, the original space is converted to a higher-dimensional space with specific properties (some methods convert the data to a Hilbert space; Hilbert spaces are a subset of Banach spaces).
Artificial Intelligence based clustering methods are very slow for our purpose.
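The kernel approach described above can be sketched with a small Kernel PCA in NumPy: the kernel matrix stands in for inner products in the higher-dimensional space, so the mapping itself is never computed. The RBF kernel, the value of `gamma`, the function names, and the two-ring toy data are all assumptions chosen for illustration; the slides do not say which kernel is used.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] - 2.0 * X @ X.T + sq[None, :]
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_pca(X, n_components, gamma=1.0):
    """Project the training points onto the top kernel principal components."""
    n = len(X)
    K = rbf_kernel(X, gamma)
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one     # center in feature space
    vals, vecs = np.linalg.eigh(Kc)                # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]    # pick the largest ones
    vals, vecs = vals[idx], vecs[:, idx]
    # For training points the projection is sqrt(eigenvalue) * eigenvector.
    return vecs * np.sqrt(np.maximum(vals, 0.0))

rng = np.random.default_rng(0)
# Toy data (assumption): two concentric rings, which no linear method can split.
theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
radius = np.where(np.arange(200) < 100, 1.0, 3.0)
X = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])
Z = kernel_pca(X, n_components=2, gamma=0.5)
print(Z[:3])
```

A linear clustering method such as K-Means can then be run on `Z` instead of `X`. Whether the rings actually separate depends on the kernel and on `gamma`.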
Ideas and Our Works
Our work will cover both finding an appropriate kernel function and finding an appropriate embedding.
In this phase, however, we focus on kernel functions.
Our idea differs slightly from the main approach: we change the distance function, rather than the points, to reach linearity.
There is a technique called the “copula”: a bivariate distribution function for two random variables.
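One standard way to change the distance function instead of the points is the kernel-induced distance, d(x, y)^2 = k(x, x) - 2 k(x, y) + k(y, y), which equals the distance in the mapped space without computing the mapping. The sketch below checks this for a degree-2 polynomial kernel whose feature map is known explicitly. It only illustrates the general idea. The kernel, the helper names, and the sample points are assumptions, and the slides do not state that the authors use this formula.

```python
import numpy as np

def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: k(x, y) = (x . y)^2."""
    return float(np.dot(x, y)) ** 2

def poly2_features(x):
    """Explicit feature map for 2-D input: (x1^2, sqrt(2) * x1 * x2, x2^2)."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def kernel_distance(x, y, kernel):
    """Distance in feature space, computed from kernel values only."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return np.sqrt(max(d2, 0.0))

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
d_kernel = kernel_distance(x, y, poly2_kernel)
d_explicit = np.linalg.norm(poly2_features(x) - poly2_features(y))
print(d_kernel, d_explicit)
```

Both values agree, so a distance-based method can work with the new distance directly while the original points stay untouched.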
The main idea is as follows:
There is a wide variety of general-purpose copula functions; they have been used in different studies and have reached good results.
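To give a feel for what a copula measures, the sketch below computes an empirical copula from ranks: each variable is replaced by its rank-based pseudo-observation in (0, 1), and C(u, v) is estimated as the fraction of points falling in [0, u] x [0, v]. The function names, the sample size, and the toy variables are assumptions for illustration. This is not the copula-based distance of the slides, which is not given in the text.

```python
import numpy as np

def pseudo_observations(x):
    """Map a sample to (0, 1) by its ranks: u_i = rank_i / (n + 1)."""
    ranks = np.argsort(np.argsort(x)) + 1          # ranks 1..n (ties not expected)
    return ranks / (len(x) + 1.0)

def empirical_copula(x, y, u, v):
    """Fraction of points whose pseudo-observations fall in [0, u] x [0, v]."""
    ux, uy = pseudo_observations(x), pseudo_observations(y)
    return float(np.mean((ux <= u) & (uy <= v)))

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y_dep = np.exp(x)                 # nonlinear but strictly increasing in x
y_ind = rng.normal(size=2000)     # drawn independently of x

c_dep = empirical_copula(x, y_dep, 0.5, 0.5)
c_ind = empirical_copula(x, y_ind, 0.5, 0.5)
print(c_dep, c_ind)
```

For independent variables the copula is C(u, v) = u * v, so `c_ind` should be near 0.25. For a strictly increasing relation it is min(u, v), so `c_dep` should be 0.5 even though the relation is nonlinear. This invariance under monotone transformations is what makes copulas attractive for nonlinear dependence.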