Cluster Analysis
Clustering: An Example Experiment
Researchers were interested in studying gene expression patterns in developing soybean seeds.
Seeds were harvested from soybean plants at 25, 30, 40, 45, and 50 days after flowering (daf).
One RNA sample was obtained for each level of daf.
An Example Experiment (continued)
Each of the 5 samples was measured on two two-color cDNA microarray slides using a loop design.
The entire process was repeated on a second occasion to obtain a total of two independent biological replications.
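The loop layout can be sketched in a few lines of Python. This is only an illustration: the helper name loop_design is hypothetical, and the pairing order (each sample sharing a slide with the next time point, with the last wrapping back to the first) is an assumption, since the slides do not say which samples were hybridized together.

```python
def loop_design(samples):
    """Pair each sample with the next one, wrapping around to close the loop.

    Assumption: samples are paired in the order given; the real pairing
    used in the experiment is not stated in the source.
    """
    n = len(samples)
    return [(samples[i], samples[(i + 1) % n]) for i in range(n)]


daf = [25, 30, 40, 45, 50]
slides = loop_design(daf)  # one tuple per two-color slide

# Count how many slides each sample appears on.
counts = {d: sum(d in slide for slide in slides) for d in daf}
```

Under this assumed layout, 5 slides are enough for every sample to appear on exactly two of them, which matches the description of each sample being measured on two slides.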
[Figure: Diagram Illustrating the Experimental Design. Samples at 25, 30, 40, 45, and 50 daf are shown for Rep 1 and for Rep 2.]
The daf means estimated for each gene from a mixed linear model analysis provide a useful summary of the data for cluster analysis.
[Figure: Normalized Data for One Example Gene. Two panels plotted against daf: Normalized Log Signal, and Estimated Means ± 1 SE.]
400 genes exhibited significant evidence of differential expression across time (p-value < , FDR = %). We will focus on clustering their estimated mean profiles.
We build clusters based on the most significant genes rather than on all genes because...
Much of the variation in expression is noise rather than biological signal, and we would rather not build clusters on the basis of noise.
Some clustering algorithms will become computationally expensive if there are a large number of objects (gene expression profiles in this case) to cluster.
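Restricting the clustering to significant genes can be sketched as a filtering step. The slides do not say which procedure was used to control the FDR, so the Benjamini-Hochberg step-up rule below is an assumption, and the p-values and the 0.05 level are made-up illustrative numbers, not values from the experiment.

```python
def benjamini_hochberg(pvals, fdr):
    """Return a list of booleans marking the p-values kept at the given FDR.

    Assumption: Benjamini-Hochberg is used here only as a common choice;
    the source does not name its FDR procedure.
    """
    m = len(pvals)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Remember the largest rank whose p-value is under its threshold.
        if pvals[i] <= fdr * rank / m:
            cutoff = rank
    keep = [False] * m
    for i in order[:cutoff]:
        keep[i] = True
    return keep


# Hypothetical p-values for ten genes (illustrative only).
pvals = [0.041, 0.001, 0.205, 0.008, 0.039, 0.06, 0.074, 0.042, 0.212, 0.216]
keep = benjamini_hochberg(pvals, fdr=0.05)
selected = [i for i, k in enumerate(keep) if k]
```

Only the profiles of the selected genes would then be passed to the clustering algorithm, which also keeps the number of objects to cluster small.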
[Figure: Estimated Mean Profiles for Top 36 Genes]
Dissimilarity Measures
When clustering objects, we try to put similar objects in the same cluster and dissimilar objects in different clusters.
We must define what we mean by dissimilar.
There are many choices.
Let x and y denote m-dimensional objects:
$x = (x_1, x_2, \ldots, x_m)$, $y = (y_1, y_2, \ldots, y_m)$
e.g., estimated means at m = 5 time points for a given gene.
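To make "many choices" concrete, here is a minimal sketch of two widely used dissimilarities between m-dimensional profiles: Euclidean distance and 1 minus the Pearson correlation. These two are picked here as familiar examples and are not taken from this part of the slides; the function names and the example profiles are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def one_minus_correlation(x, y):
    """1 - Pearson correlation; small when the profiles have the same shape.

    Undefined (division by zero) if either profile is constant.
    """
    m = len(x)
    mean_x, mean_y = sum(x) / m, sum(y) / m
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return 1 - sxy / math.sqrt(sxx * syy)


# Two made-up profiles with m = 5: y has the same shape as x, shifted up by 2.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 4.0, 5.0, 6.0, 7.0]

d_euclidean = euclidean(x, y)
d_correlation = one_minus_correlation(x, y)
```

The two measures disagree on this pair: the Euclidean distance is large because the profiles sit at different levels, while the correlation-based measure is zero because they rise in exactly the same way. Which behaviour is wanted depends on whether level or shape should drive the clusters.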
Parallel Coordinate Plots
[Figure: two panels. Scatterplot with axes x1 and x2; Parallel Coordinate Plot with axes Coordinate and Value.]
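The relationship between the two views can be sketched without any plotting library: a point (x1, x2) in the scatterplot becomes a polyline through (1, x1) and (2, x2) in the parallel coordinate plot, and the same construction extends to m coordinates. The helper name parallel_coordinates and the example values are hypothetical.

```python
def parallel_coordinates(objects):
    """For each object, return its polyline vertices (coordinate index, value).

    Coordinates are numbered from 1, matching x1, x2, ..., xm.
    """
    return [[(j, value) for j, value in enumerate(obj, start=1)]
            for obj in objects]


# Two made-up 2-dimensional objects, i.e. two points in the scatterplot.
points = [(2.0, 5.0), (4.0, 1.0)]
lines = parallel_coordinates(points)

# A made-up 5-dimensional profile gives a polyline with five vertices.
profile_line = parallel_coordinates([[0.1, 0.4, 0.9, 0.7, 0.2]])[0]
```

Each inner list can be handed to any line-plotting routine, with the first element of each pair on the Coordinate axis and the second on the Value axis.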