Analyzing Promoter Sequences with Multilayer Perceptrons
Glenn Walker
ECE 539
Background (DNA)
Deoxyribonucleic acid (DNA) is a long molecule made up of combinations of four smaller molecules (bases): adenine (A), cytosine (C), guanine (G), and thymine (T). These four molecules are combined in an order unique to each organism. The order of the molecules contains the information needed to make all the parts necessary for an organism to survive.
GATTAGAGATT
TCAGTTAACTCTGGCTAATCTCTAA
DNA is two-stranded and complementary
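The pairing rule behind complementarity can be illustrated with a short Python sketch. The helper name `complement` is hypothetical and not taken from the slides; it pairs A with T and C with G position by position and ignores strand direction.

```python
# Base pairing: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand (hypothetical helper)."""
    return "".join(PAIRS[base] for base in strand)

top = "GATTAGAGATT"
print(complement(top))
```

Applied to the first strand shown above, this gives CTAATCTCTAA, which matches the last 11 bases of the second strand.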
Background (DNA)
Genes are sections of DNA that can contain from a few hundred
base-pairs to tens of thousands. Genes contain instructions on
how to make proteins -- molecules necessary for building and
maintaining organisms.
Three different genes on a piece of DNA
“junk” DNA
Background
Promoters are sequences of DNA to which RNA polymerase can
bind and begin transcription of a gene. Transcription is the
process of making a complementary RNA copy of the DNA, which is then
translated into a protein.
[Figure: a promoter sequence followed by the actual gene information; RNA polymerase binds at the promoter and begins transcription]
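As a rough illustration of what "making a complementary copy" means, the sketch below maps each DNA base to its RNA partner. The function name `transcribe` and the example sequence are assumptions made for illustration; the sketch ignores strand direction and everything RNA polymerase does beyond base pairing.

```python
# DNA-to-RNA pairing: the same rule as DNA complementarity,
# except that RNA uses uracil (U) where DNA would use thymine (T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template: str) -> str:
    """Return the RNA complement of a DNA template strand (hypothetical helper)."""
    return "".join(DNA_TO_RNA[base] for base in template)

# Made-up example sequence, not taken from the slides.
print(transcribe("TACGGT"))
```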
Problem
Knowing gene locations is desirable for medical reasons
One way to find genes is to look for promoter regions
How do we find promoter regions?
One Solution
Promoter regions are highly conserved -- different regions often contain similar patterns
We can train neural networks to recognize promoter regions
We choose a multilayer perceptron
Network Configuration
The multilayer perceptron (MLP) is a common neural
network configuration
We used an MLP with 3 layers -- an input layer, a hidden layer,
and an output layer
Number of:
Inputs: 115 or 58
Hidden nodes: 4, 8, 16, 20, 24, 28, 32
Outputs: 1
Network Configuration
Two ways of presenting input were tried -- one used 58 inputs and the other 115
Different numbers of hidden nodes were tried to find the optimally structured network
Only one output was used to indicate whether the input was a promoter sequence or not (1 or 0, respectively)
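The configurations listed above can be sketched as a forward pass in plain Python. This is a minimal sketch rather than the author's implementation: the sigmoid activation, the random weight initialization, the 0.5 decision threshold, and the names `make_mlp` and `forward` are all assumptions, and no training is shown.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_mlp(n_inputs, n_hidden, seed=0):
    """Build random weights for one hidden layer and one output node (assumed init)."""
    rng = random.Random(seed)
    # Each row holds the weights for one hidden node, with its bias as the last entry.
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    # Weights for the single output node, with its bias as the last entry.
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(mlp, inputs):
    """Return the network output in (0, 1) for one input vector."""
    hidden, output = mlp
    # zip stops at the shorter sequence, so the trailing bias is added separately.
    h = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
         for row in hidden]
    return sigmoid(sum(w * a for w, a in zip(output, h)) + output[-1])

# Try every input size and hidden-layer size mentioned in the slides.
for n_inputs in (58, 115):
    for n_hidden in (4, 8, 16, 20, 24, 28, 32):
        mlp = make_mlp(n_inputs, n_hidden)
        y = forward(mlp, [0.0] * n_inputs)  # placeholder input, not real data
        label = 1 if y >= 0.5 else 0        # 1 = promoter, 0 = not (assumed threshold)
        print(n_inputs, n_hidden, label)
```

In practice each configuration would be trained and evaluated on labeled sequences before comparing hidden-layer sizes; the loop here only shows how the structures differ.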
Network Inputs
The inputs consisted of 106 sets of 57 bases of D