GENOME ANNOTATION AND FUNCTIONAL GENOMICS: The protein sequence perspective
GENOME ANNOTATION
Two main levels:
STRUCTURAL ANNOTATION – Finding genes and other biologically relevant sites, thus building up a model of the genome as objects with specific locations
FUNCTIONAL ANNOTATION – The objects are used in database searches (and experiments); the aim is to attribute biologically relevant information to the whole sequence and to individual objects
WHY PROTEIN RATHER THAN DNA?
Larger alphabet (20 amino acids versus 4 nucleotides) - more informative comparisons
Protein sequences have a better signal-to-noise ratio
Less redundancy and no frameshifts
Each amino acid has distinct properties such as size and charge
Closer to biological function
3D structure of similar proteins may be known
Evolutionary relationships more evident
Availability of good, well annotated protein sequence and pattern databases
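The alphabet-size point above can be illustrated with a small calculation. This is only a sketch under a strong simplifying assumption (residues drawn uniformly and independently, which real sequences do not obey); the function names are hypothetical and chosen for illustration.

```python
# Sketch: chance of an identical residue at one aligned position,
# assuming uniform, independent residue frequencies (a simplification).
DNA_ALPHABET = "ACGT"
PROTEIN_ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def chance_identity(alphabet: str) -> float:
    """Probability that two random residues from `alphabet` are identical."""
    return 1.0 / len(alphabet)


def chance_run(alphabet: str, length: int) -> float:
    """Probability that `length` consecutive aligned positions all match by chance."""
    return chance_identity(alphabet) ** length


if __name__ == "__main__":
    print("DNA, one position:    ", chance_identity(DNA_ALPHABET))
    print("Protein, one position:", chance_identity(PROTEIN_ALPHABET))
    print("DNA, 10 positions:    ", chance_run(DNA_ALPHABET, 10))
    print("Protein, 10 positions:", chance_run(PROTEIN_ALPHABET, 10))
```

Under this assumption a matching position is less likely to arise by chance in a protein alignment than in a nucleotide alignment, which is one way to read the "larger alphabet" argument.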
Large-scale genome analysis projects
Rate-limiting step is annotation
Whole genome availability provides context information
Main goal is to bridge gap between genotype and phenotype
Definitions of Annotation
Addition of as much reliable and up-to-date information as possible to describe a sequence
Identification, structural description and characterisation of putative protein products and other features in the primary genomic sequence
Information attached to genomic coordinates (a start and an end point); it can occur at different levels
Interpreting raw sequence data into useful biological information
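The idea of annotation as information attached to genomic coordinates can be sketched as a small data structure. The class name, field names and example values below are hypothetical, and the 1-based inclusive coordinate convention is an assumption (it matches GFF3, but other formats differ).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated object on a sequence (illustrative, not a standard API)."""
    seq_id: str      # name of the sequence the feature lies on
    start: int       # 1-based, inclusive (assumed convention)
    end: int         # 1-based, inclusive
    kind: str        # e.g. "gene", "binding_site"
    note: str = ""   # free-text functional annotation

    def __post_init__(self) -> None:
        # Reject coordinates that cannot describe a real interval.
        if self.start < 1 or self.end < self.start:
            raise ValueError("require 1 <= start <= end")

    @property
    def length(self) -> int:
        return self.end - self.start + 1

    def overlaps(self, other: "Feature") -> bool:
        """True if both features share at least one position on the same sequence."""
        return (self.seq_id == other.seq_id
                and self.start <= other.end
                and other.start <= self.end)


# Made-up example values, for illustration only.
gene = Feature("chr1", 100, 400, "gene", "putative protein product")
site = Feature("chr1", 250, 258, "binding_site")
far_away = Feature("chr1", 900, 950, "gene")

if __name__ == "__main__":
    print(gene.length, gene.overlaps(site), gene.overlaps(far_away))
```

Keeping the coordinates and the descriptive note in one record mirrors the two annotation levels: structural annotation fills in the location, functional annotation fills in the note.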
ANNOTATION/FUNCTION CAN BE MAPPED TO DIFFERENT LEVELS:
ORGANISM - phenotypic function (morphology, physiology, behavior, environmental response); context is important
CELLULAR - metabolic pathway, signal cascades, cellular localization; context dependent
MOLECULAR - binding sites, catalytic activity, post-translational modification (PTM), 3D structure
DOMAIN
SINGLE RESIDUE
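As a rough sketch, the levels listed above can be modelled as an enumeration, with each annotation tagged by the level it describes. The names `Level` and `group_by_level` are hypothetical, and the example annotations simply reuse terms from the list.

```python
from collections import defaultdict
from enum import Enum


class Level(Enum):
    ORGANISM = "organism"
    CELLULAR = "cellular"
    MOLECULAR = "molecular"
    DOMAIN = "domain"
    SINGLE_RESIDUE = "single residue"


def group_by_level(annotations):
    """Group (level, description) pairs into a dict keyed by level."""
    grouped = defaultdict(list)
    for level, description in annotations:
        grouped[level].append(description)
    return dict(grouped)


# Illustrative annotations for one hypothetical protein.
example = [
    (Level.CELLULAR, "metabolic pathway"),
    (Level.MOLECULAR, "catalytic activity"),
    (Level.MOLECULAR, "binding sites"),
    (Level.ORGANISM, "environmental response"),
]

if __name__ == "__main__":
    for level, descriptions in group_by_level(example).items():
        print(level.value, "->", descriptions)
```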
Annotation is the description of:
Function(s) of the protein
Post-translational modification(s)
Domains and sites
Secondary structure
Quaternary structure
Similarities to other proteins
Disease(s) associated with deficiency(ies) in t