Incorporating Homology Using Multi-instance Kernel for Protein Subcellular
Localization
Suyu Mei, Wang Fei
Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science,
Fudan University, Shanghai, PR China, 200433
meisuyureg@ ******@fudan.
Abstract
Kernel methods have seen many successful applications in computational biology
in recent years, and thus kernel design is a key step to define the similarity
between two protein sequences. This paper aims at designing a kernel to derive
more accurate similarity between two protein sequences by incorporating
homology. Here a homologous sequence is viewed as one evolutionary instance of
the target sequence, and all homologous sequences constitute one homology bag.
A k-mer based spectrum kernel is used to define the similarity between

computational modeling for protein subcellular localization. Among diverse
protein sequence features, amino acid composition (AA) is the most frequently
used feature, combined with other information [1]. PseAA [2] encodes the
pair-wise correlation of two amino acids at λ intervals using amino acid
physicochemical properties. A window-based k-mer histogram is often used to
capture the contextual information of amino acids and the conserved motif
information, such as gapAA, di-AA, and motif kernel [3] [4], etc. The
dimensionality of k-mer feature space 20^n