Data Mining and Applications in Genomics
Lecture Notes in Electrical Engineering
Volume 25
For other titles published in this series, go to
ies/7818
Sio-Iong Ao
Data Mining and
Applications in Genomics
Sio-Iong Ao
International Association of Engineers
Oxford University
UK
ISBN 978-1-4020-8974-9 e-ISBN 978-1-4020-8975-6
Library of Congress Control Number: 2008936565
© 2008 Springer Science + Business Media B.V.
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written
permission from the Publisher, with the exception of any material supplied specifically for the purpose
of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Printed on acid-free paper
To my lovely mother Lei, Soi-Iong
Preface
With the results of many different genome-sequencing projects, hundreds of genomes
from all branches of species have become available. Currently, one important task is
to search for ways to explain the organization and function of each genome.
Data mining algorithms become very useful for extracting patterns from the data and
presenting them in a way that improves our understanding of the structure, relation,
and function of the subjects. The purpose of this book is to illustrate the data mining
algorithms and their applications in genomics, with frontier case studies based on the
recent and current works of the author and colleagues at the University of Hong Kong
and the Oxford University Computing Laboratory, University of Oxford.
It is estimated that there exist about 10 million single-nucleotide polymorphisms
(SNPs) in the human genome. A complete screening of all the SNPs in a genomic
region becomes an expensive undertaking. In Chapter 4, it is illustrated how the
problem of selecting a subset of informative SNPs (tag SNPs) can be formulated as
a hierarchical clust