Antoniou Pavlos, Iliopoulos Costas S, Mouchard Laurent, Pissis Solon P
University of Cyprus, Department of Computer Science, Nicosia, Cyprus.
Int J Comput Biol Drug Des. 2009;2(4):385-97. doi: 10.1504/IJCBDD.2009.030768. Epub 2009 Jan 4.
Novel high-throughput (deep) sequencing technologies have redefined the way genome sequencing is performed. They can produce millions of short sequences in a single experiment, at a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping millions of short sequences to a reference genome and classifying them according to whether or not they occur exactly once in the genome, taking probability scores into consideration. In particular, we design algorithms for Massive Exact and Approximate Pattern Matching of short degenerate and weighted sequences, derived from deep sequencing technologies, to a reference genome.
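To make the classification task in the abstract concrete, the following is a minimal Python sketch of the exact-matching case only. It is not the algorithm designed in the paper. It assumes that all reads have the same length m, that degenerate symbols follow the standard IUPAC nucleotide codes, and that a read counts as occurring wherever any of its concrete expansions matches. The names build_index, expand and classify_read are hypothetical, and the probability scores of weighted sequences are left out.

```python
from collections import defaultdict
from itertools import product

# Standard IUPAC nucleotide codes: each symbol maps to the bases it may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def build_index(genome, m):
    """Map every length-m substring of the genome to its start positions."""
    index = defaultdict(list)
    for i in range(len(genome) - m + 1):
        index[genome[i:i + m]].append(i)
    return index


def expand(read):
    """Yield every concrete sequence a degenerate read can represent.

    The number of expansions grows exponentially with the number of
    degenerate symbols, so this is only suitable for illustration.
    """
    for combo in product(*(IUPAC[symbol] for symbol in read)):
        yield "".join(combo)


def classify_read(read, index):
    """Return (label, positions) with label 'unique', 'repeated' or 'absent'."""
    positions = sorted({p for seq in expand(read) for p in index.get(seq, ())})
    if len(positions) == 1:
        return "unique", positions
    if positions:
        return "repeated", positions
    return "absent", positions


if __name__ == "__main__":
    genome = "ACGTACGGA"          # toy reference, assumed for illustration
    index = build_index(genome, 4)
    for read in ("ACGT", "ACGN", "TTTT"):
        print(read, classify_read(read, index))
```

Indexing the reference once and looking up each read keeps the per-read cost independent of genome length, which is the property that matters when millions of reads are processed. A hash table of substrings is used here purely for brevity; it is an assumed stand-in, not a data structure taken from the paper.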