Choi Kwok Pui, Zeng Fanfan, Zhang Louxin
Department of Mathematics, National University of Singapore, Singapore 117543.
Bioinformatics. 2004 May 1;20(7):1053-9. doi: 10.1093/bioinformatics/bth037. Epub 2004 Feb 5.
Filtration is an important technique for speeding up local alignment, as exemplified by the BLAST programs. Recently, Ma et al. discovered that better filtering can be achieved by requiring matches at positions spaced out according to a certain pattern, rather than at contiguous positions, to trigger a local alignment; they implemented this idea in their PatternHunter program. Such a match pattern is called a spaced seed.
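To make the definition concrete, here is a minimal Python sketch of how a spaced seed could be tested against two aligned, ungapped sequences. It is not taken from PatternHunter or from the paper: the function name `seed_hits`, the convention of writing a seed as a string of `1` (position must match) and `0` (position is ignored), and the toy sequences are all assumptions made for illustration.

```python
def seed_hits(seed, a, b):
    """Return every offset k at which the seed hits the ungapped pair (a, b).

    Assumed convention: `seed` is a string of '1' (must match) and
    '0' (don't care). A hit at offset k means a[k + i] == b[k + i]
    for every position i where seed[i] == '1'.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (ungapped comparison)")
    care = [i for i, c in enumerate(seed) if c == "1"]
    span = len(seed)
    return [
        k
        for k in range(len(a) - span + 1)
        if all(a[k + i] == b[k + i] for i in care)
    ]


# Hypothetical toy sequences that differ only at index 2.
a = "ACGTAC"
b = "ACTTAC"
spaced = seed_hits("1101", a, b)     # index 2 falls on the don't-care position
contiguous = seed_hits("111", a, b)  # every window covering index 2 fails
```

In this toy case the spaced seed `1101` still hits at offset 0 because the mismatch lands on its don't-care position, whereas the contiguous seed `111` only hits after the mismatch has been passed.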
Our numerical computations show that the ranking of spaced seeds by sensitivity changes with sequence similarity. Since homologous sequences may differ widely in similarity, we assess the sensitivity of spaced seeds over a range of similarity levels and present a list of good spaced seeds to facilitate homology search in DNA genomic sequences. Using three arbitrarily chosen pairs of DNA genomic sequences, we validate that the listed spaced seeds are indeed more sensitive.
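One way such a sensitivity comparison could be set up is sketched below in Python. The abstract does not state the probability model, so the sketch assumes a simple one: an ungapped region of fixed length in which each position matches independently with probability `p` (the similarity level), with sensitivity defined as the probability that the seed hits at least once. The function name `sensitivity`, the default region length of 64, and the example seeds are assumptions for illustration, not the authors' procedure.

```python
def sensitivity(seed, p, length=64):
    """Probability that `seed` hits a random ungapped region at least once.

    Assumed model: each of the `length` positions is a match with
    probability p, independently. The dynamic program tracks the
    probability mass of match/mismatch histories that contain no hit yet.
    """
    span = len(seed)
    mask = int(seed, 2)            # seed[0] is the oldest position in the window
    keep = (1 << (span - 1)) - 1   # remember only the last span-1 positions
    dist = {0: 1.0}                # state -> probability with no hit so far
    for t in range(1, length + 1):
        nxt = {}
        for state, prob in dist.items():
            for bit, weight in ((1, p), (0, 1.0 - p)):
                window = (state << 1) | bit
                if t >= span and window & mask == mask:
                    continue       # the seed hits here: leave the no-hit mass
                key = window & keep
                nxt[key] = nxt.get(key, 0.0) + prob * weight
        dist = nxt
    return 1.0 - sum(dist.values())


# Rank two short, hypothetical seeds of equal weight at several similarity levels.
seeds = ["1111", "110101"]
for p in (0.5, 0.7, 0.9):
    ranked = sorted(seeds, key=lambda s: sensitivity(s, p), reverse=True)
    print(p, ranked)
```

The number of states grows as 2^(span - 1), so this pure-Python version is only practical for short seeds; evaluating seeds of realistic length would call for a more compact state representation.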