Department of Biology, University of Pennsylvania, 433 S. University Ave., Philadelphia, PA 19104, USA.
Genome Biol. 2011 Jun 20;12(6):R55. doi: 10.1186/gb-2011-12-6-r55.
SNP (single nucleotide polymorphism) discovery using next-generation sequencing data remains difficult primarily because of redundant genomic regions, such as interspersed repetitive elements and paralogous genes, present in all eukaryotic genomes. To address this problem, we developed Sniper, a novel multi-locus Bayesian probabilistic model and a computationally efficient algorithm that explicitly incorporates sequence reads that map to multiple genomic loci. Our model fully accounts for sequencing error, template bias, and multi-locus SNP combinations, maintaining high sensitivity and specificity under a broad range of conditions. An implementation of Sniper is freely available at http://kim.bio.upenn.edu/software/sniper.shtml.
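To make the idea of pooling reads across redundant loci concrete, the following is a minimal toy sketch in Python. It is not Sniper's actual model or code. It assumes, purely for illustration, that every read is equally likely to come from any of `n_loci` near-identical copies, that each base is miscalled with a fixed `error_rate` strictly between 0 and 1, and that genotypes follow a simple hypothetical prior controlled by `snp_prior`. All function and parameter names are invented for this example, and template bias is ignored.

```python
from itertools import product
from math import exp, log

# Expected fraction of alternate-allele templates for each diploid genotype.
ALT_FRACTION = {"ref/ref": 0.0, "ref/alt": 0.5, "alt/alt": 1.0}


def p_alt_given_genotype(genotype, error_rate):
    """Probability that a read from one locus shows the alternate base."""
    f = ALT_FRACTION[genotype]
    # A true alt base is read correctly, or a true ref base is miscalled as alt.
    return f * (1.0 - error_rate) + (1.0 - f) * error_rate


def joint_genotype_posterior(n_alt, n_ref, n_loci=2, error_rate=0.01,
                             snp_prior=0.001):
    """Posterior over genotype combinations at n_loci redundant loci.

    n_alt and n_ref count reads showing the alternate and reference base
    among all reads that map equally well to every one of the loci.
    """
    # Hypothetical per-locus genotype prior (an assumption of this sketch).
    prior = {
        "ref/ref": 1.0 - snp_prior,
        "ref/alt": snp_prior * 2.0 / 3.0,
        "alt/alt": snp_prior / 3.0,
    }
    log_post = {}
    for combo in product(ALT_FRACTION, repeat=n_loci):
        # Each read is a uniform mixture over its possible loci of origin.
        p_alt = sum(p_alt_given_genotype(g, error_rate) for g in combo) / n_loci
        log_lik = n_alt * log(p_alt) + n_ref * log(1.0 - p_alt)
        log_prior = sum(log(prior[g]) for g in combo)
        log_post[combo] = log_lik + log_prior
    # Normalize in log space for numerical stability.
    peak = max(log_post.values())
    weights = {c: exp(v - peak) for c, v in log_post.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


if __name__ == "__main__":
    posterior = joint_genotype_posterior(n_alt=10, n_ref=30)
    for combo, prob in sorted(posterior.items(), key=lambda kv: -kv[1])[:3]:
        print(combo, round(prob, 4))
```

With 10 alternate and 30 reference reads shared between two copies, this toy model puts most of the posterior mass on combinations in which exactly one copy is heterozygous, split evenly between the two copies. The pooled reads alone cannot say which copy carries the variant, which is why scoring genotype combinations jointly, rather than calling each locus in isolation, is a natural fit for multiply mapped reads.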