Department of Plant Sciences, University of California, Davis, CA 95616, USA.
Bioinformatics. 2009 Oct 1;25(19):2609-10. doi: 10.1093/bioinformatics/btp477. Epub 2009 Aug 10.
The Pine Alignment and SNP Identification Pipeline (PineSAP) provides a high-throughput solution to single nucleotide polymorphism (SNP) prediction using multiple sequence alignments from re-sequencing data. The pipeline combines customized scripting, existing utilities and machine learning to increase the speed and accuracy of SNP calls. Compared with existing solutions, it produces significantly improved multiple sequence alignments and SNP identifications. The use of machine learning in SNP identification extends the pipeline's application to any eukaryotic species for which full genome sequence information is unavailable.
All code used for this pipeline is freely available at the Dendrome project website (http://dendrome.ucdavis.edu/adept2/resequencing.html).
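To make the idea of calling SNPs from a multiple sequence alignment concrete, the following is a minimal illustrative sketch, not the PineSAP implementation. It flags alignment columns in which at least two distinct bases are each supported by a minimum number of reads. The function name `candidate_snp_columns`, the `min_minor_count` threshold, and the toy alignment are all assumptions made for illustration. PineSAP itself is described as using machine learning for the final SNP calls, which this simple count-based filter does not attempt to reproduce.

```python
from collections import Counter


def candidate_snp_columns(alignment, min_minor_count=2):
    """Return (column index, allele counts) for candidate polymorphic columns.

    Hypothetical helper for illustration only; not part of PineSAP.
    `alignment` is a list of equal-length aligned sequences. Gaps and
    ambiguity codes (anything other than A, C, G, T) are ignored.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    candidates = []
    for col in range(length):
        # Count unambiguous bases observed in this column.
        counts = Counter(
            seq[col].upper() for seq in alignment if seq[col].upper() in "ACGT"
        )
        # Keep only alleles seen at least `min_minor_count` times, which
        # discards variants supported by a single read.
        supported = {base: n for base, n in counts.items() if n >= min_minor_count}
        if len(supported) >= 2:
            candidates.append((col, supported))
    return candidates


# Toy alignment (made-up data): column 2 carries G in two reads and A in two.
alignment = [
    "ACGTAC-T",
    "ACGTACGT",
    "ACATACGT",
    "ACATACGT",
]
print(candidate_snp_columns(alignment))  # [(2, {'G': 2, 'A': 2})]
```

In a pipeline of the kind described above, candidate columns like these would typically be passed, together with quality features, to a trained classifier rather than being accepted on counts alone.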