SNP calling using genotype model selection on high-throughput sequencing data.

Author information

Department of Statistical Science, School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou 510275, China.

Publication information

Bioinformatics. 2012 Mar 1;28(5):643-50. doi: 10.1093/bioinformatics/bts001. Epub 2012 Jan 16.

MOTIVATION

A review of the available single nucleotide polymorphism (SNP) calling procedures for Illumina high-throughput sequencing (HTS) platform data reveals that most rely mainly on base-calling and mapping qualities as sources of error when calling SNPs. Thus, errors not involved in base-calling or alignment, such as those in genomic sample preparation, are not accounted for.

RESULTS

A novel method for consensus and SNP calling, Genotype Model Selection (GeMS), is presented; it accounts for errors that occur during preparation of the genomic sample. Simulations and real-data analyses indicate that GeMS achieves the best balance of sensitivity and positive predictive value among the SNP callers tested.
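
For readers unfamiliar with the two metrics named above, the following is a minimal sketch of how sensitivity and positive predictive value (PPV) are conventionally computed for a set of SNP calls against a truth set. It is not taken from the GeMS package; the function name `sensitivity_and_ppv` and the `(chromosome, position)` representation of a SNP are assumptions made for illustration only.

```python
def sensitivity_and_ppv(called, truth):
    """Return (sensitivity, PPV) for called SNP sites versus a truth set.

    Both arguments are iterables of hashable site identifiers; here a
    (chromosome, position) tuple is assumed for illustration.
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # true positives: called and real
    fp = len(called - truth)   # false positives: called but not real
    fn = len(truth - called)   # false negatives: real but missed

    # Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Hypothetical toy data, not results from the paper.
called_sites = {("chr1", 100), ("chr1", 250), ("chr2", 40), ("chr2", 90)}
true_sites = {("chr1", 100), ("chr1", 250), ("chr1", 300), ("chr2", 40), ("chr3", 7)}
sens, ppv = sensitivity_and_ppv(called_sites, true_sites)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

A caller that reports more sites tends to gain sensitivity at the cost of PPV, which is why the two are assessed together as a balance.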

AVAILABILITY

The GeMS package can be downloaded from https://sites.google.com/a/bioinformatics.ucr.edu/xinping-cui/home/software or http://computationalbioenergy.org/software.html.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
