Evaluation of SNP calling using single and multiple-sample calling algorithms by validation against array base genotyping and Mendelian inheritance.

Authors

Kumar Pankaj, Al-Shafai Mashael, Al Muftah Wadha Ahmed, Chalhoub Nader, Elsaid Mahmoud F, Aleem Alice Abdel, Suhre Karsten

Affiliation

Weill Cornell Medical College in Qatar, Education City, Doha, Qatar.

Publication

BMC Res Notes. 2014 Oct 22;7:747. doi: 10.1186/1756-0500-7-747.

Abstract

BACKGROUND

With the diminishing cost of next generation sequencing (NGS), whole genome analysis is becoming a standard tool for identifying the genetic causes of inherited diseases. Commercial NGS service providers generally deliver not only raw genomic reads but also SNP calls to their clients. The question for the user is whether to use these SNP calls as is, or to reprocess the raw sequencing data through more sophisticated SNP calling pipelines with more advanced algorithms.

RESULTS

Here we report a detailed comparison of SNPs called using the popular GATK multiple-sample calling protocol with the SNP calls provided by the Illumina CASAVA pipeline, delivered as part of a 40x whole genome sequencing project by Illumina Inc. of 171 human genomes of Arab descent (108 unrelated Qatari genomes, 19 trios, and 2 families with rare diseases). GATK multi-sample calling identifies more variants than the CASAVA pipeline. The additional variants from GATK are robust in terms of Mendelian consistency but weaker on statistical parameters such as the transition/transversion (Ts/Tv) ratio. However, these additional variants make no difference to the detection of the causative variants in the studied phenotype.
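The two quality checks named above, the Ts/Tv ratio and Mendelian consistency in trios, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`is_transition`, `ts_tv_ratio`, `is_mendelian_consistent`) and the input shapes are assumptions made for illustration, and the consistency check assumes autosomal, diploid genotypes with no missing calls.

```python
from itertools import product

# Hypothetical helpers for illustration only; not taken from GATK or CASAVA.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True if ref->alt stays within purines or within pyrimidines."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(snps):
    """Compute Ts/Tv over an iterable of (ref, alt) single-base pairs.

    Pairs where ref equals alt are skipped. Returns float('inf') when
    no transversions are present.
    """
    ts = tv = 0
    for ref, alt in snps:
        if ref.upper() == alt.upper():
            continue
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else float("inf")


def is_mendelian_consistent(child, mother, father):
    """Check one autosomal site in a trio.

    Each genotype is a 2-tuple of alleles, e.g. ("A", "G"). The child is
    consistent if it can be formed from one maternal and one paternal allele.
    """
    target = tuple(sorted(child))
    return any(
        tuple(sorted((m, f))) == target for m, f in product(mother, father)
    )
```

For example, `ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")])` counts two transitions and one transversion, and a child genotype of `("G", "G")` is flagged as inconsistent when the mother is `("A", "A")`. In practice these statistics would be computed from VCF files with an established tool rather than hand-rolled code.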

CONCLUSION

Both pipelines, GATK multi-sample calling and Illumina CASAVA single-sample calling, perform very similarly in SNP calling at the level of putatively causative variants.
