Performance comparison of SNP detection tools with Illumina exome sequencing data: an assessment using both family pedigree information and sample-matched SNP array data.

Authors

Yi Ming, Zhao Yongmei, Jia Li, He Mei, Kebebew Electron, Stephens Robert M

Affiliations

Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA. Current address: Cancer Research and Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., PO Box B, Frederick, MD 21702, USA.

Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA.

Publication information

Nucleic Acids Res. 2014 Jul;42(12):e101. doi: 10.1093/nar/gku392. Epub 2014 May 15.

Abstract

To apply exome-seq-derived variants in the clinical setting, there is an urgent need to identify the best variant caller(s) from a large collection of available options. We used an Illumina exome-seq dataset as a benchmark with two validation scenarios, family pedigree information and SNP array data for the same samples, which together permit global high-throughput cross-validation. Using a set of designated quality metrics, we evaluated the quality of SNP calls derived from several popular variant discovery tools from both the open-source and commercial communities. To the best of our knowledge, this is the first large-scale performance comparison of exome-seq variant discovery tools to use high-throughput validation with both Mendelian inheritance checking and SNP array data. This validation gives insight into the accuracy of SNP calling, whereas previously reported comparison studies assessed only the concordance among these tools without directly assessing the quality of the derived SNPs. More importantly, the main purpose of our study was to establish a reusable procedure that applies high-throughput validation to compare the quality of SNP discovery tools, with a focus on exome-seq, and that can be used to compare any forthcoming tool(s) of interest.
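The two validation ideas named in the abstract, Mendelian inheritance checking and comparison against sample-matched SNP array genotypes, can be illustrated with a minimal Python sketch. This is not the authors' procedure, and the abstract does not define their quality metrics. The function names, the tuple-of-alleles genotype representation, and the restriction to diploid autosomal sites are all assumptions made for illustration.

```python
from itertools import product


def is_mendelian_consistent(child, father, mother):
    """True if the child's genotype can be built from one paternal and one
    maternal allele. Genotypes are 2-tuples of allele strings, e.g. ("A", "G").
    Assumes diploid autosomal sites (hypothetical simplification)."""
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((f, m))) == child_sorted
        for f, m in product(father, mother)
    )


def mendelian_error_rate(trio_calls):
    """Fraction of fully genotyped trio sites that violate Mendelian inheritance.
    trio_calls is an iterable of (child, father, mother) genotypes; sites where
    any genotype is None (no call) are skipped."""
    checked = 0
    errors = 0
    for child, father, mother in trio_calls:
        if child is None or father is None or mother is None:
            continue
        checked += 1
        if not is_mendelian_consistent(child, father, mother):
            errors += 1
    return errors / checked if checked else 0.0


def array_concordance(seq_calls, array_calls):
    """Fraction of sites genotyped on both platforms where the unordered
    genotypes agree. Both arguments map a site identifier to a genotype tuple."""
    shared = [site for site in seq_calls if site in array_calls]
    if not shared:
        return 0.0
    matches = sum(
        sorted(seq_calls[site]) == sorted(array_calls[site]) for site in shared
    )
    return matches / len(shared)


if __name__ == "__main__":
    trios = [
        (("A", "G"), ("A", "A"), ("G", "G")),  # consistent
        (("G", "G"), ("A", "A"), ("A", "G")),  # violation: father has no G
        (None, ("A", "A"), ("A", "A")),        # skipped: child not called
    ]
    print(mendelian_error_rate(trios))

    seq = {"site1": ("A", "G"), "site2": ("C", "C"), "site3": ("T", "T")}
    arr = {"site1": ("G", "A"), "site2": ("C", "T")}
    print(array_concordance(seq, arr))
```

A lower Mendelian error rate and a higher array concordance would both point toward more accurate calls from a given tool, which is the kind of comparison the abstract describes. A real pipeline would also need to handle strand orientation, multi-allelic sites, and sex chromosomes.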

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910c/4081058/300afa708c2e/gku392fig1.jpg
