Empirical evaluation of variant calling accuracy using ultra-deep whole-genome sequencing data.

Affiliations

Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, 565-0871, Japan.

Department of Otorhinolaryngology - Head and Neck Surgery, Osaka University Graduate School of Medicine, Osaka, 565-0871, Japan.

Publication information

Sci Rep. 2019 Feb 11;9(1):1784. doi: 10.1038/s41598-018-38346-0.

Abstract

In the design of whole-genome sequencing (WGS) studies, sequencing depth is a crucial parameter to define variant calling accuracy and study cost, with no standard recommendations having been established. We empirically evaluated the variant calling accuracy of the WGS pipeline using ultra-deep WGS data (approximately 410×). We randomly sampled sequence reads and constructed a series of simulation WGS datasets with a variety of gradual depths (n = 54; from 0.05× to 410×). Next, we evaluated the genotype concordances of the WGS data with those in the SNP microarray data or the WGS data using all the sequence reads. In addition, we assessed the accuracy of HLA allele genotyping using the WGS data with multiple software tools (PHLAT, HLA-VBseq, HLA-HD, and SNP2HLA). The WGS data with higher depths showed higher concordance rates, and >13.7× depth achieved as high as >99% of concordance. Comparisons with the WGS data using all the sequence reads showed that SNVs achieved >95% of concordance at 17.6× depth, whereas indels showed only 60% concordance. For the accuracy of HLA allele genotyping using the WGS data, 13.7× depth showed sufficient accuracy while performance heterogeneity among the software tools was observed (the highest concordance of 96.9% was observed with HLA-HD). Improvement in HLA genotyping accuracy by further increasing the depths was limited. These results suggest a medium degree of the WGS depth setting (approximately 15×) to achieve both accurate SNV calling and cost-effectiveness, whereas relatively higher depths are required for accurate indel calling.
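The two core steps described in the abstract, random read sampling to a target depth and genotype concordance against a reference call set, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the per-read Bernoulli sampling scheme, and the representation of genotypes as allele tuples are all assumptions made for illustration. Only the approximate full depth (410×) comes from the abstract.

```python
import random

FULL_DEPTH = 410.0  # approximate depth of the full dataset, per the abstract


def sampling_fraction(target_depth, full_depth=FULL_DEPTH):
    """Fraction of reads to keep so the expected depth equals target_depth."""
    if not 0 < target_depth <= full_depth:
        raise ValueError("target_depth must be in (0, full_depth]")
    return target_depth / full_depth


def downsample(reads, fraction, seed=0):
    """Keep each read independently with probability `fraction` (assumed scheme)."""
    rng = random.Random(seed)
    return [read for read in reads if rng.random() < fraction]


def concordance(calls, truth):
    """Share of sites genotyped in both sets whose unphased genotypes agree.

    Both arguments map a site key to a tuple of alleles, e.g. (0, 1).
    Allele order is ignored, so (0, 1) and (1, 0) count as the same genotype.
    """
    shared = [site for site in calls if site in truth]
    if not shared:
        return float("nan")
    matches = sum(
        tuple(sorted(calls[site])) == tuple(sorted(truth[site]))
        for site in shared
    )
    return matches / len(shared)
```

In this sketch, `sampling_fraction(13.7)` gives the share of reads to retain for an expected 13.7× dataset, and `concordance` would then be applied to the genotypes called from the downsampled reads versus those from the SNP microarray or the full-depth data. Sites missing from either call set are simply skipped here; a real evaluation would need an explicit policy for no-calls.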

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a342/6370902/7319a238aaa8/41598_2018_38346_Fig1_HTML.jpg
