Optimized filtering reduces the error rate in detecting genomic variants by short-read sequencing.

Affiliation

Vesalius Research Center, Vlaams Instituut voor Biotechnologie (VIB), Leuven, Belgium.

Publication information

Nat Biotechnol. 2011 Dec 18;30(1):61-8. doi: 10.1038/nbt.2053.

Abstract

Distinguishing single-nucleotide variants (SNVs) from errors in whole-genome sequences remains challenging. Here we describe a set of filters, together with a freely accessible software tool, that selectively reduce error rates and thereby facilitate variant detection in data from two short-read sequencing technologies, Complete Genomics and Illumina. By sequencing the nearly identical genomes from monozygotic twins and considering shared SNVs as 'true variants' and discordant SNVs as 'errors', we optimized thresholds for 12 individual filters and assessed which of the 1,048 filter combinations were effective in terms of sensitivity and specificity. Cumulative application of all effective filters reduced the error rate by 290-fold, facilitating the identification of genetic differences between monozygotic twins. We also applied an adapted, less stringent set of filters to reliably identify somatic mutations in a highly rearranged tumor and to identify variants in the NA19240 HapMap genome relative to a reference set of SNVs.
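The twin-based evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' software tool: the `SNV` record, the `depth` field, the `min_depth` threshold, and the helper names are all hypothetical, and a single read-depth filter stands in for the 12 filters whose definitions are not given here. SNVs called in both twins are labeled as true variants and SNVs called in only one twin as errors; a filter is then scored by the fraction of true calls it retains and the fraction of error calls it removes.

```python
from typing import Callable, Iterable, NamedTuple


class SNV(NamedTuple):
    chrom: str
    pos: int
    alt: str
    depth: int  # hypothetical per-call read depth, used only by the example filter


def key(snv: SNV) -> tuple:
    """Identify a variant by position and alternate allele."""
    return (snv.chrom, snv.pos, snv.alt)


def split_calls(twin_a: Iterable[SNV], twin_b: Iterable[SNV]):
    """Return (shared, discordant) variant keys for a twin pair."""
    keys_a = {key(s) for s in twin_a}
    keys_b = {key(s) for s in twin_b}
    return keys_a & keys_b, keys_a ^ keys_b


def evaluate_filter(twin_a, twin_b, keep: Callable[[SNV], bool]):
    """Score a filter: fraction of 'true' calls kept, fraction of 'error' calls removed."""
    shared, discordant = split_calls(twin_a, twin_b)
    calls = list(twin_a) + list(twin_b)
    true_calls = [s for s in calls if key(s) in shared]
    error_calls = [s for s in calls if key(s) in discordant]
    true_kept = sum(1 for s in true_calls if keep(s))
    errors_kept = sum(1 for s in error_calls if keep(s))
    sensitivity = true_kept / len(true_calls) if true_calls else 0.0
    errors_removed = 1 - errors_kept / len(error_calls) if error_calls else 0.0
    return sensitivity, errors_removed


# Toy data, invented for illustration only.
twin_a = [SNV("chr1", 100, "A", 30), SNV("chr1", 200, "T", 25), SNV("chr2", 50, "G", 4)]
twin_b = [SNV("chr1", 100, "A", 28), SNV("chr1", 200, "T", 8), SNV("chr2", 75, "C", 3)]

min_depth = 10  # hypothetical threshold, not a value from the paper
sensitivity, errors_removed = evaluate_filter(twin_a, twin_b, lambda s: s.depth >= min_depth)
```

In this sketch, optimizing a threshold would amount to sweeping `min_depth` and choosing the value that removes the most error calls for an acceptable loss of true calls; how the authors actually weighed the two is not stated in the abstract.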
