A robust benchmark for detection of germline large deletions and insertions.

Affiliations

Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA.

National Human Genome Research Institute, National Institutes of Health, Rockville, MD, USA.

Publication information

Nat Biotechnol. 2020 Nov;38(11):1347-1355. doi: 10.1038/s41587-020-0538-8. Epub 2020 Jun 15.

Abstract

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution and comprehensiveness. To help translate these methods to routine research and clinical practice, we developed a sequence-resolved benchmark set for identification of both false-negative and false-positive germline large insertions and deletions. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle Consortium integrated 19 sequence-resolved variant calling methods from diverse technologies. The final benchmark set contains 12,745 isolated, sequence-resolved insertion (7,281) and deletion (5,464) calls ≥50 base pairs (bp). The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.51 Gbp and 5,262 insertions and 4,095 deletions supported by ≥1 diploid assembly. We demonstrate that the benchmark set reliably identifies false negatives and false positives in high-quality SV callsets from short-, linked- and long-read sequencing and optical mapping.
