Integrative analysis of structural variations using short-reads and linked-reads yields highly specific and sensitive predictions.

Affiliations

TRON-Translational Oncology at the University Medical Center of Johannes Gutenberg University Mainz gGmbH, Mainz, Germany.

University Medical Center of the Johannes Gutenberg University, Mainz, Germany.

Publication information

PLoS Comput Biol. 2020 Nov 23;16(11):e1008397. doi: 10.1371/journal.pcbi.1008397. eCollection 2020 Nov.

Abstract

Genetic diseases are driven by aberrations of the human genome. Identifying such aberrations, including structural variations (SVs), is key to understanding these diseases. Conventional short-read whole genome sequencing (cWGS) can identify SVs at base-pair resolution, but it uses only short-range information and suffers from a high false discovery rate (FDR). Linked-read sequencing (10XWGS) captures long-range information by linking short reads that originate from the same large DNA molecule. This can mitigate alignment-based artefacts, especially in repetitive regions, and should enable better prediction of SVs. However, an unbiased evaluation of this technology is not available. In this study, we performed a comprehensive analysis of different types and sizes of SVs predicted by both technologies and validated them with an independent PCR-based approach. SVs identified by both technologies were highly specific, while the validation rate dropped for events not shared between them. A particularly high FDR was observed for SVs found only by 10XWGS. To reduce the FDR and improve sensitivity, statistical models were trained for both technologies. Using our approach, we characterized SVs from the MCF7 cell line and a primary breast cancer tumor with high precision. This approach improves SV prediction and can therefore help in understanding the underlying genetics of various diseases.
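The comparison described in the abstract (calls shared by both technologies versus calls specific to one, each group scored by validation) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `SVCall` record, the helper names, and the 100 bp breakpoint tolerance are all hypothetical choices made for illustration, and the FDR here is simply the fraction of tested calls that failed validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SVCall:
    """Hypothetical minimal SV record (not a format used by the paper)."""
    chrom: str
    start: int
    end: int
    sv_type: str  # e.g. "DEL", "DUP", "INV"


def same_event(a: SVCall, b: SVCall, tol: int = 100) -> bool:
    """Treat two calls as the same event if type and chromosome match and
    both breakpoints lie within `tol` bp (an assumed tolerance)."""
    return (
        a.chrom == b.chrom
        and a.sv_type == b.sv_type
        and abs(a.start - b.start) <= tol
        and abs(a.end - b.end) <= tol
    )


def partition_calls(short_read_calls, linked_read_calls, tol=100):
    """Split calls into shared, short-read-only, and linked-read-only groups."""
    shared, short_only = [], []
    for a in short_read_calls:
        matched = any(same_event(a, b, tol) for b in linked_read_calls)
        (shared if matched else short_only).append(a)
    linked_only = [
        b for b in linked_read_calls
        if not any(same_event(a, b, tol) for a in short_read_calls)
    ]
    return shared, short_only, linked_only


def false_discovery_rate(validated):
    """FDR = failed validations / all tested calls.
    `validated` is a list of booleans (True = confirmed, e.g. by PCR)."""
    if not validated:
        raise ValueError("no validated calls to score")
    return validated.count(False) / len(validated)
```

With two toy call sets, `partition_calls` returns the three groups, and `false_discovery_rate` can then be applied to the validation outcomes of each group separately to compare shared and technology-specific calls.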

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1978/7721175/44840b1ad2cb/pcbi.1008397.g001.jpg
