Algorithmic improvements for discovery of germline copy number variants in next-generation sequencing data.

Affiliation

ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT, USA.

Publication information

BMC Bioinformatics. 2022 Jul 19;23(1):285. doi: 10.1186/s12859-022-04820-w.

Abstract

BACKGROUND

Copy number variants (CNVs) play a significant role in human heredity and disease. However, sensitive and specific characterization of germline CNVs from NGS data has remained challenging, particularly for hybridization-capture data in which read counts are the primary source of copy number information.

RESULTS

We describe two algorithmic adaptations that improve CNV detection accuracy in a Hidden Markov Model (HMM) context. First, we present a method for computing target- and copy number-specific emission distributions. Second, we demonstrate that the Pointwise Maximum a posteriori (PMAP) HMM decoding procedure yields improved sensitivity for small CNV calls compared to the more common Viterbi HMM decoder. We develop a prototype implementation, called Cobalt, and compare it to other CNV detection tools using sets of simulated and previously detected CNVs with sizes spanning a single exon to a full chromosome.
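The difference between the two decoders mentioned above can be illustrated with a toy example. The following is a minimal sketch, not Cobalt's implementation: it assumes three hidden states (copy number 1, 2, 3), a single shared Gaussian emission per state on normalized read depth (a simplification; the paper's emissions are target- and copy number-specific), and made-up transition probabilities. All function names and parameter values are hypothetical. Viterbi returns the single most probable state path, while pointwise MAP picks the most probable state at each target independently from the forward-backward posteriors.

```python
import math

def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def emission_matrix(obs, means, sd):
    # emis[t][s] = density of observation t under state s (assumed Gaussian)
    return [[gaussian_pdf(x, m, sd) for m in means] for x in obs]

def viterbi(init, trans, emis):
    # Most probable joint state path, computed in log space.
    k = len(init)
    score = [math.log(init[s]) + math.log(emis[0][s]) for s in range(k)]
    back = []
    for t in range(1, len(emis)):
        new_score, pointers = [], []
        for s in range(k):
            best = max(range(k), key=lambda p: score[p] + math.log(trans[p][s]))
            pointers.append(best)
            new_score.append(score[best] + math.log(trans[best][s])
                             + math.log(emis[t][s]))
        score = new_score
        back.append(pointers)
    state = max(range(k), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

def posteriors(init, trans, emis):
    # Forward-backward with per-step rescaling; returns P(state at t | all obs).
    n, k = len(emis), len(init)
    fwd, prev = [], None
    for t in range(n):
        if t == 0:
            row = [init[s] * emis[0][s] for s in range(k)]
        else:
            row = [emis[t][s] * sum(prev[p] * trans[p][s] for p in range(k))
                   for s in range(k)]
        total = sum(row)
        prev = [v / total for v in row]
        fwd.append(prev)
    bwd = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        row = [sum(trans[s][q] * emis[t + 1][q] * bwd[t + 1][q] for q in range(k))
               for s in range(k)]
        total = sum(row)
        bwd[t] = [v / total for v in row]
    post = []
    for t in range(n):
        row = [fwd[t][s] * bwd[t][s] for s in range(k)]
        total = sum(row)
        post.append([v / total for v in row])
    return post

def pmap(init, trans, emis):
    # Pointwise MAP: most probable state at each position, taken independently.
    post = posteriors(init, trans, emis)
    return [max(range(len(init)), key=lambda s: row[s]) for row in post]

# Hypothetical parameters: state index 0, 1, 2 = copy number 1, 2, 3.
MEANS = [0.5, 1.0, 1.5]          # expected normalized depth per state (assumed)
INIT = [0.01, 0.98, 0.01]        # assumed prior favoring the diploid state
TRANS = [[0.98, 0.01, 0.01],
         [0.01, 0.98, 0.01],
         [0.01, 0.01, 0.98]]     # assumed "sticky" transitions

if __name__ == "__main__":
    depths = [1.02, 0.97, 0.71, 1.01, 0.99]   # made-up depths, one shallow dip
    emis = emission_matrix(depths, MEANS, 0.15)
    print("Viterbi:", viterbi(INIT, TRANS, emis))
    print("PMAP:   ", pmap(INIT, TRANS, emis))
```

On unambiguous signals both decoders return the same path. They can diverge where the evidence for a short event is weak, because PMAP sums probability over every path passing through a state at a given target rather than committing to one best path. Whether they diverge on a particular input depends on the parameters chosen here.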

CONCLUSIONS

In both the simulation and previously detected CNV studies, Cobalt shows similar sensitivity to other callers but significantly fewer false positive detections. Overall sensitivity is 80-90% for deletion CNVs spanning 1-4 targets and 90-100% for larger deletion events, while sensitivity is somewhat lower for small duplication CNVs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6545/9297596/5c4d535686a5/12859_2022_4820_Fig1_HTML.jpg
