Algorithmic improvements for discovery of germline copy number variants in next-generation sequencing data.

Affiliation

ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT, USA.

Publication information

BMC Bioinformatics. 2022 Jul 19;23(1):285. doi: 10.1186/s12859-022-04820-w.

DOI:10.1186/s12859-022-04820-w

PMID:35854218

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9297596/

Abstract

BACKGROUND

Copy number variants (CNVs) play a significant role in human heredity and disease. However, sensitive and specific characterization of germline CNVs from NGS data has remained challenging, particularly for hybridization-capture data in which read counts are the primary source of copy number information.

RESULTS

We describe two algorithmic adaptations that improve CNV detection accuracy in a Hidden Markov Model (HMM) context. First, we present a method for computing target- and copy number-specific emission distributions. Second, we demonstrate that the Pointwise Maximum a posteriori (PMAP) HMM decoding procedure yields improved sensitivity for small CNV calls compared to the more common Viterbi HMM decoder. We develop a prototype implementation, called Cobalt, and compare it to other CNV detection tools using sets of simulated and previously detected CNVs with sizes spanning a single exon to a full chromosome.
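The contrast between the two decoders can be illustrated with a minimal sketch. This is not Cobalt's implementation: the three-state copy number space, the Gaussian emission model with a per-target baseline and standard deviation, the `p_stay` transition probability, and all function names are assumptions made for illustration. Viterbi returns the single most probable state path, while pointwise MAP (PMAP) picks, at each target, the state with the highest posterior marginal from the forward-backward algorithm.

```python
import math

COPY_NUMBERS = [1, 2, 3]  # toy state space: deletion, diploid, duplication


def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def emissions(depths, baselines, sds):
    """Per-target likelihood of each normalized depth under each copy number."""
    # Assumption: expected depth at a target scales as baseline * cn / 2.
    # The 1e-300 floor keeps the logarithms in viterbi_decode finite.
    return [[max(gaussian_pdf(x, b * cn / 2.0, s), 1e-300) for cn in COPY_NUMBERS]
            for x, b, s in zip(depths, baselines, sds)]


def transitions(p_stay=0.98):
    """Transition matrix that favors staying in the current copy number."""
    n = len(COPY_NUMBERS)
    p_move = (1.0 - p_stay) / (n - 1)
    return [[p_stay if i == j else p_move for j in range(n)] for i in range(n)]


def viterbi_decode(emis, trans, init):
    """Single most probable state path, computed in log space."""
    n = len(init)
    score = [math.log(init[s]) + math.log(emis[0][s]) for s in range(n)]
    backptrs = []
    for row in emis[1:]:
        ptr = [max(range(n), key=lambda i: score[i] + math.log(trans[i][j]))
               for j in range(n)]
        score = [score[ptr[j]] + math.log(trans[ptr[j]][j]) + math.log(row[j])
                 for j in range(n)]
        backptrs.append(ptr)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptrs):
        state = ptr[state]
        path.append(state)
    return [COPY_NUMBERS[s] for s in reversed(path)]


def posterior_marginals(emis, trans, init):
    """Scaled forward-backward: P(state at target t | all depths)."""
    n, T = len(init), len(emis)
    fwd, scales, prev = [], [], None
    for t in range(T):
        if t == 0:
            row = [init[s] * emis[0][s] for s in range(n)]
        else:
            row = [emis[t][j] * sum(prev[i] * trans[i][j] for i in range(n))
                   for j in range(n)]
        c = sum(row)
        prev = [v / c for v in row]
        fwd.append(prev)
        scales.append(c)
    bwd = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        bwd[t] = [sum(trans[i][j] * emis[t + 1][j] * bwd[t + 1][j] for j in range(n))
                  / scales[t + 1] for i in range(n)]
    post = []
    for f, b in zip(fwd, bwd):
        row = [fi * bi for fi, bi in zip(f, b)]
        z = sum(row)
        post.append([v / z for v in row])
    return post


def pmap_decode(emis, trans, init):
    """Pointwise MAP: the most probable state at each target independently."""
    post = posterior_marginals(emis, trans, init)
    return [COPY_NUMBERS[max(range(len(row)), key=row.__getitem__)] for row in post]


if __name__ == "__main__":
    # Hypothetical normalized depths with one low-coverage target in the middle.
    depths = [1.02, 0.97, 1.01, 0.99, 0.60, 1.03, 0.98, 1.00]
    emis = emissions(depths, [1.0] * len(depths), [0.15] * len(depths))
    trans, init = transitions(), [0.01, 0.98, 0.01]
    print("Viterbi:", viterbi_decode(emis, trans, init))
    print("PMAP:   ", pmap_decode(emis, trans, init))
```

Because PMAP scores each target by its marginal posterior rather than by membership in one globally best path, the two decoders can disagree on short, weakly supported events. Whether they disagree on any given input depends on the emission and transition parameters chosen.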

CONCLUSIONS

In both the simulation and previously detected CNV studies, Cobalt shows similar sensitivity to other callers but significantly fewer false positive detections. Overall sensitivity is 80-90% for deletion CNVs spanning 1-4 targets and 90-100% for larger deletion events, while sensitivity is somewhat lower for small duplication CNVs.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6545/9297596/5c4d535686a5/12859_2022_4820_Fig1_HTML.jpg
