Shall genomic correlation structure be considered in copy number variants detection?

Affiliations

Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina (USC), Discovery 449, 915 Greene St, Columbia, SC 29208, USA.

Department of Epidemiology and Biostatistics, Arnold School of Public Health, USC, Discovery 449, 915 Greene St, Columbia, SC 29208, USA.

Publication information

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab215.

DOI:10.1093/bib/bbab215

PMID:34114005

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8768456/

Abstract

Copy number variation has been identified as a major source of genomic variation associated with disease susceptibility. With the advent of whole-exome sequencing (WES) technology, massive WES data have been generated, allowing for the identification of copy number variants (CNVs) in the protein-coding regions with direct functional interpretation. We have previously shown evidence of the genomic correlation structure in array data and developed a novel chromosomal breakpoint detection algorithm, LDcnv, which showed significantly improved detection power through integrating the correlation structure in a systematic modeling manner. However, it remains unexplored whether the genomic correlation exists in WES data and how such correlation structure integration can improve the CNV detection accuracy. In this study, we first explored the correlation structure of the WES data using the 1000 Genomes Project data. Both real raw read depth and median-normalized data showed strong evidence of the correlation structure. Motivated by this fact, we proposed a correlation-based method, CORRseq, as a novel release of the LDcnv algorithm in profiling WES data. The performance of CORRseq was evaluated in extensive simulation studies and real data analysis from the 1000 Genomes Project. CORRseq outperformed the existing methods in detecting medium and large CNVs. In conclusion, it would be more advantageous to model genomic correlation structure in detecting relatively long CNVs. This study provides great insights for methodology development of CNV detection with NGS data.
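The abstract does not say how the correlation structure was measured. As a rough illustration only, the sketch below median-normalizes a read-depth matrix (samples by exon targets) and computes the Pearson correlation, across samples, between each target and its downstream neighbor. The function names, the toy depths, and the choice of a lag-based Pearson correlation are assumptions made for this example; they are not taken from CORRseq or LDcnv.

```python
from statistics import mean, median


def median_normalize(depths):
    """Divide each sample's read depths by that sample's median depth."""
    normalized = []
    for sample in depths:
        m = median(sample)
        normalized.append([d / m for d in sample])
    return normalized


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def lag_correlations(depths, lag=1):
    """Correlate each target with the target `lag` positions downstream.

    `depths` is a list of samples, each a list of per-target read depths.
    The correlation for a pair of targets is taken across samples.
    """
    n_targets = len(depths[0])
    columns = [[sample[j] for sample in depths] for j in range(n_targets)]
    return [pearson(columns[j], columns[j + lag]) for j in range(n_targets - lag)]


# Hypothetical toy data: 4 samples x 4 exon targets (not real 1000 Genomes depths).
toy_depths = [
    [100, 110, 95, 205],
    [200, 190, 210, 80],
    [150, 160, 140, 150],
    [50, 60, 45, 120],
]

normalized = median_normalize(toy_depths)
raw_corr = lag_correlations(toy_depths, lag=1)
norm_corr = lag_correlations(normalized, lag=1)
print("raw lag-1 correlations:", raw_corr)
print("normalized lag-1 correlations:", norm_corr)
```

Comparing the raw and normalized outputs mirrors the abstract's point that both raw and median-normalized depths were examined; on real data, the matrix would come from per-target coverage rather than hand-typed values.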


Similar articles

Shall genomic correlation structure be considered in copy number variants detection?

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab215.

Integrating genomic correlation structure improves copy number variations detection.

Bioinformatics. 2021 Apr 20;37(3):312-317. doi: 10.1093/bioinformatics/btaa737.

Detection of copy number variations in epilepsy using exome data.

Clin Genet. 2018 Mar;93(3):577-587. doi: 10.1111/cge.13144. Epub 2018 Jan 25.

An evaluation of copy number variation detection tools for cancer using whole exome sequencing data.

BMC Bioinformatics. 2017 May 31;18(1):286. doi: 10.1186/s12859-017-1705-x.

Detection of clinically relevant copy number variants with whole-exome sequencing.

Hum Mutat. 2013 Oct;34(10):1439-48. doi: 10.1002/humu.22387. Epub 2013 Aug 30.

An evaluation of copy number variation detection tools from whole-exome sequencing data.

Hum Mutat. 2014 Jul;35(7):899-907. doi: 10.1002/humu.22537. Epub 2014 May 1.

Enhanced copy number variants detection from whole-exome sequencing data using EXCAVATOR2.

Nucleic Acids Res. 2016 Nov 16;44(20):e154. doi: 10.1093/nar/gkw695. Epub 2016 Aug 9.

Noise cancellation using total variation for copy number variation detection.

BMC Bioinformatics. 2018 Oct 22;19(Suppl 11):361. doi: 10.1186/s12859-018-2332-x.

An integrated approach for copy number variation discovery in parent-offspring trios.

Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab230.

Increasing the diagnostic yield of exome sequencing by copy number variant analysis.

PLoS One. 2018 Dec 17;13(12):e0209185. doi: 10.1371/journal.pone.0209185. eCollection 2018.

References cited in this article

Integrating genomic correlation structure improves copy number variations detection.

Bioinformatics. 2021 Apr 20;37(3):312-317. doi: 10.1093/bioinformatics/btaa737.

CONY: A Bayesian procedure for detecting copy number variations from sequencing read depths.

Sci Rep. 2020 Jun 26;10(1):10493. doi: 10.1038/s41598-020-64353-1.

SCOPE: A Normalization and Copy-Number Estimation Method for Single-Cell DNA Sequencing.

Cell Syst. 2020 May 20;10(5):445-452.e6. doi: 10.1016/j.cels.2020.03.005.

An accurate and powerful method for copy number variation detection.

Bioinformatics. 2019 Sep 1;35(17):2891-2898. doi: 10.1093/bioinformatics/bty1041.

Neurodevelopmental disease genes implicated by de novo mutation and copy number variation morbidity.

Nat Genet. 2019 Jan;51(1):106-116. doi: 10.1038/s41588-018-0288-4. Epub 2018 Dec 17.

CODEX2: full-spectrum copy number variation detection by high-throughput DNA sequencing.

Genome Biol. 2018 Nov 26;19(1):202. doi: 10.1186/s13059-018-1578-y.

PennCNV in whole-genome sequencing data.

BMC Bioinformatics. 2017 Oct 3;18(Suppl 11):383. doi: 10.1186/s12859-017-1802-x.

An evaluation of copy number variation detection tools for cancer using whole exome sequencing data.

BMC Bioinformatics. 2017 May 31;18(1):286. doi: 10.1186/s12859-017-1705-x.

modSaRa: a computationally efficient R package for CNV identification.

Bioinformatics. 2017 Aug 1;33(15):2384-2385. doi: 10.1093/bioinformatics/btx212.

The screening and ranking algorithm for change-points detection in multiple samples.

Ann Appl Stat. 2016 Dec;10(4):2102-2129. doi: 10.1214/16-AOAS966. Epub 2017 Jan 5.
