A novel signal processing approach for the detection of copy number variations in the human genome.

Affiliation

Department of Radiology, Harvard School of Public Health, Boston, MA 02115, USA.

Publication information

Bioinformatics. 2011 Sep 1;27(17):2338-45. doi: 10.1093/bioinformatics/btr402. Epub 2011 Jul 12.

Abstract

MOTIVATION

Human genomic variability occurs at different scales, from single nucleotide polymorphisms (SNPs) to large DNA segments. Copy number variations (CNVs) represent a significant part of our genetic heterogeneity and have also been associated with many diseases and disorders. Short, localized CNVs, which may play an important role in human disease, may be undetectable in noisy genomic data. Therefore, robust methodologies are needed for their detection. Furthermore, for meaningful identification of pathological CNVs, estimation of normal allelic aberrations is necessary.

RESULTS

We developed a signal processing-based methodology for sequence denoising followed by pattern matching, to increase the signal-to-noise ratio (SNR) in genomic data and improve CNV detection. We applied this signal-decomposition-matched filtering (SDMF) methodology to 429 normal genomic sequences, and compared detected CNVs to those in the Database of Genomic Variants. SDMF successfully detected a significant number of previously identified CNVs with frequencies of occurrence ≥10%, as well as unreported short CNVs. Its performance was also compared to circular binary segmentation (CBS) through simulations. SDMF had a significantly lower false detection rate and was significantly faster than CBS, an important advantage for handling large datasets generated with high-resolution arrays. By focusing on improving SNR (instead of the robustness of the detection algorithm), SDMF is a very promising methodology for identifying CNVs at all genomic spatial scales.
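The general idea of denoising followed by matched filtering can be illustrated with a minimal sketch. This is not the authors' SDMF implementation: a moving average stands in for the signal-decomposition step (the paper points to Empirical Mode Decomposition), the template is assumed to be a rectangular pulse of known length, and all function names and parameter values are hypothetical.

```python
import random


def moving_average(signal, width):
    """Smooth with a centered moving average.

    Assumption: a simple stand-in for the decomposition-based denoising step.
    """
    half = width // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        window = signal[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def matched_filter(signal, template):
    """Correlate the template with the signal at every valid offset."""
    m = len(template)
    return [
        sum(signal[i + j] * template[j] for j in range(m))
        for i in range(len(signal) - m + 1)
    ]


def simulate_log_ratios(n, start, length, amplitude, noise_sd, seed=0):
    """Simulated sequence: Gaussian noise plus one rectangular gain segment."""
    rng = random.Random(seed)
    return [
        (amplitude if start <= i < start + length else 0.0)
        + rng.gauss(0.0, noise_sd)
        for i in range(n)
    ]


def detect_segment(signal, length, smooth_width=5):
    """Denoise, then return the offset and score of the best template match."""
    denoised = moving_average(signal, smooth_width)
    template = [1.0 / length] * length  # normalized rectangular pulse
    scores = matched_filter(denoised, template)
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores[best]


if __name__ == "__main__":
    data = simulate_log_ratios(
        n=500, start=200, length=20, amplitude=1.0, noise_sd=0.5
    )
    start, score = detect_segment(data, length=20)
    print(start, round(score, 3))
```

In this sketch the detection step is a plain argmax over the correlation scores. Raising the SNR before matching is what makes such a simple detector usable, which mirrors the emphasis stated in the abstract.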

AVAILABILITY

The data are available at http://tcga-data.nci.nih.gov/tcga/. The software and list of analyzed sequence IDs are available at http://www.hsph.harvard.edu/~betensky/. Matlab code for Empirical Mode Decomposition may be found at http://www.clear.rice.edu/elec301/Projects02/empiricalMode/code.html

CONTACT

caterina@mit.edu.
