CoMeBack: DNA methylation array data analysis for co-methylated regions.

Affiliations

Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC V5T 4S6, Canada.

Department of Finance, Beedie School of Business, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.

Publication information

Bioinformatics. 2020 May 1;36(9):2675-2683. doi: 10.1093/bioinformatics/btaa049.

Abstract

MOTIVATION

High-dimensional DNA methylation (DNAm) array coverage, while sparse in the context of the entire DNA methylome, still constitutes a very large number of CpG probes. The ensuing multiple-test corrections affect the statistical power to detect associations, likely contributing to prevalent limited reproducibility. Array probes measuring proximal CpG sites often have correlated levels of DNAm that may not only be biologically meaningful but also imply statistical dependence and redundancy. New methods that account for such correlations between adjacent probes may enable improved specificity, discovery and interpretation of statistical associations in DNAm array data.
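
The effect of the multiple-test burden on the significance threshold can be illustrated with a Bonferroni correction. This is only a minimal sketch: the choice of Bonferroni, the 0.05 family-wise level, and both test counts are illustrative assumptions, not values taken from the paper.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Hypothetical counts: one test per probe versus one test per region
# after correlated adjacent probes are merged (both numbers are assumed).
n_probe_tests = 450_000
n_region_tests = 300_000

probe_threshold = bonferroni_threshold(0.05, n_probe_tests)
region_threshold = bonferroni_threshold(0.05, n_region_tests)

print(f"per-probe threshold:  {probe_threshold:.3e}")
print(f"per-region threshold: {region_threshold:.3e}")
```

Fewer tests give a less stringent per-test threshold, which is the sense in which grouping correlated probes can recover statistical power.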

RESULTS

We developed a method named Co-Methylation with genomic CpG Background (CoMeBack) that estimates DNA co-methylation, defined as proximal CpG probes with correlated DNAm across individuals. CoMeBack outputs co-methylated regions (CMRs), spanning sets of array probes constructed based on all genomic CpG sites, including those not measured on the array, and without any phenotypic variable inputs. This approach can reduce the multiple-test correction burden, while enhancing the discovery and specificity of statistical associations. We constructed and validated CMRs in whole blood, using publicly available Illumina Infinium 450K array data from over 5000 individuals. These CMRs were enriched for enhancer chromatin states, and binding site motifs for several transcription factors involved in blood physiology. We illustrated how CMR-based epigenome-wide association studies can improve discovery and reduce false positives for associations with chronological age.
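
The general idea of grouping proximal, correlated probes can be sketched as follows. This is not the CoMeBack implementation: it replaces the genomic CpG background with a simple distance cutoff between adjacent probes, and the function name `build_regions`, the parameters `max_gap` and `min_corr`, their default values, and the toy data are all hypothetical.

```python
import numpy as np


def build_regions(positions, betas, max_gap=400, min_corr=0.3):
    """Chain adjacent probes into candidate co-methylated regions.

    positions: sorted 1-D array of probe coordinates on one chromosome.
    betas:     2-D array (individuals x probes) of DNAm values.
    max_gap, min_corr: assumed cutoffs, not taken from the paper.
    Returns a list of regions, each a list of probe column indices.
    """
    regions = []
    current = [0]
    for i in range(1, len(positions)):
        close = positions[i] - positions[i - 1] <= max_gap
        # Pearson correlation across individuals for the adjacent pair;
        # a constant column yields NaN, which fails the comparison below.
        corr = np.corrcoef(betas[:, i - 1], betas[:, i])[0, 1]
        if close and corr >= min_corr:
            current.append(i)
        else:
            if len(current) > 1:  # keep only multi-probe regions
                regions.append(current)
            current = [i]
    if len(current) > 1:
        regions.append(current)
    return regions


# Toy data: 10 individuals, 5 probes (all values are made up).
base = np.linspace(0.1, 0.9, 10)
alternating = 0.5 + 0.2 * np.where(np.arange(10) % 2 == 0, 1.0, -1.0)
betas = np.column_stack([
    base,              # probe 0
    0.5 * base + 0.2,  # probe 1: tracks probe 0
    base ** 2,         # probe 2: tracks probes 0 and 1
    alternating,       # probe 3: nearby but not positively correlated
    base,              # probe 4: correlated but far away
])
positions = np.array([100, 150, 220, 300, 5000])

regions = build_regions(positions, betas)
print(regions)
```

Because no phenotype enters the construction, the same regions can be reused across association analyses, consistent with the abstract's point that CMRs are built without phenotypic variable inputs.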

AVAILABILITY AND IMPLEMENTATION

https://bitbucket.org/flopflip/comeback.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

