Laboratory of Cancer Epigenetics, Faculty of Medicine, Université Libre de Bruxelles (ULB), Brussels, Belgium.
Interuniversity Institute of Bioinformatics in Brussels (IB2), Université Libre de Bruxelles (ULB), Brussels, Belgium.
Epigenetics. 2022 Dec;17(13):2434-2454. doi: 10.1080/15592294.2022.2135201.
Illumina Infinium DNA methylation (5mC) arrays are a popular technology for low-cost, high-throughput, genome-scale measurement of 5mC distribution, especially in cancer and other complex diseases. After the success of its HumanMethylation450 array (450k), Illumina released the MethylationEPIC array (850k), featuring increased coverage of enhancers. Despite the widespread use of 850k, analysis of the corresponding data remains suboptimal: it still relies mostly on Illumina's default annotation, which underestimates enhancers and long noncoding RNAs. We have thus developed an approach, based on the ENCODE and LNCipedia databases, which greatly improves upon Illumina's default annotation of enhancers and long noncoding transcripts. We compared the re-annotated 850k with both 450k and reduced-representation bisulphite sequencing (RRBS), another high-throughput 5mC profiling technology. We found 850k to cover at least three times as many enhancers and long noncoding RNAs as either 450k or RRBS. We further investigated the reproducibility of the three technologies, applying various normalization methods to the 850k data. Most of these methods reduced variability to a level below that of RRBS data. We then used 850k with our new annotation and normalization to profile 5mC changes in breast cancer biopsies. 850k highlighted aberrant enhancer methylation as the predominant feature, in agreement with previous reports. Our study provides an updated processing approach for 850k data, based on refined probe annotation and normalization, allowing for improved analysis of methylation at enhancers and long noncoding RNA genes. Our findings will help to further advance understanding of the DNA methylome in health and disease.
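The abstract does not describe how probes are matched to ENCODE enhancers or LNCipedia transcripts. As a rough illustration only, the Python sketch below shows one generic way to re-annotate probes: assign each probe position to every genomic region that contains it. This is not the authors' pipeline. The function names, probe IDs, and region coordinates are all hypothetical, and the half-open, 0-based interval convention is an assumption.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(regions):
    """Group (chrom, start, end, name) regions by chromosome, sorted by start.

    Intervals are assumed to be half-open: start <= position < end.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, name in regions:
        by_chrom[chrom].append((start, end, name))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [start for start, _, _ in intervals]
        index[chrom] = (starts, intervals)
    return index


def annotate_probes(probes, index):
    """Map each probe ID to the names of all regions containing its position."""
    annotation = {}
    for probe_id, chrom, pos in probes:
        starts, intervals = index.get(chrom, ([], []))
        # Only regions whose start is <= pos can contain the probe.
        upper = bisect_right(starts, pos)
        annotation[probe_id] = [
            name for _, end, name in intervals[:upper] if pos < end
        ]
    return annotation


# Hypothetical enhancer regions and probe positions, for illustration only.
enhancers = [
    ("chr1", 1000, 1500, "enh_A"),
    ("chr1", 1400, 2000, "enh_B"),
    ("chr2", 500, 900, "enh_C"),
]
probes = [
    ("cg_demo_1", "chr1", 1450),  # inside both enh_A and enh_B
    ("cg_demo_2", "chr1", 2000),  # at the excluded end of enh_B
    ("cg_demo_3", "chr2", 600),
    ("cg_demo_4", "chr3", 10),    # chromosome with no regions
]

result = annotate_probes(probes, build_index(enhancers))
for probe_id, hits in result.items():
    print(probe_id, hits)
```

The same function could be run a second time with long noncoding RNA intervals to produce a separate annotation column. For array-scale data, a dedicated interval library would likely be a better fit than this minimal scan.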