Research Institute for Biomedical Aging Research, Universität Innsbruck, 6020, Innsbruck, Austria.
European Translational Oncology Prevention and Screening (EUTOPS) Institute, Universität Innsbruck, Milser Str. 10, 6060, Hall in Tirol, Austria.
Clin Epigenetics. 2024 Sep 18;16(1):131. doi: 10.1186/s13148-024-01739-2.
The Illumina Methylation array platform has facilitated countless epigenetic studies on DNA methylation (DNAme) in health and disease, yet relatively few studies have so far examined its reliability, i.e., the consistency of repeated measures. Here we investigate the reliability of both type I and type II Infinium probes. We propose a method for excluding unreliable probes based on dynamic thresholds for mean intensity (MI) and 'unreliability', the latter estimated by probe-level simulation of the influence of technical noise on methylation β values using the background intensities of negative control probes. We validate our method in several datasets, including newly generated Illumina MethylationEPIC BeadChip v1.0 data from paired whole blood samples taken six weeks apart and technical replicates spanning multiple sample types. Our analysis revealed that probes with low MI in particular exhibit higher β value variability between repeated samples. MI was associated with the number of C bases in the respective probe sequence and correlated negatively with unreliability scores. The unreliability scores were validated in a new EPIC v1.0 dataset (blood and cervix) and a publicly available 450k dataset (blood), where they effectively captured the variability observed in β values between technical replicates. Finally, despite its promise of higher robustness, the newer v2.0 of the MethylationEPIC BeadChip retained a substantial number of probes with poor unreliability scores. To enhance current pre-processing pipelines, we developed an R package to calculate MI and unreliability scores and provide guidance on establishing optimal dynamic score thresholds for a given dataset.
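The idea of scoring a probe by simulating technical noise on its β value can be illustrated with a minimal Python sketch. This is not the authors' R package or their exact algorithm: the function names, the zero-centred resampling of negative control intensities, the number of simulations, and the use of the standard deviation as the spread measure are all assumptions made for illustration. The only fixed element is the conventional Illumina β definition, β = M / (M + U + 100).

```python
import random
import statistics

BETA_OFFSET = 100.0  # conventional Illumina offset in beta = M / (M + U + offset)


def mean_intensity(meth, unmeth):
    """Mean total signal (methylated + unmethylated) of one probe across samples."""
    return statistics.fmean([m + u for m, u in zip(meth, unmeth)])


def unreliability(meth, unmeth, background, n_sim=500, seed=0):
    """Illustrative score: average simulated spread of beta under background noise.

    meth, unmeth : per-sample intensities of a single probe
    background   : intensities of negative control probes (the noise pool)
    """
    rng = random.Random(seed)
    center = statistics.fmean(background)
    spreads = []
    for m, u in zip(meth, unmeth):
        betas = []
        for _ in range(n_sim):
            # Assumption: noise is a resampled background intensity, centred
            # on zero, added independently to both channels and floored at 0.
            m_noisy = max(m + rng.choice(background) - center, 0.0)
            u_noisy = max(u + rng.choice(background) - center, 0.0)
            betas.append(m_noisy / (m_noisy + u_noisy + BETA_OFFSET))
        spreads.append(statistics.pstdev(betas))
    return statistics.fmean(spreads)


if __name__ == "__main__":
    # Synthetic data only: two probes with a similar methylation level
    # but very different signal strength.
    noise_pool = [200.0, 250.0, 300.0, 350.0, 400.0]
    bright = ([8000.0, 8200.0], [2000.0, 1900.0])
    dim = ([400.0, 420.0], [100.0, 90.0])
    for name, (m, u) in (("bright", bright), ("dim", dim)):
        print(name, mean_intensity(m, u), unreliability(m, u, noise_pool))
```

In this toy setting the dim probe has the lower MI and the larger simulated β spread, which mirrors the negative correlation between MI and unreliability reported above. Thresholds for excluding probes would still need to be chosen per dataset, as the abstract notes.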