Leroy Arthur, Teh Ai Ling, Dondelinger Frank, Alvarez Mauricio A, Wang Dennis
Department of Computer Science, The University of Manchester, Manchester, United Kingdom; Department of Computer Science, The University of Sheffield, Sheffield, United Kingdom.
Institute for Human Development and Potential (IHDP), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore; Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore.
EBioMedicine. 2025 May;115:105709. doi: 10.1016/j.ebiom.2025.105709. Epub 2025 Apr 22.
Epigenetic changes in early life play an important role in the development of health conditions in children. Longitudinally measuring and forecasting changes in DNA methylation can reveal patterns of ageing and disease progression, but biosamples may not always be available.
We introduce a probabilistic machine learning framework based on multi-mean Gaussian processes, which accounts for correlations across individuals and across genes over time, to forecast the future methylation status of an individual. Predicted methylation values were used to compute future epigenetic age, which was then compared with chronological age.
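To illustrate the general idea of forecasting methylation with a Gaussian process, here is a minimal sketch. It is not the multi-mean framework described above. It is a plain single-individual GP regression whose prior mean stands in for a trend shared across individuals. The function names, the toy beta values, the linear `shared_mean` trend, and the kernel hyperparameters are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, variance=1.0, lengthscale=2.0):
    """Squared-exponential covariance between two 1-D arrays of ages."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_forecast(t_obs, y_obs, t_new, mean_fn, noise=1e-4, **kernel_args):
    """Posterior mean and variance at ages t_new for a GP with prior mean mean_fn."""
    K = rbf_kernel(t_obs, t_obs, **kernel_args) + noise * np.eye(len(t_obs))
    K_s = rbf_kernel(t_new, t_obs, **kernel_args)
    K_ss = rbf_kernel(t_new, t_new, **kernel_args)
    # Condition on the deviation of this individual from the shared trend.
    resid = y_obs - mean_fn(t_obs)
    post_mean = mean_fn(t_new) + K_s @ np.linalg.solve(K, resid)
    post_cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return post_mean, np.diag(post_cov)

def shared_mean(t):
    """Hypothetical population-level trend for one methylation site."""
    return 0.62 - 0.02 * t

# Toy data (made up): beta values for one site observed at ages 0-4.
ages = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
betas = np.array([0.62, 0.60, 0.57, 0.56, 0.54])
future_ages = np.array([5.0, 6.0, 7.0])

forecast_mean, forecast_var = gp_forecast(
    ages, betas, future_ages, shared_mean,
    noise=1e-4, variance=0.01, lengthscale=2.0,
)
```

In this sketch the forecast variance grows with distance from the last observation, and far from the data the forecast falls back to the shared trend. That behaviour is why a well-estimated mean matters when only a few early timepoints are available.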
We show that this method can simultaneously predict methylation status at multiple genomic sites in children aged 5-7 years using methylation data from earlier ages (0-4 years). For approximately 95% of methylation sites, observed and predicted methylation values differ by less than 10%. Predicted methylation profiles can also be used to estimate other molecular phenotypes, such as epigenetic age, at any timepoint, enabling association tests with health outcomes measured at the same timepoint.
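The two summaries mentioned above can be sketched as follows, under stated assumptions. The 10% threshold is read here as an absolute difference of 0.10 on the 0-1 beta-value scale, which is one possible interpretation. Epigenetic age is represented by a hypothetical linear clock, whereas published clocks use trained coefficients and sometimes a nonlinear age transform. All values and weights below are made up.

```python
import numpy as np

def fraction_within_tolerance(observed, predicted, tol=0.10):
    """Share of sites whose absolute prediction error is below tol."""
    abs_diff = np.abs(np.asarray(observed) - np.asarray(predicted))
    return float(np.mean(abs_diff < tol))

def linear_clock_age(betas, weights, intercept):
    """Hypothetical linear clock: weighted sum of beta values plus an intercept."""
    return float(np.dot(betas, weights) + intercept)

# Toy values (made up) for five methylation sites at one timepoint.
observed = np.array([0.82, 0.15, 0.40, 0.67, 0.91])
predicted = np.array([0.80, 0.18, 0.55, 0.65, 0.90])

share_within_10pct = fraction_within_tolerance(observed, predicted)

# Illustrative clock coefficients, not taken from any published clock.
clock_weights = np.array([2.0, -1.0, 0.5, 3.0, 1.0])
predicted_age = linear_clock_age(predicted, clock_weights, intercept=1.0)
```

Because the clock is applied to predicted rather than measured beta values, the same function can be evaluated at any forecast timepoint, including ages for which no biosample exists.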
Limited longitudinal profiling of DNA methylation, coupled with machine learning, enables forecasting of epigenetic ageing and future health outcomes.
Wellcome Trust, Singapore National Research Foundation (NRF), Singapore National Medical Research Council (NMRC), Agency for Science, Technology and Research (A*STAR), UK Academy of Medical Sciences and the UK Engineering and Physical Sciences Research Council (EPSRC).