Centre for Genomic and Experimental Medicine, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK.
Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA.
Genome Med. 2023 Feb 28;15(1):12. doi: 10.1186/s13073-023-01161-y.
Epigenetic clocks can track both chronological age (cAge) and biological age (bAge). The latter is typically defined by physiological biomarkers and risk of adverse health outcomes, including all-cause mortality. As cohort sample sizes increase, estimates of cAge and bAge become more precise. Here, we aim to develop accurate epigenetic predictors of cAge and bAge, whilst improving our understanding of their epigenomic architecture.
First, we perform large-scale (N = 18,413) epigenome-wide association studies (EWAS) of chronological age and all-cause mortality. Next, to create a cAge predictor, we use methylation data from 24,674 participants from the Generation Scotland study, the Lothian Birth Cohorts (LBC) of 1921 and 1936, and 8 other cohorts with publicly available data. In addition, we train a predictor of time to all-cause mortality as a proxy for bAge using the Generation Scotland cohort (1,214 observed deaths). For this purpose, we use epigenetic surrogates (EpiScores) for 109 plasma proteins and the 8 component parts of GrimAge, one of the current best epigenetic predictors of survival. We test this bAge predictor in four external cohorts (LBC1921, LBC1936, the Framingham Heart Study and the Women's Health Initiative study).
Through the inclusion of linear and non-linear age-CpG associations from the EWAS, feature pre-selection in advance of elastic net regression, and a leave-one-cohort-out (LOCO) cross-validation framework, we obtain cAge prediction with a median absolute error of 2.3 years. Our bAge predictor slightly outperforms GrimAge in the strength of its association with survival (GrimAge: HR = 1.47 [1.40, 1.54] with p = 1.08 × 10; bAge: HR = 1.52 [1.44, 1.59] with p = 2.20 × 10). Finally, we introduce MethylBrowsR, an online tool to visualise epigenome-wide CpG-age associations.
The integration of multiple large datasets, EpiScores, non-linear DNAm effects, and new approaches to feature selection has facilitated improvements to the blood-based epigenetic prediction of biological and chronological age.
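To make the leave-one-cohort-out (LOCO) evaluation and the median absolute error metric concrete, the following is a minimal sketch rather than the authors' pipeline. It assumes a single hypothetical methylation feature per sample and swaps the elastic net for a plain univariate least-squares fit so that it runs on the Python standard library alone. The cohort names, toy values, and the function names `fit_line` and `loco_median_abs_error` are illustrative assumptions.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Assumption: this simple fit stands in for the elastic net
    regression named in the abstract.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loco_median_abs_error(samples):
    """Leave-one-cohort-out evaluation.

    samples: list of (cohort, methylation_value, age) tuples.
    Returns {cohort: median absolute error when that cohort is held out}.
    """
    cohorts = sorted({cohort for cohort, _, _ in samples})
    results = {}
    for held_out in cohorts:
        # Train on every cohort except the held-out one.
        train = [(x, y) for c, x, y in samples if c != held_out]
        test = [(x, y) for c, x, y in samples if c == held_out]
        intercept, slope = fit_line([x for x, _ in train],
                                    [y for _, y in train])
        # Score only on the cohort the model has never seen.
        results[held_out] = median(
            abs(intercept + slope * x - y) for x, y in test
        )
    return results


# Toy data (hypothetical): three cohorts, methylation rising with age.
toy = [
    ("cohort_A", 0.20, 20), ("cohort_A", 0.30, 30), ("cohort_A", 0.40, 40),
    ("cohort_B", 0.50, 50), ("cohort_B", 0.60, 60), ("cohort_B", 0.70, 70),
    ("cohort_C", 0.25, 26), ("cohort_C", 0.45, 44), ("cohort_C", 0.65, 66),
]
errors = loco_median_abs_error(toy)
```

Holding out a whole cohort, rather than random samples, means each error estimate reflects performance on data from a study the model never saw during training, which is the point of a LOCO design.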