Martínez-Enguita David, Hillerton Thomas, Åkesson Julia, Kling Daniel, Lerm Maria, Gustafsson Mika
Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden.
Department of Forensic Genetics and Toxicology, Swedish National Board of Forensic Medicine, Linköping, Sweden.
Front Aging. 2025 Jan 23;5:1526146. doi: 10.3389/fragi.2024.1526146. eCollection 2024.
DNA methylation (DNAm) age clocks are powerful tools for measuring biological age, providing insights into aging risks and outcomes beyond chronological age. While traditional models are effective, their interpretability is limited by their dependence on small and potentially stochastic sets of CpG sites. Here, we propose that the reliability of DNAm age clocks should stem from their capacity to detect comprehensive and targeted aging signatures.
We compiled publicly available DNAm whole-blood samples (n = 17,726) spanning the entire human lifespan (0-112 years). We used a pre-trained network-coherent autoencoder (NCAE) to compress DNAm data into embeddings, on which we trained interpretable neural network epigenetic clocks. We then retrieved their age-specific epigenetic signatures of aging and examined their functional enrichment in age-associated biological processes.
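The two-stage design described above (a frozen, pre-trained encoder followed by a clock trained on its embeddings) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the single tanh layer standing in for the NCAE, the linear age head, the dimensions, and the synthetic data are hypothetical and do not reproduce the authors' architecture or results.

```python
import copy
import math
import random

def encode(betas, weights):
    """Hypothetical frozen encoder: one tanh layer from CpG beta values to an embedding."""
    return [math.tanh(sum(w * b for w, b in zip(row, betas))) for row in weights]

def predict_age(embedding, coefs, bias):
    """Linear age head applied to an embedding (a stand-in for the neural clock)."""
    return bias + sum(c * z for c, z in zip(coefs, embedding))

def mean_squared_error(embeddings, ages, coefs, bias):
    errors = [(predict_age(z, coefs, bias) - a) ** 2 for z, a in zip(embeddings, ages)]
    return sum(errors) / len(errors)

def train_age_head(embeddings, ages, epochs=500, lr=0.05):
    """Fit only the age head by batch gradient descent; the encoder is never updated."""
    dim, n = len(embeddings[0]), len(ages)
    coefs, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_c, grad_b = [0.0] * dim, 0.0
        for z, age in zip(embeddings, ages):
            err = predict_age(z, coefs, bias) - age
            grad_b += 2 * err / n
            for j in range(dim):
                grad_c[j] += 2 * err * z[j] / n
        bias -= lr * grad_b
        coefs = [c - lr * g for c, g in zip(coefs, grad_c)]
    return coefs, bias

# Synthetic toy data (not real methylation data): 40 samples, 6 CpGs, 3-dim embedding.
rng = random.Random(0)
n_cpgs, embed_dim, n_samples = 6, 3, 40
encoder_weights = [[rng.uniform(-1, 1) for _ in range(n_cpgs)] for _ in range(embed_dim)]
frozen_copy = copy.deepcopy(encoder_weights)

samples = [[rng.random() for _ in range(n_cpgs)] for _ in range(n_samples)]
embeddings = [encode(s, encoder_weights) for s in samples]  # stage 1: compress
ages = [40 + 30 * z[0] - 20 * z[1] + rng.gauss(0, 1) for z in embeddings]

initial_mse = mean_squared_error(embeddings, ages, [0.0] * embed_dim, 0.0)
coefs, bias = train_age_head(embeddings, ages)              # stage 2: train clock
final_mse = mean_squared_error(embeddings, ages, coefs, bias)
print(f"MSE before training: {initial_mse:.1f}, after: {final_mse:.1f}")
```

Keeping the encoder fixed means that any age signal the clock uses must already be present in the embedding, which is one plausible reason such a design lends itself to interpretation.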
We introduce NCAE-CombClock, a novel, highly precise (R = 0.978, mean absolute error = 1.96 years) deep neural network age clock that integrates data-driven DNAm embeddings with established CpG age markers. Additionally, we developed a suite of interpretable NCAE-Age neural network classifiers tailored for adolescence and young adulthood. These clocks accurately classify individuals at critical developmental ages in youth (AUROC = 0.953, 0.972, and 0.927 for 15, 18, and 21 years, respectively) and capture fine-grained, single-year DNAm signatures of aging that are enriched in biological processes associated with anatomical and neuronal development, immunoregulation, and metabolism. We showcased the practical applicability of this approach by identifying candidate mechanisms underlying the altered pace of aging observed in pediatric Crohn's disease.
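For readers less familiar with the reported metrics, the following minimal sketch shows how Pearson's R, mean absolute error, and AUROC are conventionally computed. The function names and the toy values are illustrative assumptions, not the study's data. The sketch also assumes each age classifier answers a binary question (for example, whether an individual is at or above a threshold age), which the abstract does not spell out.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between true and predicted ages."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean_absolute_error(xs, ys):
    """Average absolute difference between true and predicted ages, in years."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

def auroc(labels, scores):
    """Probability that a random positive scores above a random negative (ties count half)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy regression example (illustrative values only).
true_ages = [10, 20, 30, 40, 50]
predicted_ages = [12, 19, 33, 38, 50]
r = pearson_r(true_ages, predicted_ages)
mae = mean_absolute_error(true_ages, predicted_ages)

# Toy classification example: label 1 = at or above the threshold age (assumed framing).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auroc(labels, scores)

print(f"R = {r:.3f}, MAE = {mae:.2f} years, AUROC = {auc:.2f}")
```

The pairwise AUROC above is quadratic in the number of samples; it is chosen for clarity, and a rank-based implementation would be preferable for a cohort of the size described here.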
In this study, we present a deep neural network epigenetic clock, named NCAE-CombClock, that improves age prediction accuracy in large datasets, and a suite of explainable neural network clocks for robust age classification across youth. Our models offer broad applications in personalized medicine and aging research, providing a valuable resource for interpreting aging trajectories in health and disease.