Ahmed Saleh Sakib, Shabab Nahian, Samee Abul Hassan, Rahman M Sohel
Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palashi, Dhaka 1205, Bangladesh.
Integrative Physiology, Baylor College of Medicine, Houston, TX 77030, USA.
PNAS Nexus. 2025 Jun 3;4(6):pgaf177. doi: 10.1093/pnasnexus/pgaf177. eCollection 2025 Jun.
DNA methylation is a crucial epigenetic marker used in various clocks to predict epigenetic age. However, many existing clocks fail to account for crucial information about CpG sites and their interrelationships, such as co-methylation patterns. We present a novel approach that represents methylation data as a graph: nodes carry methylation values and relevant information about CpG sites, and edges encode relationships such as co-methylation, same gene, and same chromosome. We then use a graph neural network (GNN) to predict age. Our model, GraphAge, thus leverages both structural and positional information for prediction as well as for better interpretation. Although we had to train in a constrained compute setting, GraphAge still showed competitive performance, with a mean absolute error of 3.207 and a mean squared error of 25.277, substantially outperforming existing models. Perhaps more importantly, we used a GNN explainer for interpretation and unearthed interesting insights (e.g. key CpG sites, pathways, and their relationships through methylation-regulated networks in the context of aging) that could not be "decoded" without GraphAge's unique capability to "encode" various structural relationships. GraphAge has the potential to consume and utilize all relevant information (if available) about an individual that relates to the complex process of aging. In that sense it is one of a kind and can be seen as the first benchmark for a multimodal model that can incorporate all of this information to close the gap in our understanding of the true nature of aging.
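The graph representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the CpG identifiers, gene names, beta values, the `build_edges` helper, and the 0.9 correlation threshold are all hypothetical, and co-methylation is approximated here by the absolute Pearson correlation of methylation values across samples, which is an assumption rather than the paper's stated definition.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: methylation (beta) values per CpG site across four samples.
beta = {
    "cg01": [0.10, 0.20, 0.30, 0.40],
    "cg02": [0.15, 0.25, 0.35, 0.45],
    "cg03": [0.90, 0.70, 0.80, 0.60],
}
# Hypothetical annotations: (gene, chromosome) for each CpG site.
annotation = {
    "cg01": ("GENE_A", "chr1"),
    "cg02": ("GENE_B", "chr1"),
    "cg03": ("GENE_A", "chr2"),
}

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def build_edges(beta, annotation, threshold=0.9):
    """Return typed edges (site_a, site_b, relation) between CpG nodes."""
    edges = set()
    for a, b in combinations(sorted(beta), 2):
        # Assumed proxy for co-methylation: strong correlation across samples.
        if abs(pearson(beta[a], beta[b])) >= threshold:
            edges.add((a, b, "co_methylation"))
        if annotation[a][0] == annotation[b][0]:
            edges.add((a, b, "same_gene"))
        if annotation[a][1] == annotation[b][1]:
            edges.add((a, b, "same_chromosome"))
    return edges

edges = build_edges(beta, annotation)
for edge in sorted(edges):
    print(edge)
```

In a sketch like this, each node would carry its methylation value plus site annotations as features, and the typed edge list would then be passed to a GNN library of choice. Keeping the relation type on each edge leaves open whether the model treats the three relations separately or merges them.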