Guan Zimeng, Wang Jiaqi, Liu Zidong, Yang Chengwen, Xu Xin, Wang Xinjie, Zhang Gengqian
Department of Biotechnology, Biomedical Sciences College, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, P. R. China.
School of Forensic Medicine, Shanxi Medical University, Jinzhong, Shanxi, P. R. China.
Electrophoresis. 2024 Nov;45(21-22):2012-2018. doi: 10.1002/elps.202400075. Epub 2024 Oct 14.
The analysis of DNA methylation (DNAm) levels at specific CpG sites is one of the most promising molecular techniques for estimating an individual's age. To date, many studies have reported age prediction models based on DNAm in body fluids, but only a few have used buccal swabs. The objective of this study was to identify age-dependent methylated CpG sites in three genes (HOXC4, TRIM59, and ELOVL2) in buccal swab samples from the Chinese Han population. A total of 461 buccal swabs from donors aged 0.4-80.8 years were divided into a training set (n = 325) and a validation set (n = 136). Samples were analyzed by pyrosequencing to identify age-related genes on the basis of their correlation coefficients. A random forest regression model including eight CpGs in the three genes was ultimately proposed, with a mean absolute error (MAE) of 2.119 years. On the independent validation set, the model achieved an MAE of 4.391 years. Our findings show that buccal swabs are a suitable alternative to other biological traces for DNAm-based age prediction using pyrosequencing and random forest regression, with the additional advantage of noninvasive collection.
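The two quantities the abstract relies on, the training/validation split and the mean absolute error, can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not say how samples were assigned to the two sets, so the seeded random split is an assumption, and the function names and the three example ages are hypothetical values chosen only to show the arithmetic.

```python
import random


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between known and predicted ages, in years."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def split_samples(sample_ids, n_train, seed=0):
    """Shuffle sample identifiers and cut them into training and validation sets.

    Assumption: a simple seeded random split; the abstract does not describe
    the actual assignment procedure.
    """
    shuffled = list(sample_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Sample counts from the abstract: 461 swabs, 325 used for training.
train_ids, validation_ids = split_samples(range(461), n_train=325)

# Hypothetical ages, for illustration only (not data from the study).
example_mae = mean_absolute_error([25.0, 40.0, 63.5], [27.0, 37.0, 64.5])
```

An MAE computed on the samples used to fit a model is usually lower than one computed on held-out samples, which is why the abstract reports the validation-set MAE separately.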