School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China; Beijing Key Laboratory for Magneto-photoelectrical Composite and Interface Science, University of Science and Technology Beijing, Beijing 100083, China.
School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China.
Comput Methods Programs Biomed. 2019 Apr;171:11-18. doi: 10.1016/j.cmpb.2019.02.010. Epub 2019 Feb 19.
The study of human aging contributes to disease prevention, treatment, and life extension. Recent epigenetic studies have shown a close association between DNA methylation and human age. A quantitative statistical model relating DNA methylation to age could predict a person's age more accurately.
We propose a regression model based on a gradient boosting regressor (GBR) to predict human age. We collect a total of 1280 publicly available non-blood tissue samples, with ages ranging from 0 to 90 years. We calculate the Pearson correlation between each CpG site's DNA methylation level and age to select age-related CpGs.
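The correlation-based selection step can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the CpG identifiers, the toy methylation values, and the cutoff `threshold=0.6` are all assumptions, since the abstract does not state the selection threshold.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def select_age_related_cpgs(methylation, ages, threshold):
    """Keep CpGs whose |r| with age reaches the (hypothetical) threshold.

    methylation maps a CpG identifier to its per-sample methylation levels,
    listed in the same sample order as ages.
    """
    selected = {}
    for cpg_id, betas in methylation.items():
        r = pearson_r(betas, ages)
        if abs(r) >= threshold:
            selected[cpg_id] = r
    return selected


def clip01(value):
    """Methylation levels (beta values) lie in [0, 1]."""
    return min(1.0, max(0.0, value))


# Synthetic toy data: one CpG gaining methylation with age, one losing it,
# and one unrelated to age. The identifiers are made up.
rng = random.Random(0)
ages = [rng.uniform(0, 90) for _ in range(200)]
methylation = {
    "cpg_hyper": [clip01(0.2 + 0.006 * a + rng.gauss(0, 0.03)) for a in ages],
    "cpg_hypo": [clip01(0.8 - 0.005 * a + rng.gauss(0, 0.03)) for a in ages],
    "cpg_noise": [rng.uniform(0, 1) for _ in ages],
}

selected = select_age_related_cpgs(methylation, ages, threshold=0.6)
for cpg_id, r in selected.items():
    print(f"{cpg_id}: r = {r:+.2f}")
```

Filtering on the absolute value of r keeps both CpGs that gain methylation with age and CpGs that lose it.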
Thirteen age-related CpG sites are selected. Compared with three other models (Bayesian ridge, multiple linear regression, and support vector regression), GBR has the smallest mean absolute deviation from the actual age. On the training datasets, the cross-validation results show that the correlation R between predicted age and DNA methylation is 0.89, and the mean absolute deviation is 4.66 years. On an independent testing set of 262 samples, GBR achieves a mean absolute deviation of 6.08 years. We also briefly describe the functions of the thirteen selected CpG sites.
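To make the modeling and evaluation steps concrete, here is a dependency-free sketch of gradient boosting for squared loss with depth-1 trees (stumps), scored by mean absolute deviation on a held-out split. It is an illustration only, not the authors' implementation: the synthetic data, `n_estimators=100`, `learning_rate=0.1`, and the 240/60 split are assumptions, and in practice a library implementation such as scikit-learn's `GradientBoostingRegressor` would normally be used.

```python
import random


def fit_stump(X, residuals):
    """Best single-feature threshold split of the residuals (least squares)."""
    n = len(X)
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_value, right_value)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical feature values
            left_mean = left_sum / k
            right_mean = (total - left_sum) / (n - k)
            # Minimizing squared error is equivalent to maximizing this gain.
            gain = k * left_mean ** 2 + (n - k) * right_mean ** 2
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_mean, right_mean)
    if best is None:  # every feature is constant: fall back to the mean
        mean = total / n
        return (0, float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def fit_gbr(X, y, n_estimators=100, learning_rate=0.1):
    """Start from the mean, then repeatedly fit stumps to the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        # Residuals are the negative gradient of the squared loss.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict_gbr(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def mean_absolute_deviation(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic stand-in for methylation levels at selected CpGs (made-up data).
rng = random.Random(1)
ages = [rng.uniform(0, 90) for _ in range(300)]
X = [[0.2 + 0.006 * a + rng.gauss(0, 0.03),   # methylation rises with age
      0.8 - 0.005 * a + rng.gauss(0, 0.03),   # methylation falls with age
      rng.uniform(0, 1)]                      # unrelated to age
     for a in ages]

X_train, y_train = X[:240], ages[:240]
X_test, y_test = X[240:], ages[240:]

model = fit_gbr(X_train, y_train)
gbr_pred = [predict_gbr(model, row) for row in X_test]
mad_gbr = mean_absolute_deviation(y_test, gbr_pred)

# Baseline for comparison: always predict the mean training age.
mean_age = sum(y_train) / len(y_train)
mad_baseline = mean_absolute_deviation(y_test, [mean_age] * len(y_test))

print(f"GBR MAD: {mad_gbr:.2f} years; baseline MAD: {mad_baseline:.2f} years")
```

The same `mean_absolute_deviation` function can score any competing regressor on the same split, which is how a comparison like the one reported above would be set up.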
We build an age predictor to study the association between age and DNA methylation in human non-blood tissues. Our new model provides a more accurate estimate of human age, which will be instrumental for understanding how DNA methylation regulates human aging and for accurately monitoring the individual aging process.