Qi Hongqian, Zhao Hongchen, Li Enyi, Lu Xinyi, Yu Ningbo, Liu Jinchao, Han Jianda
State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin, China.
College of Pharmacy, Nankai University, Tianjin, China.
Aging Cell. 2025 May;24(5):e14471. doi: 10.1111/acel.14471. Epub 2025 Jan 5.
Understanding the complex biological process of aging is of great value, especially because it can help in developing therapeutics that prolong healthy life. Predicting biological age from gene expression data has been shown to be an effective means of quantifying the aging of a subject and of identifying molecular and cellular biomarkers of aging. A typical approach to estimating biological age, adopted by almost all existing aging clocks, is to train machine learning models only on healthy subjects but to run inference on both healthy and unhealthy subjects. However, as this study shows, the inherent bias in this approach results in inaccurate biological age estimates. Moreover, almost all existing transcriptome-based aging clocks are built around an inefficient gene selection procedure followed by conventional machine learning models such as elastic nets and linear discriminant analysis. To address these limitations, we propose DeepQA, a unified aging clock based on a mixture of experts. Unlike existing methods, DeepQA is equipped with a specially designed Hinge-Mean-Absolute-Error (Hinge-MAE) loss, so that it can be trained on both healthy and unhealthy subjects from multiple cohorts, reducing the bias in inferring the biological age of unhealthy subjects. Our experiments showed that DeepQA significantly outperformed existing methods for biological age estimation on both healthy and unhealthy subjects. In addition, our method avoids the inefficient exhaustive search over genes and provides a novel means of identifying genes activated in aging prediction, as an alternative to approaches such as differential gene expression analysis.
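The abstract names the Hinge-MAE loss but does not define it. The following is a minimal sketch of one plausible reading, not the authors' formulation. It assumes that healthy subjects are penalized with the ordinary absolute error against chronological age, while unhealthy subjects are penalized only when the predicted biological age falls below their chronological age (a one-sided, hinge-style term). The function name `hinge_mae`, the `margin` parameter, and the direction of the hinge are all assumptions introduced for illustration.

```python
from typing import Sequence


def hinge_mae(
    predicted: Sequence[float],
    chronological: Sequence[float],
    healthy: Sequence[bool],
    margin: float = 0.0,
) -> float:
    """Hypothetical Hinge-MAE loss (an assumed form, not taken from the paper).

    Healthy subject:   |predicted - chronological|
    Unhealthy subject: max(0, chronological + margin - predicted)
    The result is the mean of these per-subject terms.
    """
    if not (len(predicted) == len(chronological) == len(healthy)):
        raise ValueError("all inputs must have the same length")
    if len(predicted) == 0:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for pred, age, is_healthy in zip(predicted, chronological, healthy):
        if is_healthy:
            # Two-sided penalty: predicted age should match chronological age.
            total += abs(pred - age)
        else:
            # One-sided penalty (assumption): only predictions below the
            # chronological age (plus margin) contribute to the loss.
            total += max(0.0, age + margin - pred)
    return total / len(predicted)


# One healthy and two unhealthy subjects: the per-subject terms are 2, 0 and 5.
example_loss = hinge_mae([50.0, 62.0, 40.0], [48.0, 55.0, 45.0], [True, False, False])
```

Under this assumed form, an unhealthy subject whose predicted age exceeds their chronological age incurs no penalty, which is one way a model could be trained on unhealthy cohorts without being forced to predict their chronological age.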