Colak Cemil, Yagin Fatma Hilal, Yagin Burak, Alkhateeb Abedalrhman, Al-Rawi Mahmood Basil A, Akhloufi Moulay A, Aghaei Mohammadreza
Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya, Türkiye.
Department of Computer Science, Lakehead University, Thunder Bay, ON, Canada.
Front Mol Biosci. 2025 Apr 9;12:1567199. doi: 10.3389/fmolb.2025.1567199. eCollection 2025.
This study aimed to develop an explainable artificial intelligence (XAI) model integrated with machine learning (ML) to investigate metabolic differences between individuals with Down syndrome (T21) and healthy controls (D21) and to identify novel/pathway-specific biomarkers. ML classifiers (AdaBoost, LightGBM, Random Forest, KTBoost, and XGBoost) were applied to metabolomics data obtained by high-resolution liquid chromatography-mass spectrometry (LC-MS) from blood plasma samples of 316 T21 and 103 D21 individuals, and the importance of each metabolite was evaluated by XAI-based SHAP analysis. The KTBoost model showed the highest classification performance, with an accuracy of 90.4% and an area under the curve (AUC) of 95.9%, outperforming AdaBoost, LightGBM, Random Forest, and XGBoost. Several metabolites were significantly downregulated or upregulated in the T21 group compared to the D21 group. Metabolites such as vitamin C, taurolithocholic acid, sphingosine, and prostaglandin A2/B2/J2 were observed at lower levels in the T21 group, whereas metabolites such as thymidine, tauroursodeoxycholic acid, serine, and nervonic acid were elevated. SHAP analysis indicated that L-citrulline, kynurenine, prostaglandin A2/B2/J2, urate, and pantothenate could be novel/pathway-specific biomarkers for differentiating the T21 group. This study revealed significant metabolic alterations in individuals with T21 and demonstrated the effectiveness of combining ML and XAI methods to identify novel/pathway-specific biomarkers. The findings may contribute to a better understanding of the molecular mechanisms of Down syndrome and to the development of future diagnostic and therapeutic strategies.
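The SHAP analysis mentioned in the abstract builds on Shapley values, which attribute a model's output for one sample to its individual input features. The following is a minimal sketch of that underlying idea only, not the authors' pipeline: it computes exact Shapley values by averaging marginal contributions over all feature orderings. The scoring function `toy_score`, its coefficients, and the input numbers are hypothetical and invented purely for illustration; the feature names are borrowed from the abstract only as labels. Practical SHAP implementations use much faster approximations for tree ensembles.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, sample, baseline):
    """Exact Shapley values for one sample.

    Each feature's value is its marginal contribution to the prediction,
    averaged over every order in which features can be switched from the
    baseline value to the sample value.
    """
    n = len(sample)
    contributions = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = sample[i]          # reveal feature i
            value = predict(current)
            contributions[i] += value - previous
            previous = value
    total = factorial(n)
    return [c / total for c in contributions]


def toy_score(x):
    """Hypothetical scoring function (NOT a fitted model from the study)."""
    citrulline, kynurenine, urate = x
    return 0.5 * citrulline + 0.3 * kynurenine * urate - 0.2 * urate


# Made-up, unitless inputs used only to exercise the function.
sample = [1.2, 0.8, 1.5]
baseline = [0.0, 0.0, 0.0]

phi = shapley_values(toy_score, sample, baseline)
for name, value in zip(["L-citrulline", "kynurenine", "urate"], phi):
    print(f"{name}: {value:+.3f}")
```

A useful sanity check is that the attributions sum to the difference between the prediction for the sample and the prediction for the baseline, and that the interaction term between the second and third features is split equally between them. The exhaustive loop scales factorially with the number of features, so it is only practical for a handful of inputs.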