Yagin Fatma Hilal, Shateri Ahmadreza, Nasiri Hamid, Yagin Burak, Colak Cemil, Alghannam Abdullah F
Department of Biostatistics and Medical Informatics, Inonu University, Malatya, Türkiye.
Electrical and Computer Engineering Department, Semnan University, Semnan, Iran.
PeerJ Comput Sci. 2024 Mar 20;10:e1857. doi: 10.7717/peerj-cs.1857. eCollection 2024.
Myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) is a severe condition with an uncertain origin and a dismal prognosis. There is presently no precise diagnostic test for ME/CFS, and diagnosis relies primarily on the presence of certain symptoms. The current study presents an explainable artificial intelligence (XAI) integrated machine learning (ML) framework that identifies and classifies potential metabolic biomarkers of ME/CFS. Metabolomic data from blood samples of 19 controls and 32 ME/CFS patients, all female and frequency-matched for age and body mass index (BMI), were used to develop the XAI-based model. The dataset contained 832 metabolites; after feature selection, the model was developed using only 50 metabolites, meaning less medical knowledge is required, thus reducing diagnostic costs and shortening the time to prognosis. The computational method was developed using six different ML algorithms, each evaluated before and after feature selection. The final classification model was explained using the XAI approach SHAP. The best-performing classification model (XGBoost) achieved an area under the receiver operating characteristic curve (AUCROC) of 98.85%. SHAP results showed that decreased levels of alpha-CEHC sulfate, hypoxanthine, and phenylacetylglutamine, as well as increased levels of N-delta-acetylornithine and oleoyl-linoloyl-glycerol (18:1/18:2)[2], increased the risk of ME/CFS. Besides demonstrating the robustness of the methodology, the results showed that the combination of ML and XAI could explain the biomarker prediction of ME/CFS and provided a first step toward establishing prognostic models for ME/CFS.
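The abstract reports model quality as AUCROC. As a minimal sketch of what that metric measures (not the authors' code), the snippet below computes it in its pairwise form: the fraction of (patient, control) pairs in which the patient receives the higher score, with ties counted as half. The function name `auc_roc` and the toy labels and scores are illustrative assumptions and are not data from the study.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for the positive class (e.g. patient), 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up scores for illustration only (NOT data from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_roc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.2%}")
```

In this toy case three of the four patient-control pairs are ranked correctly, so the sketch reports an AUC of 75%. Read the same way, a value of 98.85% means that nearly every such pair is ordered correctly by the model.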