Yagin Fatma Hilal, Yasar Seyma, Gormez Yasin, Yagin Burak, Pinar Abdulvahap, Alkhateeb Abedalrhman, Ardigò Luca Paolo
Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, Malatya 44280, Turkey.
Department of Management Information Systems, Faculty of Economics and Administrative Sciences, Sivas Cumhuriyet University, Sivas 58140, Turkey.
Metabolites. 2023 Dec 18;13(12):1204. doi: 10.3390/metabo13121204.
Diabetic retinopathy (DR), a common ocular microvascular complication of diabetes, contributes significantly to diabetes-related vision loss. This study addresses the pressing need for early diagnosis of DR and for precise treatment strategies, using an explainable artificial intelligence (XAI) framework. The study integrated clinical, biochemical, and metabolomic biomarkers associated with the following classes in type 2 diabetes (T2D) patients: non-DR (NDR), non-proliferative diabetic retinopathy (NPDR), and proliferative diabetic retinopathy (PDR). To build the machine learning (ML) models, the data were split into a validation set (10%) and a discovery set (90%). The validation set was used for the hyperparameter optimization and feature selection stages, while the discovery set was used to measure model performance. Model performance was evaluated with 10-fold cross-validation. Biomarker discovery was performed using minimum redundancy maximum relevance (mRMR), Boruta, and the explainable boosting machine (EBM). The proposed predictive framework compares eXtreme Gradient Boosting (XGBoost), natural gradient boosting for probabilistic prediction (NGBoost), and EBM models in determining the DR subclass. Model hyperparameters were tuned using Bayesian optimization. The optimal model, which combined EBM feature selection with XGBoost, achieved (91.25 ± 1.88)% accuracy, (89.33 ± 1.80)% precision, (91.24 ± 1.67)% recall, (89.37 ± 1.52)% F1-score, and (97.00 ± 0.25)% area under the ROC curve (AUROC). According to the EBM explanation, the six most important biomarkers in determining the course of DR were tryptophan (Trp), phosphatidylcholine diacyl C42:2 (PC.aa.C42.2), butyrylcarnitine (C4), tyrosine (Tyr), hexadecanoyl carnitine (C16), and total dimethylarginine (DMA).
The identified biomarkers may provide a better understanding of the progression of DR, paving the way for more precise and cost-effective diagnostic and treatment strategies.
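The abstract reports each metric as mean ± spread over 10-fold cross-validation on a three-class problem (NDR, NPDR, PDR). The sketch below shows one way such numbers could be produced. It is illustrative only and is not the authors' code. It assumes stratified folds, macro-averaged precision, recall and F1, and the sample standard deviation as the "±" term, none of which the abstract states. The names `stratified_folds`, `macro_metrics`, `summarize` and `cross_validate`, and the `predict_fn` callback, are hypothetical. AUROC is omitted because it requires predicted probabilities rather than class labels.

```python
import random
import statistics
from collections import defaultdict

CLASSES = ("NDR", "NPDR", "PDR")  # class labels taken from the abstract


def stratified_folds(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs, keeping class proportions roughly equal per fold.

    Stratification is an assumption; the abstract only says "10-fold".
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def macro_metrics(y_true, y_pred, classes=CLASSES):
    """Return accuracy plus macro-averaged precision, recall and F1 (averaging is assumed)."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def summarize(fold_scores):
    """Collapse per-fold metric dicts into {metric: (mean, sample SD)}."""
    out = {}
    for name in fold_scores[0]:
        values = [s[name] for s in fold_scores]
        out[name] = (statistics.mean(values), statistics.stdev(values))
    return out


def cross_validate(labels, predict_fn, k=10, seed=0):
    """Run k-fold CV; predict_fn(train_idx, test_idx) is a hypothetical hook returning test predictions."""
    scores = []
    for train, test in stratified_folds(labels, k=k, seed=seed):
        y_pred = predict_fn(train, test)
        scores.append(macro_metrics([labels[i] for i in test], y_pred))
    return summarize(scores)
```

In practice `predict_fn` would fit a classifier such as XGBoost on the training indices and return its predictions for the test indices. Each `(mean, SD)` pair returned by `cross_validate` is on a 0 to 1 scale and would be multiplied by 100 to match the percentage format used in the abstract.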