Department of Computer Science, Sultan Qaboos University, Muscat, Oman.
Department of Medicine, College of Medicine, Sultan Qaboos University, Muscat, Oman.
Sultan Qaboos Univ Med J. 2023 Aug;23(3):328-335. doi: 10.18295/squmj.12.2022.069. Epub 2023 Aug 28.
This study aimed to design a machine learning-based framework to predict the presence or absence of systemic lupus erythematosus (SLE) in a cohort of Omani patients.
Data on 219 patients from 2006 to 2019 were extracted from the electronic records of Sultan Qaboos University Hospital. Among these, 138 patients had SLE, while the remaining 81 had other rheumatologic diseases. Clinical and demographic features were analysed to focus on the early stages of the disease. Recursive feature selection was implemented to choose the most informative features. The CatBoost classification algorithm was used to predict SLE, and the SHAP explainer algorithm was applied to the trained CatBoost model to provide reasoning for each individual prediction, which was then validated by rheumatologists.
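To illustrate what "individual prediction reasoning" means here, the following minimal sketch computes exact Shapley attributions, the quantity that SHAP approximates, for a single prediction. It is not the authors' pipeline and does not use CatBoost or the SHAP library. The function names `shapley_values` and `toy_score`, the three binary findings, and all weights are assumptions invented for illustration, not values from the study.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one prediction.

    A feature that is "absent" from a coalition takes its baseline value.
    Each feature's attribution is its marginal effect on the prediction,
    averaged over every order in which features can be switched on.
    """
    n = len(x)
    contrib = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]          # switch feature i to the patient's value
            new = predict(current)
            contrib[i] += new - prev   # marginal contribution in this order
            prev = new
    return [c / len(orders) for c in contrib]


def toy_score(features):
    """Hypothetical risk score; the weights are invented, not from the study."""
    alopecia, renal, anaemia = features
    return (0.1 + 0.3 * alopecia + 0.4 * renal
            + 0.1 * alopecia * renal + 0.05 * anaemia)


patient = [1, 1, 0]    # alopecia and renal disorder present, no anaemia
reference = [0, 0, 0]  # baseline: none of the findings present
phi = shapley_values(toy_score, patient, reference)
for name, value in zip(["alopecia", "renal", "anaemia"], phi):
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is what allows a clinician to read them as a breakdown of one prediction. Exact enumeration grows factorially with the number of features, so practical explainers for tree models rely on faster algorithms.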
CatBoost achieved an area under the receiver operating characteristic curve (AUC) of 0.95 and a sensitivity of 92%. The SHAP algorithm identified four clinical features (alopecia, renal disorders, acute cutaneous lupus and haemolytic anaemia) and the patient's age as making the greatest contribution to the prediction.
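For readers unfamiliar with the two reported metrics, the sketch below shows how the area under the receiver operating characteristic curve and sensitivity can be computed from labels and model scores. The helper names `roc_auc` and `sensitivity`, the five example cases, and the 0.5 decision threshold are assumptions made for illustration; they are not the study's data or evaluation code.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity(y_true, y_pred):
    """True positive rate: the share of actual positives predicted positive."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true")
    return tp / (tp + fn)


# Made-up example: 1 = SLE, 0 = other rheumatologic disease
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"AUC: {roc_auc(labels, scores):.3f}")
print(f"Sensitivity: {sensitivity(labels, predicted):.3f}")
```

The AUC depends only on how the scores rank the cases, whereas sensitivity depends on the chosen decision threshold, so the two figures describe different aspects of classifier performance.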
An explainable framework that predicts SLE in patients and provides reasoning for its predictions was designed and validated. This framework may enable clinicians to implement early interventions that lead to positive healthcare outcomes.