Nowinka Zuzanna, Alagha M Abdulhadi, Mahmoud Khadija, Jones Gareth G
MSk Lab, Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, London, United Kingdom.
Data Science Institute, London School of Economics and Political Science, London, United Kingdom.
JMIR Form Res. 2022 Sep 13;6(9):e36130. doi: 10.2196/36130.
Knee osteoarthritis (OA) is the most common form of OA and a leading cause of disability worldwide. Chronic pain and functional loss secondary to knee OA put patients at risk of developing depression, which can also impair their treatment response. However, no tools exist to assist clinicians in identifying patients at risk. Machine learning (ML) predictive models may offer a solution. We investigated whether ML models could predict the development of depression in patients with knee OA and examined which features are the most predictive.
The primary aim of this study was to develop and test ML models to predict depression at 2 years in patients with knee OA and to validate the models using an external data set. The secondary aim was to identify the most important predictive features used by the ML algorithms.
Osteoarthritis Initiative Study (OAI) data were used for model development, and external validation was performed using Multicenter Osteoarthritis Study (MOST) data. Forty-two features representing routinely collected demographic and clinical data were selected, including patient demographics, past medical history, knee OA history, baseline examination findings, and patient-reported outcome measures. Six different ML classification models were trained (logistic regression, least absolute shrinkage and selection operator [LASSO], ridge regression, decision tree, random forest, and gradient boosting machine). The primary outcome was depression at 2 years following study enrollment, with the presence of depression defined using the Center for Epidemiologic Studies Depression Scale. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and F1 score. The most important features were extracted from the model that performed best on external validation.
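The two evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the abstract does not state which threshold was used for the F1 score.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 = harmonic mean of precision and recall at a fixed threshold.
    The 0.5 default is an assumption, not a value taken from the study."""
    tp = fp = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical data): 1 = depressed at 2 years, 0 = not.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
auc_value = roc_auc(y_true, scores)
f1_value = f1_score(y_true, scores)
```

Because the AUC depends only on how cases are ranked while the F1 score depends on the chosen threshold, a model can show a high AUC alongside a modest F1 score when positive cases are rare.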
A total of 5947 patients were included in this study, with 2969 in the training set, 742 in the test set, and 2236 in the external validation set. For the test set, the AUC ranged from 0.673 (95% CI 0.604-0.742) to 0.869 (95% CI 0.824-0.913), with an F1 score of 0.435 to 0.490. On external validation, the AUC varied from 0.720 (95% CI 0.685-0.755) to 0.876 (95% CI 0.853-0.899), with an F1 score of 0.456 to 0.563. LASSO modeling offered the highest predictive performance. Blood pressure, baseline depression score, knee pain and stiffness, and quality of life were the most predictive features.
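The abstract does not say how the 95% CIs for the AUC were obtained. One common option is a percentile bootstrap, sketched below purely as an assumption; the function names, the number of resamples, the seed, and the toy data are hypothetical and do not come from the study.

```python
import random
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) AUC; ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true: Sequence[int], scores: Sequence[float],
                     n_boot: int = 1000, alpha: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap CI for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(y_true)
    stats: List[float] = []
    while len(stats) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has one class only
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data (hypothetical): 1 = depressed at 2 years, 0 = not.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 0]
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.9, 0.5, 0.8, 0.65, 0.15]
ci_low, ci_high = bootstrap_auc_ci(y_true, scores, n_boot=500, seed=42)
```

With samples as large as the 742-patient test set and 2236-patient external validation set, such intervals are much narrower than on a toy example like this one.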
To our knowledge, this is the first study to apply ML classification models to predict depression in patients with knee OA. Our study showed that ML models can deliver a clinically acceptable level of performance (AUC>0.7) in predicting the development of depression using routinely available demographic and clinical data. Further work is required to address the class imbalance in the training data and to evaluate the clinical utility of the models in facilitating early intervention and improved outcomes.