Gosselt Helen R, Verhoeven Maxime M A, Bulatović-Ćalasan Maja, Welsing Paco M, de Rotte Maurits C F J, Hazes Johanna M W, Lafeber Floris P J G, Hoogendoorn Mark, de Jonge Robert
Department of Clinical Chemistry, Amsterdam Gastroenterology and Metabolism, Amsterdam UMC, VUmc, 1081 HV Amsterdam, The Netherlands.
Department of Clinical Chemistry, Erasmus MC, University Medical Center Rotterdam, 3015 GD Rotterdam, The Netherlands.
J Pers Med. 2021 Jan 14;11(1):44. doi: 10.3390/jpm11010044.
The goals of this study were, first, to examine whether machine-learning algorithms outperform multivariable logistic regression in the prediction of insufficient response to methotrexate (MTX); secondly, to examine which features are essential for correct prediction; and finally, to investigate whether the best performing model specifically identifies insufficient responders to MTX (combination) therapy. The prediction of insufficient response (3-month Disease Activity Score 28-Erythrocyte-sedimentation rate (DAS28-ESR) > 3.2) was assessed using logistic regression, least absolute shrinkage and selection operator (LASSO), random forest, and extreme gradient boosting (XGBoost). The baseline features of 355 rheumatoid arthritis (RA) patients from the "treatment in the Rotterdam Early Arthritis CoHort" (tREACH) and the U-Act-Early trial were combined for analyses. Model performance was compared using the area under the curve (AUC) of receiver operating characteristic (ROC) curves with 95% confidence intervals (95% CI), and sensitivity and specificity. Finally, the best performing model following feature selection was tested on 101 RA patients starting tocilizumab (TCZ)-monotherapy. Logistic regression (AUC = 0.77, 95% CI: 0.68-0.86) performed as well as LASSO (AUC = 0.76, 95% CI: 0.67-0.85), random forest (AUC = 0.71, 95% CI: 0.61-0.81), and XGBoost (AUC = 0.70, 95% CI: 0.61-0.81), yet logistic regression reached the highest sensitivity (81%). The most important features were baseline DAS28 (components). For all algorithms, models with six features performed similarly to those with 16. When applied to the TCZ-monotherapy group, logistic regression's sensitivity dropped significantly from 83% to 69% (p = 0.03). In the current dataset, logistic regression performed as well as machine-learning algorithms in the prediction of insufficient response to MTX. Models could be reduced to six features, which is more conducive to clinical implementation.
Interestingly, the prediction model was specific to MTX (combination) therapy response.
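The evaluation metrics named above (AUC, a 95% CI, and sensitivity and specificity) can be illustrated with a minimal, dependency-free sketch. Only the outcome definition (DAS28-ESR > 3.2 at 3 months) comes from the abstract. The function names, the 0.5 probability cut-off, the percentile bootstrap used for the interval, and the toy labels and scores are all assumptions made for illustration; the abstract does not state how the study computed its intervals or chose its threshold.

```python
import random

def insufficient_response(das28_esr_3m):
    """Outcome label from the abstract: 3-month DAS28-ESR > 3.2."""
    return 1 if das28_esr_3m > 3.2 else 0

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity at an assumed probability cut-off."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the study's)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap the AUC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data (hypothetical): 3-month DAS28-ESR values and model-predicted risks.
das28 = [4.1, 3.9, 5.0, 3.3, 2.8, 3.0, 2.1, 1.9]
labels = [insufficient_response(v) for v in das28]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_auc = auc(labels, scores)
toy_sens, toy_spec = sensitivity_specificity(labels, scores)
toy_ci = bootstrap_auc_ci(labels, scores)
```

In this sketch the same three functions would be applied to each model's predicted risks on held-out patients, so that models are compared on identical cases and an identical cut-off.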