Liu Hailang, Wang Xinguang, Tang Kun, Peng Ejun, Xia Ding, Chen Zhiqiang
Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
Transl Androl Urol. 2021 Feb;10(2):710-723. doi: 10.21037/tau-20-1208.
To develop a machine learning (ML)-assisted model capable of accurately identifying patients with calculous pyonephrosis before making treatment decisions by integrating multiple clinical characteristics.
We retrospectively collected data from patients with obstructed hydronephrosis who underwent retrograde ureteral stent insertion, percutaneous nephrostomy (PCN), or percutaneous nephrolithotomy (PCNL). The study cohort was divided into training and testing datasets in a 70:30 ratio. We developed five ML-assisted models from 22 clinical features using logistic regression (LR), LR with least absolute shrinkage and selection operator (Lasso) regularization (Lasso-LR), support vector machine (SVM), extreme gradient boosting (XGBoost), and random forest (RF). The area under the receiver operating characteristic curve (AUC) was used to identify the model with the highest discrimination. Decision curve analysis (DCA) was used to assess the clinical net benefit of using the predictive models.
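As a minimal sketch of the two evaluation measures named above (not the authors' code), AUC can be computed as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, and the net benefit used in DCA at a threshold probability pt is TP/n - (FP/n) * pt / (1 - pt). The function names, labels, and scores below are hypothetical toy values chosen for illustration, not study data.

```python
def auc(labels, scores):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Net benefit at threshold probability pt (0 < pt < 1):
    TP/n - FP/n * pt / (1 - pt), treating score >= pt as a positive call."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical toy data: 1 = pyonephrosis, 0 = no pyonephrosis.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]

model_auc = auc(labels, scores)
model_nb = net_benefit(labels, scores, threshold=0.5)
# "Treat all" reference strategy: every patient is called positive.
treat_all_nb = net_benefit(labels, [1.0] * len(labels), threshold=0.5)
```

In a decision curve, the model's net benefit is plotted across a range of thresholds against the "treat all" and "treat none" (net benefit 0) strategies; a model is useful at thresholds where its curve lies above both.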
A total of 322 patients were included, with 225 in the training dataset and 97 in the testing dataset. In the training dataset, the XGBoost model showed good discrimination, with an AUC, accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of 0.981, 0.991, 0.962, 1.000, 1.000, and 0.989, respectively. The training AUCs of the other models were 0.985 for SVM [95% confidence interval (CI): 0.970-1.000], 0.977 for Lasso-LR (95% CI: 0.958-0.996), 0.936 for LR (95% CI: 0.905-0.968), and 0.920 for RF (95% CI: 0.870-0.970). In the testing dataset, SVM yielded the highest AUC (0.977, 95% CI: 0.952-1.000), followed by Lasso-LR (AUC = 0.959, 95% CI: 0.921-0.997), XGBoost (AUC = 0.958, 95% CI: 0.902-1.000), LR (AUC = 0.932, 95% CI: 0.878-0.987), and RF (AUC = 0.868, 95% CI: 0.779-0.958).
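For readers unfamiliar with how accuracy, sensitivity, specificity, PPV, and NPV relate to one another, the sketch below derives all five from a 2x2 confusion matrix. The abstract does not report the underlying counts; the counts used here (50 true positives, 2 false negatives, 0 false positives, 173 true negatives, totalling 225) are an assumption, one hypothetical combination that happens to round to the reported XGBoost training values, and other combinations are possible.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.
    No zero-division handling: every denominator is assumed non-zero."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts (NOT reported in the source), chosen only because
# they are consistent with the rounded values quoted in the text.
metrics = diagnostic_metrics(tp=50, fp=0, tn=173, fn=2)
rounded = {name: round(value, 3) for name, value in metrics.items()}
```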
Our ML-based models showed good discrimination in identifying patients with obstructed hydronephrosis who are at high risk of harboring pyonephrosis, and these models may be of considerable benefit to urologists in treatment planning, patient selection, and decision-making.