Department of Radiation Oncology, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
Department of Radiation Physics, Unit 1150, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, TX, 77030, USA.
Sci Rep. 2022 Jun 2;12(1):9178. doi: 10.1038/s41598-022-12898-8.
This study aimed to compare the predictive performance of different modeling methods in developing normal tissue complication probability (NTCP) models for predicting radiation-induced esophagitis (RE) in non-small cell lung cancer (NSCLC) patients receiving proton radiotherapy. The dataset comprised 328 NSCLC patients receiving passive-scattering proton therapy, 41.6% of whom experienced grade ≥ 2 RE. Five modeling methods were used to build NTCP models: standard Lyman-Kutcher-Burman (sLKB), generalized LKB (gLKB), multivariable logistic regression with one of two variable selection procedures, either stepwise forward selection (Stepwise-MLR) or the least absolute shrinkage and selection operator (LASSO-MLR), and support vector machines (SVM). Predictive performance was internally validated by a bootstrap approach for each modeling method. Overall performance, discriminative ability, and calibration were assessed using the Nagelkerke R², the area under the receiver operating characteristic curve (AUC), and the Hosmer-Lemeshow test, respectively. The LASSO-MLR model showed the best discriminative ability, with an AUC of 0.799 (95% confidence interval (CI): 0.763-0.854), and the best overall performance, with a Nagelkerke R² of 0.332 (95% CI: 0.266-0.486). The optimism-corrected Nagelkerke R² values of the SVM and sLKB models were both 0.301. The optimism-corrected AUC of the gLKB model (0.796) was higher than that of the SVM model (0.784). The sLKB model showed the smallest optimism in both model variation and discriminative ability. For classification and probability estimation of the NTCP for radiation-induced esophagitis, the MLR model developed with LASSO provided the best predictive results. The simplest LKB modeling had similar or even better predictive performance than the most complex SVM modeling, and it was the least likely to overfit the training data. The advanced machine learning approach might have limited applicability in clinical settings with a relatively small amount of data.
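
For readers unfamiliar with the quantities named in the abstract, the following is a minimal Python sketch of the commonly cited LKB formulation (a probit function of the generalized equivalent uniform dose, gEUD) together with simple implementations of the AUC and the Nagelkerke R². It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the parameterization used for the sLKB or gLKB models, and the function names, the differential dose-volume histogram inputs, and the parameter values (td50, m, n) in the usage note are hypothetical.

```python
import math


def geud(doses, volumes, n):
    """Generalized equivalent uniform dose from a differential DVH.

    doses: dose per bin (Gy); volumes: volume per bin (any unit);
    n: volume-effect parameter (n = 1 reduces gEUD to the mean dose).
    """
    total = sum(volumes)
    return sum((v / total) * d ** (1.0 / n) for d, v in zip(doses, volumes)) ** n


def lkb_ntcp(doses, volumes, td50, m, n):
    """LKB NTCP: standard normal CDF of (gEUD - TD50) / (m * TD50)."""
    t = (geud(doses, volumes, n) - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def auc(y_true, y_prob):
    """AUC as the fraction of (event, non-event) pairs ranked correctly.

    Ties in predicted probability count as half a correct ranking.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def nagelkerke_r2(y_true, y_prob):
    """Nagelkerke R²: Cox-Snell R² rescaled by its maximum attainable value.

    Predicted probabilities must lie strictly between 0 and 1.
    """
    n = len(y_true)
    prevalence = sum(y_true) / n
    ll_model = sum(math.log(p if y == 1 else 1.0 - p) for y, p in zip(y_true, y_prob))
    ll_null = sum(math.log(prevalence if y == 1 else 1.0 - prevalence) for y in y_true)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    return cox_snell / (1.0 - math.exp(2.0 * ll_null / n))
```

As a usage example with made-up inputs, lkb_ntcp([10, 30], [1, 1], td50=20, m=0.3, n=1) evaluates to 0.5, because with n = 1 the gEUD equals the mean dose of 20 Gy, which matches the hypothetical TD50. In a bootstrap internal validation such as the one the abstract describes, metrics like these would be computed on both the resampled and the original data, and the average difference (the optimism) subtracted from the apparent performance.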