Luo Chao, Li Shuqi, Zhao Qin, Ou Qiaowen, Huang Wenjie, Ruan Guangying, Liang Shaobo, Liu Lizhi, Zhang Yu, Li Haojiang
Department of Radiology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou, Guangdong, People's Republic of China.
Department of Clinical Nutrition, The First Affiliated Hospital of Guangdong Pharmaceutical University, Guangzhou, Guangdong, People's Republic of China.
J Inflamm Res. 2022 Aug 24;15:4803-4815. doi: 10.2147/JIR.S366922. eCollection 2022.
Traditional prognostic studies have used varying cut-off values without evaluating the information potentially contained in inflammation-related hematological indicators. Using the interpretable machine-learning algorithm RuleFit, this study aimed to identify inflammatory rules that reflect prognosis in nasopharyngeal carcinoma (NPC) patients.
In total, 1706 biopsy-proven NPC patients treated at two independent hospitals (1320 and 386 patients, respectively) between January 2010 and March 2014 were included. RuleFit was used to develop risk-predictive rules from hematological indicators whose distributions did not differ between the two centers. Time-event-dependent hematological rules were then selected by stepwise multivariate Cox analysis. A final model was established by combining the high-efficiency hematological rules with clinical predictors. For comparison, models based on other algorithms (AutoML, Lasso) and on clinical predictors were built, and a previously reported nomogram was also evaluated. Area under the receiver operating characteristic curve (AUROC) and concordance index (C-index) were used to assess the predictive performance of the different models. A web-based app was established for convenience.
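For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how AUROC and Harrell's C-index can be computed from first principles. It is not the authors' code: the function names `auroc` and `harrell_c_index` and the toy inputs in the usage note are hypothetical, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def harrell_c_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    A pair is comparable when the subject with the shorter follow-up
    time experienced the event. The pair is concordant when that
    subject also has the higher predicted risk.
    Assumption: pairs with tied times are skipped for simplicity.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        early, late = (i, j) if times[i] < times[j] else (j, i)
        if not events[early]:
            continue  # earlier subject was censored: ordering unknown
        comparable += 1
        if risks[early] > risks[late]:
            concordant += 1.0
        elif risks[early] == risks[late]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

On hypothetical toy data, `auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])` and `harrell_c_index([5, 10, 15, 20], [1, 0, 1, 1], [0.9, 0.4, 0.1, 0.2])` both evaluate to 0.75. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering for either metric.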
RuleFit identified 22 combined baseline hematological rules, achieving AUROCs of 0.69 and 0.64 in the training and validation cohorts, respectively. By contrast, the AUROCs of the best comparison model, based on AutoML, were 1.00 and 0.58. For overall survival, the final model had a much higher C-index than the base model using TN staging in both cohorts (0.769 vs 0.717, P<0.001; 0.752 vs 0.688, P<0.001) and generalized well from the training cohort to the validation cohort. The two models based on RuleFit rules performed best among all models compared. The final model showed a similar trend for the other endpoints. Kaplan-Meier curves showed that 22.9% (390/1706) of patients were "misclassified" by AJCC staging, whereas the final model could assess risk classification accurately.
The proposed final models, built on inflammation-related rules derived with RuleFit, showed significantly improved predictive performance.