Qin Jia-Jia, Zhu Xiao-Xiao, Chen Xi, Sang Wei, Jin Ying-Liang
Department of Medical Public Health, Center for Medical Statistics and Data Analysis of Xuzhou Medical University, Xuzhou, China.
Department of Hematology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China.
Transl Cancer Res. 2024 Jul 31;13(7):3370-3381. doi: 10.21037/tcr-23-2358. Epub 2024 Jul 26.
The incidence of diffuse large B-cell lymphoma (DLBCL) in children is increasing globally. Because the immune system of children is still immature, the prognosis of pediatric DLBCL differs considerably from that of adult DLBCL. We aimed to study the prognosis of the disease through a large multicenter retrospective analysis.
For this retrospective analysis, we retrieved data from the Surveillance, Epidemiology, and End Results (SEER) database on 836 DLBCL patients under 18 years old who were treated at 22 central institutions between 2000 and 2019. The patients were randomly divided into a modeling group and a validation group at a ratio of 7:3. Cox stepwise regression, generalized Cox regression, and eXtreme Gradient Boosting (XGBoost) were used to screen all variables. The prognostic variables selected by Cox stepwise regression were used to construct a nomogram, and the importance of the variables was ranked using XGBoost. The predictive performance of the model was assessed using the concordance index (C-index), the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, sensitivity, and specificity. The consistency of the model was evaluated using a calibration curve, and its clinical practicality was verified through decision curve analysis (DCA).
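As an illustration of two of the steps described above, the following minimal sketch shows a random 7:3 split and Harrell's C-index in plain Python. It is not the authors' code: the function names `split_7_3` and `harrell_c_index`, the fixed seed, and the rounding rule for the cut point are assumptions made for this example, and the abstract does not state which software or C-index variant was used.

```python
import random


def split_7_3(records, seed=0):
    """Shuffle a copy of `records` and split it roughly 70% / 30%.

    The seed and the use of round() for the cut point are assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def harrell_c_index(times, events, risks):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i has the shorter observed
    time and experienced the event (events[i] == 1). The pair is
    concordant when the model assigns subject i the higher risk score.
    Tied risk scores count as half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

With 836 records this rounding rule gives groups of 585 and 251, which need not match the group sizes actually used in the study. A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against survival time.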
The ROC curves showed that all models except the non-proportional hazards and non-log linearity (NPHNLL) model achieved AUC values above 0.7, indicating high accuracy. The calibration curve and DCA further confirmed strong predictive performance and clinical practicability.
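For readers unfamiliar with the AUC criterion, the sketch below computes an AUC from binary outcome labels and model scores using the rank-based (Mann-Whitney) formulation. It is a simplified illustration only: the function name `roc_auc` and the toy data are hypothetical, and a survival model would first require fixing a time horizon to define the binary outcome, which the abstract does not specify.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    `labels` holds 1 for cases with the outcome and 0 otherwise;
    `scores` holds the model's predicted risk. Ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
passes_threshold = toy_auc > 0.7  # the 0.7 cutoff mentioned in the results
```

On the toy data three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75 and the 0.7 threshold is met.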
In this study, we constructed a machine learning model by combining XGBoost with Cox and generalized Cox regression models. This integrated approach predicts the prognosis of children with DLBCL accurately and from multiple dimensions. These findings provide a scientific basis for accurate clinical prognosis prediction.