Long Ze, Yi Min, Qin Yong, Ye Qianwen, Che Xiaotong, Wang Shengjie, Lei Mingxing
Department of Orthopedics, The Second Xiangya Hospital of Central South University, Changsha, China.
Institute of Medical Information and Library, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Front Oncol. 2023 Feb 20;13:1144039. doi: 10.3389/fonc.2023.1144039. eCollection 2023.
This study aimed to build a reliable model for predicting early mortality among hepatocellular carcinoma (HCC) patients with bone metastases, using an ensemble machine learning technique that incorporates the results of multiple machine learning algorithms.
We extracted a cohort of 124,770 patients diagnosed with hepatocellular carcinoma from the Surveillance, Epidemiology, and End Results (SEER) program and enrolled a cohort of 1897 patients who were diagnosed with bone metastases. Patients with a survival time of 3 months or less were considered to have had early death. Subgroup analysis was used to compare patients with and without early mortality. Patients were randomly divided into two groups: a training cohort (n = 1509, 80%) and an internal testing cohort (n = 388, 20%). In the training cohort, five machine learning techniques were employed to train and optimize models for predicting early mortality, and an ensemble machine learning technique combined the results of these multiple algorithms by soft voting to generate a risk probability. The study employed both internal and external validation, and the key performance indicators included the area under the receiver operating characteristic curve (AUROC), Brier score, and calibration curve. Patients from two tertiary hospitals were chosen as the external testing cohort (n = 98). Feature importance analysis and risk reclassification were both performed.
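The soft-voting step and two of the performance indicators named above can be illustrated with a minimal sketch. It assumes unweighted averaging of each model's predicted probability (the abstract does not state whether weights were used), and the three toy "models" and their probabilities are hypothetical, not study data.

```python
from statistics import mean

def soft_vote(prob_lists):
    """Unweighted soft voting: average each patient's predicted
    probability across models (equal weights are an assumption)."""
    return [mean(ps) for ps in zip(*prob_lists)]

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return mean([(p - y) ** 2 for y, p in zip(y_true, y_prob)])

def auroc(y_true, y_prob):
    """AUROC as the probability that a random positive case is ranked
    above a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = early death, 0 = survived beyond 3 months.
y = [1, 0, 1, 0]
model_a = [0.9, 0.2, 0.6, 0.4]
model_b = [0.7, 0.4, 0.4, 0.2]
model_c = [0.8, 0.3, 0.5, 0.6]

ensemble = soft_vote([model_a, model_b, model_c])
ensemble_brier = brier_score(y, ensemble)
ensemble_auroc = auroc(y, ensemble)
```

A lower Brier score indicates better-calibrated and more accurate probabilities, while a higher AUROC indicates better discrimination, which is why the two are reported together.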
The early mortality rate was 55.5% (1052/1897). Eleven clinical characteristics were included as input features of the machine learning models: sex (p = 0.019), marital status (p = 0.004), tumor stage (p = 0.025), node stage (p = 0.001), fibrosis score (p = 0.040), AFP level (p = 0.032), tumor size (p = 0.001), lung metastases (p < 0.001), cancer-directed surgery (p < 0.001), radiation (p < 0.001), and chemotherapy (p < 0.001). In the internal testing cohort, the ensemble model yielded an AUROC of 0.779 (95% confidence interval [CI]: 0.727-0.820), the largest AUROC among all models. The ensemble model also outperformed the other five machine learning models in terms of Brier score (0.191). In terms of decision curves, the ensemble model also showed favorable clinical usefulness. External validation showed similar results; with an AUROC of 0.764 and a Brier score of 0.195, the prediction performance was further improved after revision of the model. Feature importance analysis based on the ensemble model showed that the three most important features were chemotherapy, radiation, and lung metastases. Reclassification of patients revealed a substantial difference between the two risk groups in the actual probability of early mortality (74.38% vs. 31.35%, p < 0.001). According to the Kaplan-Meier survival curves, patients in the high-risk group had significantly shorter survival than patients in the low-risk group (p < 0.001).
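The reclassification result can be read as follows: patients are split into two groups by predicted risk, and the observed early-mortality rate is then computed within each group. The sketch below assumes a single probability cutoff; the `threshold` value and the toy data are hypothetical, since the abstract does not report the cutoff used. The last line only re-derives the overall rate from the counts given above.

```python
def reclassify(y_true, y_prob, threshold):
    """Split patients into low/high risk at a probability cutoff and
    return the observed event rate in each group."""
    groups = {"low": [], "high": []}
    for y, p in zip(y_true, y_prob):
        groups["high" if p >= threshold else "low"].append(y)
    return {name: (sum(ys) / len(ys) if ys else None)
            for name, ys in groups.items()}

# Hypothetical toy data: 1 = early death, 0 = survived beyond 3 months.
y = [1, 1, 0, 1, 0, 0, 1, 0]
p = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

rates = reclassify(y, p, threshold=0.5)  # hypothetical cutoff

# Overall early mortality from the reported counts (1052 of 1897).
overall_rate_pct = round(1052 / 1897 * 100, 1)
```

A large gap between the two observed rates, as in the reported 74.38% vs. 31.35%, suggests that the predicted risk separates patients in a clinically meaningful way.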
The ensemble machine learning model exhibits promising prediction performance for early mortality among HCC patients with bone metastases. With the aid of routinely accessible clinical characteristics, this model can be a trustworthy prognostic tool to predict the early death of those patients and facilitate clinical decision-making.