Li Wanling, Liu Jinshan, Lan Yuntong, Yu Dongling, Zhang Bingqiang
Department of Gastroenterology, University-Town Hospital of Chongqing Medical University, Chongqing, 401331, China.
Department of Gastroenterology, Chongqing Hospital of Jiangsu Province Hospital, The People's Hospital of Qijiang District, Chongqing, 401420, China.
Sci Rep. 2025 Apr 15;15(1):12864. doi: 10.1038/s41598-025-95385-0.
This study aimed to develop online calculators, built on machine learning models, that predict survival probabilities for early-onset and late-onset colorectal cancer (EOCRC and LOCRC) over a 1- to 8-year period. We extracted data on 117,965 CRC patients from a published database covering 2010 to 2021 and divided them into training and internal testing datasets. Data from 200 CRC patients at Chongqing Hospital of Jiangsu Province Hospital served as the external testing dataset. We performed univariate and multivariate regression analyses on the training dataset to identify key survival factors, which were then used to develop the predictive machine learning models. The models were evaluated on the internal and external testing datasets using AUC, accuracy, precision, recall, and F1 score. Web-based calculators were subsequently built to predict survival curves for EOCRC and LOCRC patients under different treatment strategies. In the multivariate Cox regression analysis, 16 variables were independently significant survival factors for EOCRC and 18 for LOCRC. In the EOCRC group, the machine learning models achieved AUC values of 0.880 and 0.804 in the internal and external testing cohorts, respectively. In the LOCRC group, the corresponding AUC values were 0.857 and 0.823. The online calculators, powered by the trained machine learning models, are accessible at https://eocrc-surv.streamlit.app/ and https://locrc-surv.streamlit.app/ . These tools estimate survival probabilities for EOCRC and LOCRC patients under various treatment strategies and display the corresponding post-treatment survival curves over the 1- to 8-year period. In conclusion, machine learning-based online calculators were developed that predict 1- to 8-year survival probabilities for EOCRC and LOCRC patients under various treatment strategies.
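The five evaluation metrics named in the abstract (AUC, accuracy, precision, recall, and F1 score) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes, purely for illustration, that survival at one fixed time horizon is treated as a binary label (1 = event, 0 = no event) and that a model outputs a score between 0 and 1. The function names, the 0.5 threshold, and the toy data are all hypothetical.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 after thresholding the scores.

    The 0.5 default threshold is an assumption, not a value from the study.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation of the AUC).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy labels and scores, invented for this example only.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
    print(classification_metrics(labels, scores))
    print("AUC:", roc_auc(labels, scores))
```

AUC is computed from the raw scores and does not depend on a threshold, whereas accuracy, precision, recall, and F1 all change with the chosen cut-off. This is one reason the two kinds of metric are usually reported side by side.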