Feng Sisi, Zhou Manli, Huang Zixin, Xiao Xiaomin, Zhong Baiyun
Department of Clinical Laboratory, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China.
National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, 410008, Hunan, China.
Clin Exp Med. 2025 May 12;25(1):156. doi: 10.1007/s10238-025-01699-8.
Colorectal liver metastasis (CRLM) is a major contributor to poor prognosis in patients with colorectal cancer (CRC). This study aimed to develop and validate a machine learning (ML)-based risk prediction model that uses conventional clinical data to forecast the occurrence of CRLM. This retrospective study analyzed clinical data from 865 CRC patients collected between January 2018 and September 2024. Patients were categorized into non-CRLM and CRLM groups. Least absolute shrinkage and selection operator (LASSO) regression was used to identify key clinical variables, and five ML algorithms were used to develop prediction models. The optimal model was selected using the receiver operating characteristic curve, precision-recall curve, decision curve analysis, and calibration curve, which together assess predictive accuracy and clinical utility. Of the five algorithms, random forest performed best. Using random forest, we developed the CRLM-Lab6 prediction model, which incorporates six features: lactate dehydrogenase (LDH), carbohydrate antigen 19-9 (CA199), alanine aminotransferase (ALT), carcinoembryonic antigen (CEA), total bilirubin (TBIL), and AGR. The model achieved an area under the curve of 0.94, a sensitivity of 0.88, and a specificity of 0.93. To make it easier to use in practice, the model has been deployed as an accessible web application. By combining ML algorithms with conventional laboratory test data, this study produced a novel risk prediction model for estimating the likelihood of CRLM that shows excellent predictive performance and significant potential for clinical application.
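For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, and area under the ROC curve can be computed from a model's predicted probabilities. It is illustrative only: the labels and scores are made-up toy values, not the study's data; the function names are hypothetical; and the 0.5 decision threshold is an assumption, since the abstract does not state the cutoff used.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity), treating score >= threshold as predicted CRLM."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1  # CRLM patient correctly flagged
        elif label == 1:
            fn += 1  # CRLM patient missed
        elif predicted == 0:
            tn += 1  # non-CRLM patient correctly cleared
        else:
            fp += 1  # non-CRLM patient incorrectly flagged
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher than
    a random negative case (ties count as one half)."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 1 = CRLM, 0 = non-CRLM.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1, 0.05]

sens, spec = sensitivity_specificity(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the three figures are typically reported together.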