Department of Data Statistics, Graduate School, Korea University, Seoul, Republic of Korea.
Medical Big Data Research Center, Research Institute of Clinical Medicine, Kyung Hee University Hospital at Gangdong, Seoul, Republic of Korea.
Medicine (Baltimore). 2024 Jun 14;103(24):e38584. doi: 10.1097/MD.0000000000038584.
Individual survival rates within a patient population have typically been investigated using the Cox proportional hazards model. This study aimed to evaluate the performance of machine learning algorithms in predicting survival beyond 5 years for individual patients with colorectal cancer (CRC). A total of 475 patients with complete data who had undergone surgery for CRC were analyzed to estimate each individual's probability of surviving more than 5 years, using machine learning based on penalized Cox regression. These individual estimates of survival beyond 5 years were then used for performance evaluation. Receiver operating characteristic curves were analyzed for the least absolute shrinkage and selection operator (LASSO) penalized model, the smoothly clipped absolute deviation (SCAD) penalized model, the unpenalized model, and the random survival forests (RSF) model. The LASSO penalized model had a mean area under the curve (AUC) of 0.67 ± 0.06, the SCAD penalized model had a mean AUC of 0.65 ± 0.07, and the unpenalized model had a mean AUC of 0.64 ± 0.09. Notably, the RSF model outperformed the others, with the highest mean AUC of 0.71 ± 0.05. Compared to the conventional unpenalized Cox model, recent machine learning techniques (LASSO, SCAD, RSF) showed advantages for data interpretation.
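The abstract reports each model's performance as a mean AUC ± standard deviation but does not spell out the evaluation procedure. As a minimal sketch only, the following shows one way such a summary could be computed. It assumes a binary outcome (1 = survived more than 5 years) with repeated held-out splits. The function names `roc_auc` and `summarize` and the toy data are hypothetical and not taken from the study. The sketch also ignores censoring, which a real survival analysis would need to handle.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(aucs):
    """Return (mean, sample standard deviation) of per-split AUC values."""
    return mean(aucs), stdev(aucs)


if __name__ == "__main__":
    # Hypothetical toy splits: (labels, predicted 5-year survival probabilities).
    # Label 1 means the patient survived more than 5 years (assumption: no censoring).
    splits = [
        ([1, 1, 0, 0, 1], [0.9, 0.7, 0.4, 0.6, 0.5]),
        ([1, 0, 0, 1, 0], [0.8, 0.3, 0.5, 0.6, 0.2]),
        ([0, 1, 1, 0, 1], [0.1, 0.9, 0.4, 0.5, 0.45]),
    ]
    aucs = [roc_auc(y, s) for y, s in splits]
    m, sd = summarize(aucs)
    print(f"mean AUC = {m:.2f} ± {sd:.2f}")
```

The pairwise definition is used here because it needs no external library and makes the meaning of AUC explicit: the probability that a randomly chosen survivor is ranked above a randomly chosen non-survivor.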