Chen Po-Chuan, Yeh Yu-Min, Lin Bo-Wen, Chan Ren-Hao, Su Pei-Fang, Liu Yi-Chia, Lee Chung-Ta, Chen Shang-Hung, Lin Peng-Chan
Department of Surgery, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan.
Department of Oncology, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan 704, Taiwan.
Biomedicines. 2022 Feb 1;10(2):340. doi: 10.3390/biomedicines10020340.
Colorectal cancer (CRC) is one of the most prevalent malignant diseases worldwide. Risk prediction for tumor recurrence is important for making effective treatment decisions and for the survival outcomes of patients with CRC after surgery. Herein, we aimed to explore a prediction algorithm and the risk factors for postoperative tumor recurrence using a machine learning (ML) approach with standardized pathology reports for patients with stage II and III CRC.
Pertinent clinicopathological features were compiled from medical records and standardized pathology reports of patients with stage II and III CRC. Four ML models based on logistic regression (LR), random forest (RF), classification and regression decision trees (CARTs), and support vector machine (SVM) were applied for the development of the prediction algorithm. The area under the curve (AUC) of the ML models was determined in order to compare the prediction accuracy. Genomic studies were performed using a panel-targeted next-generation sequencing approach.
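As an illustration of the AUC comparison described above, the following minimal sketch computes an AUC directly from predicted risk scores using the rank-based (Mann-Whitney) identity. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study; the abstract does not state which software the authors used.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    (label 1, e.g. recurrence) receives a higher score than a randomly
    chosen negative case (label 0). Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, for illustration only).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.4, 0.5, 0.2, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the four models are compared.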
A total of 1073 patients who received curative-intent surgery at the National Cheng Kung University Hospital between January 2004 and January 2019 were included. Based on conventional statistical methods, chemotherapy (p = 0.003), endophytic tumor configuration (p = 0.008), TNM stage III disease (p < 0.001), pT4 (p < 0.001), pN2 (p < 0.001), increased numbers of lymph node metastases (p < 0.001), higher lymph node ratios (LNR) (p < 0.001), lymphovascular invasion (p < 0.001), perineural invasion (p < 0.001), tumor budding (p = 0.004), and neoadjuvant chemoradiotherapy (p = 0.025) were correlated with tumor recurrence in patients with stage II-III CRC. When the performance of the ML models for predicting cancer recurrence was compared, the AUCs for LR, RF, CART, and SVM were 0.678, 0.639, 0.593, and 0.581, respectively. The LR model had a better accuracy value of 0.87 and a specificity value of 1 in the testing set. Two prognostic factors, age and LNR, were selected by multivariable analysis and the four ML models. In terms of age, older patients received fewer cycles of chemotherapy and radiotherapy (p < 0.001). Right-sided colon tumors (p = 0.002), larger tumor sizes (p = 0.008) and tumor volumes (p = 0.049), TNM stage II disease (p < 0.001), and advanced pT3-4 stage disease (p = 0.04) were correlated with older age. However, pN2 disease (p = 0.005), lymph node metastasis number (p = 0.001), LNR (p = 0.004), perineural invasion (p = 0.018), and overall survival rate (p < 0.001) were lower in older patients. Furthermore, […] and […] mutations (p = 0.032 and 0.039, respectively) were more frequently found in older patients with stage II-III CRC than in their younger counterparts.
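The quantities reported above (LNR, accuracy, specificity) follow from simple ratios, sketched below. The abstract does not define LNR; the sketch assumes the common definition of metastatic (positive) lymph nodes divided by lymph nodes examined. The function names and all counts are hypothetical and chosen only to show the arithmetic; they are not the study's data.

```python
def lymph_node_ratio(positive_nodes, examined_nodes):
    """LNR under the assumed definition: positive nodes / examined nodes."""
    if examined_nodes <= 0:
        raise ValueError("examined_nodes must be positive")
    if not 0 <= positive_nodes <= examined_nodes:
        raise ValueError("positive_nodes must lie between 0 and examined_nodes")
    return positive_nodes / examined_nodes


def accuracy_and_specificity(tp, fp, tn, fn):
    """Accuracy = (TP + TN) / all cases; specificity = TN / (TN + FP)."""
    total = tp + fp + tn + fn
    if total == 0 or (tn + fp) == 0:
        raise ValueError("need at least one case and at least one true negative or false positive")
    accuracy = (tp + tn) / total
    specificity = tn / (tn + fp)
    return accuracy, specificity


# Hypothetical patient: 3 metastatic nodes among 20 examined.
lnr = lymph_node_ratio(3, 20)

# Hypothetical test-set confusion counts (not from the study).
acc, spec = accuracy_and_specificity(tp=5, fp=0, tn=80, fn=15)
print(f"LNR = {lnr:.2f}, accuracy = {acc:.2f}, specificity = {spec:.2f}")
```

A specificity of 1 means only that no non-recurrent case was flagged as recurrent; it does not describe how many recurrences were detected, so sensitivity should be read alongside it.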
This study demonstrated that ML models have comparable predictive power for determining cancer recurrence in patients with stage II-III CRC after surgery. Advanced age and high LNR were significant risk factors for cancer recurrence, as determined by ML algorithms and multivariable analyses. Distinctive genomic profiles may contribute to discrete clinical behaviors and survival outcomes between patients of different age groups. Studies incorporating complete molecular and genomic profiles in cancer prediction models are beneficial for patients with stage II-III CRC.