Zeng Da-Tong, Li Ming-Jie, Lin Rui, Huang Wei-Jian, Li Shi-De, Huang Wan-Ying, Li Bin, Li Qi, Chen Gang, Jiang Jia-Shu
Department of Pathology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi Zhuang Autonomous Region, China.
Department of Pathology, Red Cross Hospital of Yulin, Yulin 537000, Guangxi Zhuang Autonomous Region, China.
World J Clin Oncol. 2025 Aug 24;16(8):107306. doi: 10.5306/wjco.v16.i8.107306.
Ki-67 is routinely tested in clinical pathology departments. However, its prognostic value requires further investigation, and research applying machine learning (ML) to Ki-67 remains relatively underdeveloped.
To investigate the prognostic value of Ki-67 in cases of colorectal carcinoma (CRC) and explore the potential application of ML algorithms to predict the Ki-67 index.
Case data and pathological sections were systematically collected from two centers. Multiple cutoff values were established to analyze the prognostic value of the Ki-67 index in CRC. In parallel, three mainstream ML algorithms, namely support vector machine (SVM), random forest (RF), and eXtreme gradient boosting (XGBoost), were used to construct prediction models based on the histological features of hematoxylin and eosin-stained CRC images. The potential of these algorithms to classify and predict the Ki-67 index was then explored.
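The cutoff-based grouping described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the cutoff grid (10% to 90% in steps of 10), the function name `dichotomize`, and the sample Ki-67 values are all assumptions made for the example, since the abstract states only that multiple cutoffs were established.

```python
from typing import Dict, List

# Hypothetical cutoff grid in percent; the abstract does not list the exact values used.
CUTOFFS = list(range(10, 100, 10))


def dichotomize(ki67_percent: List[float], cutoffs: List[int] = CUTOFFS) -> Dict[int, List[int]]:
    """For every cutoff, label each case 1 (Ki-67 >= cutoff) or 0 (Ki-67 < cutoff)."""
    return {c: [1 if v >= c else 0 for v in ki67_percent] for c in cutoffs}


# Made-up Ki-67 indices (percent) for five illustrative cases.
sample = [15.0, 40.0, 55.0, 80.0, 92.0]
labels = dichotomize(sample)

print(labels[40])  # high/low grouping at the 40% cutoff
print(labels[50])  # high/low grouping at the 50% cutoff
print(labels[80])  # high/low grouping at the 80% cutoff
```

Each resulting 0/1 grouping could then serve as the grouping variable for a survival comparison or as the target label for a classifier.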
Non-parametric tests revealed that Ki-67 ≥ 40% correlated with a high histological grade (P = 0.017), that deficient mismatch repair protein status was associated with the ≥ 50% to ≥ 90% cutoffs (all P ≤ 0.028), and that Ki-67 ≥ 80% was linked to lymph node metastasis (P = 0.006). Kaplan-Meier analysis showed that Ki-67 ≥ 50% predicted higher survival (log-rank P = 0.0299, hazard ratio = 2.142), with no differences for other cutoffs. Cox regression identified the Ki-67 positive rate as a significant predictor (P = 0.027, hazard ratio = 2.583), while other variables showed no association. In algorithmic model predictions, the SVM, RF, and XGBoost models achieved training area under the curve (AUC) values of 0.851, 0.948, and 0.872, respectively, with corresponding test set AUC values of 0.795, 0.755, and 0.750. During external validation, their AUC values for predicting Ki-67 status reached 0.757, 0.749, and 0.783, respectively.
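To make the AUC figures above easier to interpret, the sketch below computes AUC from its rank-based definition and then tabulates the gap between training and test AUC for each model, using only the values reported in the abstract. The function name `roc_auc` and the four toy labels and scores are assumptions for illustration. How the authors computed AUC is not stated, so this should not be read as their implementation.

```python
from typing import Dict, List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): 1 = high Ki-67, 0 = low Ki-67, with model scores.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# AUC values as reported in the abstract.
reported: Dict[str, Dict[str, float]] = {
    "SVM": {"train": 0.851, "test": 0.795, "external": 0.757},
    "RF": {"train": 0.948, "test": 0.755, "external": 0.749},
    "XGBoost": {"train": 0.872, "test": 0.750, "external": 0.783},
}

# Training-minus-test gap; a larger gap suggests more overfitting to the training set.
gaps = {name: round(v["train"] - v["test"], 3) for name, v in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: train-test AUC gap = {gap:.3f}")
```

On the reported numbers, RF shows the largest gap between training and test AUC, while SVM shows the smallest.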