Zhai Furui, Mu Shanshan, Song Yinghui, Zhang Min, Zhang Cui, Lv Ze
Gynecological Clinic, Cangzhou Central Hospital, Cangzhou City, Hebei Province, People's Republic of China.
Cancer Manag Res. 2024 Sep 6;16:1175-1187. doi: 10.2147/CMAR.S484057. eCollection 2024.
This study aims to develop a machine learning (ML) model to predict the risk of residual or recurrent high-grade cervical intraepithelial neoplasia (CIN) after loop electrosurgical excision procedure (LEEP), addressing a critical gap in personalized follow-up care.
A retrospective analysis was conducted of 532 patients who underwent LEEP for high-grade CIN at Cangzhou Central Hospital (2016-2020). In the final analysis, 99 women (18.6%) had residual or recurrent high-grade CIN (CIN2 or worse) within five years of follow-up. Four feature selection methods were used to identify significant predictors of residual or recurrent CIN. Eight ML algorithms were evaluated by AUROC, accuracy, sensitivity, specificity, PPV, NPV, and F1 score, together with calibration curves and decision curve analysis. The model was optimized and validated with fivefold cross-validation, and SHAP analysis was used to assess feature importance.
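The threshold-based metrics named above, and the net benefit plotted in a decision curve, all follow from the four cells of a confusion matrix. The sketch below is illustrative only: the function names (`tally`, `metrics`, `net_benefit`) and the toy labels are hypothetical and are not taken from the study's code or data. Net benefit uses the standard definition TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # recurrence predicted, recurrence observed
    fp: int  # recurrence predicted, none observed
    tn: int  # no recurrence predicted, none observed
    fn: int  # no recurrence predicted, recurrence observed


def tally(y_true, y_pred):
    """Count confusion-matrix cells from binary labels (1 = residual/recurrent CIN)."""
    pairs = list(zip(y_true, y_pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
    )


def _ratio(num, den):
    # Return NaN instead of raising when a denominator is empty.
    return num / den if den else float("nan")


def metrics(c):
    """Accuracy, sensitivity, specificity, PPV, NPV and F1 from one confusion matrix."""
    n = c.tp + c.fp + c.tn + c.fn
    sens = _ratio(c.tp, c.tp + c.fn)
    ppv = _ratio(c.tp, c.tp + c.fp)
    return {
        "accuracy": _ratio(c.tp + c.tn, n),
        "sensitivity": sens,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": ppv,
        "npv": _ratio(c.tn, c.tn + c.fn),
        "f1": _ratio(2 * ppv * sens, ppv + sens),
    }


def net_benefit(c, threshold):
    """Net benefit at one threshold probability (0 < threshold < 1)."""
    n = c.tp + c.fp + c.tn + c.fn
    return c.tp / n - (c.fp / n) * threshold / (1 - threshold)


if __name__ == "__main__":
    # Hypothetical toy labels, NOT data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    c = tally(y_true, y_pred)
    print(metrics(c))
    print(net_benefit(c, threshold=0.2))
```

A decision curve repeats the `net_benefit` calculation across a range of thresholds, re-deriving the predicted labels at each one, and compares the model against the "treat all" and "treat none" strategies. AUROC and calibration are computed from predicted probabilities rather than from a single confusion matrix, so they are not covered here.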
Of the eight algorithms, XGBoost showed the best predictive performance, with the highest AUROC. The optimized model included six key predictors: age, ThinPrep cytologic test (TCT) results, HPV classification, CIN severity, glandular involvement, and margin status. SHAP analysis identified CIN severity and margin status as the most influential predictors. An online prediction tool was developed for real-time risk assessment.
This ML-based model for predicting residual or recurrent high-grade CIN after LEEP represents a significant advance in gynecologic oncology, supporting personalized patient care, early intervention, and informed clinical decision-making.