Wang Wenwen, Xu Yang, Yuan Suzhen, Li Zhiying, Zhu Xin, Zhou Qin, Shen Wenfeng, Wang Shixuan
Department of Obstetrics and Gynecology, Tongji Medical College, Tongji Hospital, Huazhong University of Science and Technology, Wuhan, China.
School of Computer Engineering and Science, Shanghai University, Shanghai, China.
Front Med (Lausanne). 2022 Mar 4;9:851890. doi: 10.3389/fmed.2022.851890. eCollection 2022.
Endometrial carcinoma (EC) is a common cause of cancer death in women, and an accurate model for early prediction of the disease is crucial. This study aimed to develop a new machine learning (ML)-based diagnostic prediction model for EC. We collected data from consecutive patients seen between November 2012 and January 2021 at tertiary hospitals in central China. Eligible patients were women who underwent endometrial biopsy, dilation and curettage, or hysterectomy. Nine features, drawn from patient demographics, vital signs, and laboratory and ultrasound results, were selected for the final analysis. The new model combines the three best-performing ML methods: logistic regression, gradient-boosted decision tree, and random forest. A total of 1,922 patients were eligible for final analysis and modeling. The resulting ensemble model, called TJHPEC, was validated in one internal validation cohort and two external validation cohorts. The area under the receiver operating characteristic curve (AUC) values were 0.9346, 0.8341, and 0.8649 for the prediction of EC overall and 0.9347, 0.8073, and 0.871 for the prediction of stage I EC. All nine clinical features were confirmed to be highly related to the prediction of EC in TJHPEC. In conclusion, our new model may be accurate for identifying EC, especially at an early stage, in the general population of central China.
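The abstract does not state how the three base models are combined into the ensemble. As a minimal sketch only, the Python below assumes unweighted soft voting (averaging each model's predicted probability) and scores the result with a rank-based AUC. The function names `soft_vote` and `roc_auc`, and all of the numbers, are hypothetical illustrations and are not taken from the study.

```python
from typing import List, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average per-patient probabilities across models (unweighted soft voting).

    Assumption: the study's actual combination rule is not given in the abstract.
    """
    n_models = len(prob_lists)
    return [sum(p) / n_models for p in zip(*prob_lists)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy labels: 1 = EC, 0 = no EC. Not data from the study.
y_true = [1, 1, 1, 0, 0, 0]

# Hypothetical predicted probabilities from the three base models.
p_lr = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]    # logistic regression
p_gbdt = [0.8, 0.7, 0.3, 0.5, 0.3, 0.2]  # gradient-boosted decision tree
p_rf = [0.7, 0.5, 0.6, 0.4, 0.1, 0.3]    # random forest

p_ensemble = soft_vote([p_lr, p_gbdt, p_rf])
print(f"ensemble AUC = {roc_auc(y_true, p_ensemble):.4f}")
```

Computing the AUC separately on each validation cohort, as the study reports, would mean calling `roc_auc` once per cohort with that cohort's labels and ensemble scores.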