Piedimonte Sabrina, Feigenberg Tomer, Drysdale Erik, Kwon Janice, Gotlieb Walter H, Cormier Beatrice, Plante Marie, Lau Susie, Helpman Limor, Renaud Marie-Claude, May Taymaa, Vicus Danielle
Division of Gynecologic Oncology, University of Toronto, Toronto, Ontario, Canada.
Trillium Health Partners, Mississauga, Toronto, Ontario, Canada.
J Surg Oncol. 2022 Nov;126(6):1096-1103. doi: 10.1002/jso.27008. Epub 2022 Jul 12.
To develop machine-learning models to predict recurrence and time-to-recurrence in high-grade endometrial cancer (HGEC) following surgery and tailored adjuvant treatment.
Data on 1237 patients were retrospectively collected from eight Canadian centers. Four models were trained to predict recurrence: a random forest, boosted trees, and two neural networks. Receiver operating characteristic curves were used to select the best model, defined as the one with the highest area under the curve (AUC). For time to recurrence, we compared a random forest and a Least Absolute Shrinkage and Selection Operator (LASSO) model with a Cox proportional hazards model.
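The AUC-based selection step can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `select_best_model`, the toy labels and scores, and the assumption that models are compared on a held-out validation set are all hypothetical. The AUC is computed in its pairwise (Mann-Whitney) form, that is, the fraction of recurrence/non-recurrence pairs in which the recurrence case receives the higher predicted score.

```python
def auc(labels, scores):
    """Area under the ROC curve in its pairwise (Mann-Whitney) form.

    labels: 1 = recurrence, 0 = no recurrence.
    scores: predicted probability (or any risk score) of recurrence.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # recurrence case ranked above non-recurrence case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def select_best_model(val_labels, val_scores_by_model):
    """Return the name of the model with the highest AUC, plus all AUCs."""
    aucs = {name: auc(val_labels, scores)
            for name, scores in val_scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical validation-set predictions from two candidate models.
val_labels = [1, 0, 1, 0, 1, 0]
val_scores_by_model = {
    "random_forest": [0.9, 0.2, 0.7, 0.4, 0.6, 0.5],
    "boosted_trees": [0.8, 0.6, 0.5, 0.3, 0.7, 0.4],
}
best, aucs = select_best_model(val_labels, val_scores_by_model)
print(best, aucs)
```

The pairwise form is quadratic in the number of patients, which is acceptable for a cohort of this size; a rank-based implementation would give the same value more efficiently.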
The random forest was the best model to predict recurrence in HGEC; the AUCs were 85.2%, 74.1%, and 71.8% in the training, validation, and test sets, respectively. The top five predictors were: stage, uterus height, specimen weight, adjuvant chemotherapy, and preoperative histology. Performance increased to 77% and 80% when stratified by Stage III and IV, respectively. For time to recurrence, there was no difference between the LASSO and Cox proportional hazards models (c-index 71%). The random forest had a c-index of 60.5%.
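For readers unfamiliar with the c-index, the sketch below shows one common definition, Harrell's concordance index. The abstract does not state which variant was used, so this choice is an assumption, and the function name `concordance_index` and the toy data are hypothetical. A pair of patients is comparable when the one with the shorter follow-up had an observed recurrence; the pair is concordant when that patient also received the higher predicted risk.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored time-to-event data.

    times: follow-up time for each patient.
    events: 1 if recurrence was observed, 0 if censored.
    risk_scores: higher score = higher predicted risk (earlier recurrence).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier recurrence, higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: months to recurrence or censoring, event flags, risks.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.7, 0.5, 0.1]
print(concordance_index(times, events, risk_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported values of 71% and 60.5%.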
A bootstrap random forest model may be a more accurate technique to predict recurrence in HGEC using multiple clinicopathologic factors. For time to recurrence, machine-learning methods performed similarly to the Cox proportional hazards model.