Hwangbo Suhyun, Kim Se Ik, Kim Ju-Hyun, Eoh Kyung Jin, Lee Chanhee, Kim Young Tae, Suh Dae-Shik, Park Taesung, Song Yong Sang
Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea.
Department of Obstetrics and Gynecology, Seoul National University College of Medicine, Seoul 03080, Korea.
Cancers (Basel). 2021 Apr 14;13(8):1875. doi: 10.3390/cancers13081875.
To support individualized disease management, we aimed to develop machine learning models that predict platinum sensitivity in patients with high-grade serous ovarian carcinoma (HGSOC). We reviewed the medical records of 1002 eligible patients and collected their clinicopathologic characteristics, surgical findings, details of chemotherapy, treatment response, and survival outcomes. Using stepwise selection based on area under the receiver operating characteristic curve (AUC) values, we selected six variables associated with platinum sensitivity: age, initial serum CA-125 levels, neoadjuvant chemotherapy, pelvic lymph node status, involvement of pelvic tissue other than the uterus and tubes, and involvement of the small bowel and mesentery. Based on these variables, predictive models were constructed using four machine learning algorithms: logistic regression (LR), random forest, support vector machine, and deep neural network. Model performance was evaluated with five-fold cross-validation. The LR-based model performed best at identifying platinum-resistant cases, with an AUC of 0.741. Adding the FIGO stage and residual tumor size after debulking surgery did not improve model performance. Based on the six-variable LR model, we also developed a web-based nomogram. The presented models may be useful in clinical practice and research.
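The evaluation scheme described above (a logistic regression model scored by AUC under five-fold cross-validation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the software, hyperparameters, or fold-assignment strategy the authors used, the data are synthetic rather than patient records, and the names `auc`, `fit_logistic`, `cross_validated_auc`, and `make_synthetic` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def auc(y_true, scores):
    # AUC as the fraction of (positive, negative) pairs ranked correctly;
    # tied scores count as half a correct pair.
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC requires both classes to be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, lr=0.1, epochs=200):
    # Logistic regression fitted by plain batch gradient descent
    # (an illustrative choice, not the authors' fitting procedure).
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def cross_validated_auc(X, y, k=5, seed=0):
    # Shuffle indices, split into k folds, and average the held-out AUC.
    # Folds are not stratified; that is a simplification of this sketch.
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    fold_aucs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_logistic([X[i] for i in train], [y[i] for i in train])
        scores = predict_proba(model, [X[i] for i in held_out])
        fold_aucs.append(auc([y[i] for i in held_out], scores))
    return sum(fold_aucs) / k


def make_synthetic(n, seed=0):
    # Synthetic two-feature data with a noisy linear signal (not clinical data).
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append([x0, x1])
        y.append(1 if x0 + 0.5 * x1 + rng.gauss(0, 1) > 0 else 0)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic(300)
    print(f"mean five-fold AUC: {cross_validated_auc(X, y):.3f}")
```

Averaging the held-out AUC across folds gives a single summary figure of the kind reported above, although the value printed here reflects only the synthetic data and has no bearing on the published result.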