Department of Health Information Management, Student Research Committee, School of Health Management and Information Sciences Branch, Iran University of Medical Sciences, Tehran, Iran.
Biomed Eng Online. 2024 Feb 12;23(1):18. doi: 10.1186/s12938-024-01219-x.
Ovarian cancer (OC) is a prevalent and aggressive malignancy that poses a significant public health challenge. The lack of preventive strategies for OC increases morbidity, mortality, and other negative consequences. Screening for OC through risk prediction could be a powerful preventive strategy, but it has received little attention. This study therefore aimed to use machine learning approaches as predictive decision aids to screen groups at high risk of OC for practical preventive purposes.
This data-driven, retrospective study used data from 1516 women with suspected OC, drawn from one centralized database covering six clinical settings in Sari City from 2015 to 2019. Six machine learning (ML) algorithms, namely XG-Boost, Random Forest (RF), J-48, support vector machine (SVM), K-nearest neighbor (KNN), and artificial neural network (ANN), were used to construct prediction models for OC. To choose the best model for predicting OC, we compared the prediction models using the area under the receiver operating characteristic curve (AU-ROC).
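The AU-ROC comparison described above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the function names `roc_auc` and `best_model_by_auc`, the model labels, and the toy labels and scores are all hypothetical, and the AU-ROC is computed with the rank-based (Mann-Whitney) formulation rather than any particular library.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AU-ROC as the probability that a randomly chosen positive case is
    scored higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AU-ROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_model_by_auc(
    labels: Sequence[int], model_scores: Dict[str, Sequence[float]]
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate model on the same labels and return the name
    of the model with the highest AU-ROC, plus the full AU-ROC table."""
    aucs = {name: roc_auc(labels, s) for name, s in model_scores.items()}
    best = max(aucs, key=lambda name: aucs[name])
    return best, aucs


# Toy, made-up data: 1 = OC, 0 = no OC; scores are predicted risks.
y_true = [1, 1, 1, 0, 0, 0]
toy_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "model_b": [0.6, 0.3, 0.7, 0.8, 0.4, 0.2],
}
best_name, auc_table = best_model_by_auc(y_true, toy_scores)
print(best_name, auc_table)
```

In practice each entry of `toy_scores` would hold one fitted model's predicted probabilities on a held-out set, so that all six algorithms are ranked on identical cases.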
The experimental results showed that XG-Boost, with AU-ROC = 0.93 (95% CI = [0.91-0.95]), was the best-performing model for predicting OC.
ML approaches offer significant predictive efficiency and interoperability, and can support powerful preventive strategies by screening groups at high risk of OC.