

Data-Driven Cervical Cancer Prediction Model with Outlier Detection and Over-Sampling Methods.

Affiliations

Department of Industrial and Systems Engineering, Dongguk University-Seoul, Seoul 04620, Korea.

Department of Software, Sejong University, Seoul 05006, Korea.

Publication information

Sensors (Basel). 2020 May 15;20(10):2809. doi: 10.3390/s20102809.

DOI:10.3390/s20102809
PMID:32429090
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7284557/
Abstract

Globally, cervical cancer remains among the most prevalent cancers in women. It is therefore important to determine the relative importance of its risk factors in order to classify potential patients. The present work proposes a cervical cancer prediction model (CCPM) that offers early prediction of cervical cancer using risk factors as inputs. The CCPM first removes outliers using an outlier detection method, either density-based spatial clustering of applications with noise (DBSCAN) or isolation forest (iForest). It then increases the number of cases in the dataset in a balanced way through over-sampling, using either the synthetic minority over-sampling technique (SMOTE) or SMOTE with Tomek links (SMOTETomek). Finally, it employs random forest (RF) as the classifier. The CCPM is thus evaluated in four scenarios: (1) DBSCAN + SMOTETomek + RF, (2) DBSCAN + SMOTE + RF, (3) iForest + SMOTETomek + RF, and (4) iForest + SMOTE + RF. A dataset of 858 potential patients was used to validate the performance of the proposed method. The combinations of iForest with SMOTE and iForest with SMOTETomek performed better than those of DBSCAN with SMOTE and DBSCAN with SMOTETomek. RF also performed best among several popular machine learning classifiers, and the proposed CCPM showed better accuracy than previously proposed methods for predicting cervical cancer. In addition, a mobile application was developed that collects cervical cancer risk factor data and returns CCPM results, so that prompt and appropriate action can be taken at the initial stage of cervical cancer.
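The three stages described in the abstract (outlier removal, over-sampling, RF classification) can be sketched roughly as follows for scenario (4), iForest + SMOTE + RF. This is an illustrative sketch, not the authors' implementation: the data are synthetic (generated with `make_classification` as a stand-in for the 858-record risk-factor dataset), the `contamination`, `k`, and `n_estimators` values are arbitrary assumptions, `smote_oversample` and `run_pipeline` are hypothetical helper names, and the over-sampler is a simplified hand-written SMOTE for binary labels. Applying the outlier and over-sampling steps to the training split only is also an assumption; the abstract does not state the evaluation protocol.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors


def smote_oversample(X, y, k=5, seed=0):
    """Simplified SMOTE (binary labels): add synthetic minority samples by
    interpolating between a minority point and one of its k nearest
    minority neighbours until both classes have the same size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    X_min = X[y == minority]
    n_new = counts.max() - counts.min()
    if n_new == 0:
        return X, y
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)           # seed points
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                             # position on the segment
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    return np.vstack([X, synthetic]), np.concatenate([y, np.full(n_new, minority)])


def run_pipeline(seed=0):
    # Synthetic, imbalanced stand-in data (assumption, not the paper's dataset).
    X, y = make_classification(n_samples=858, n_features=10, n_informative=5,
                               weights=[0.9, 0.1], random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    # Step 1: drop training rows that iForest labels as outliers (-1).
    flags = IsolationForest(contamination=0.05, random_state=seed).fit_predict(X_train)
    X_in, y_in = X_train[flags == 1], y_train[flags == 1]

    # Step 2: balance the classes with the simplified SMOTE above.
    X_bal, y_bal = smote_oversample(X_in, y_in, k=5, seed=seed)

    # Step 3: train a random forest and score it on the untouched test split.
    clf = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X_bal, y_bal)
    return y_bal, clf.score(X_test, y_test)


y_bal, accuracy = run_pipeline()
print("class counts after over-sampling:", np.bincount(y_bal))
print("test accuracy:", round(accuracy, 3))
```

For the DBSCAN scenarios, `sklearn.cluster.DBSCAN` could take the place of `IsolationForest`: its `fit_predict` labels noise points as `-1`, though `eps` and `min_samples` would need tuning for the data at hand. In practice the `imbalanced-learn` package provides `SMOTE` and `SMOTETomek` classes that would replace the hand-written over-sampler.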


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c755/7284557/8b03498db005/sensors-20-02809-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c755/7284557/4ef5a2b461f5/sensors-20-02809-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c755/7284557/4a989c3de015/sensors-20-02809-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c755/7284557/0ded69f5d83d/sensors-20-02809-g004.jpg


