A precise machine learning model: Detecting cervical cancer using feature selection and explainable AI.

Authors

Shakil Rashiduzzaman, Islam Sadia, Akter Bonna

Affiliation

Department of Computer Science and Engineering, Daffodil International University, Dhaka, Birulia 1216, Bangladesh.

Publication information

J Pathol Inform. 2024 Sep 26;15:100398. doi: 10.1016/j.jpi.2024.100398. eCollection 2024 Dec.

Abstract

Cervical cancer remains a significant global health challenge. Because of inadequate early-stage screening and healthcare disparities, a large number of women suffer from this disease, and the mortality rate continues to rise. In this study, we present a precise approach that uses six machine learning models (decision tree, logistic regression, naïve Bayes, random forest, k-nearest neighbors, and support vector machine) to predict early-stage cervical cancer by analysing 36 risk-factor attributes of 858 individuals. Two data balancing techniques, the Synthetic Minority Oversampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN), were used to mitigate data imbalance. Two feature selection methods, Chi-square and the Least Absolute Shrinkage and Selection Operator (LASSO), were applied to rank the features most strongly associated with the disease, and an explainable artificial intelligence technique, Shapley Additive Explanations (SHAP), was integrated to clarify the model outcomes. The models were evaluated with the following performance metrics: accuracy, sensitivity, specificity, precision, F1-score, false-positive rate, false-negative rate, and area under the receiver operating characteristic curve. With Chi-square feature selection, the decision tree (DT) outperformed the other models, achieving 97.60% accuracy, 98.73% sensitivity, 80% specificity, and 98.73% precision. On the imbalanced data, DT achieved 97% accuracy, 99.35% sensitivity, 69.23% specificity, and 97.45% precision. This research aims to develop diagnostic frameworks with automated tools that improve the detection and management of cervical cancer and help healthcare professionals deliver more efficient and personalized care to their patients.
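
The threshold-based metrics named in the abstract all follow from the four cells of a binary confusion matrix. The sketch below shows one way they could be computed. It is an illustration under stated assumptions, not the authors' code: the names `ConfusionCounts`, `confusion_counts`, and `evaluation_metrics` are hypothetical, the choice of `1` as the positive label is an assumption, and the toy labels are made up rather than taken from the study's data. The area under the ROC curve is left out because it needs predicted scores rather than hard labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def confusion_counts(y_true, y_pred, positive=1):
    """Tally the four cells; `positive=1` is an assumed label encoding."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else float("nan")


def evaluation_metrics(c):
    """Compute the threshold-based metrics listed in the abstract."""
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true-positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true-negative rate
        "precision": _ratio(c.tp, c.tp + c.fp),
        "f1_score": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
        "false_negative_rate": _ratio(c.fn, c.fn + c.tp),
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    for name, value in evaluation_metrics(confusion_counts(y_true, y_pred)).items():
        print(f"{name}: {value:.4f}")
```

Because sensitivity and specificity are computed on different classes, a model can score high on one and much lower on the other when the classes are imbalanced, which is why the abstract reports both alongside accuracy.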

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/244a/11530914/3555f1dc52c5/gr1.jpg
