Cervical cancer prediction using machine learning models based on routine blood analysis.

Authors

Su Jie, Lu Hui, Zhang Ruihuan, Cui Na, Chen Chao, Si Qin, Song Biao

Affiliations

Medical Neurobiology Laboratory, Inner Mongolia Medical University, Hohhot, 010030, China.

College of Computer Science, Inner Mongolia University, Hohhot, 010021, China.

Publication information

Sci Rep. 2025 Jul 2;15(1):22655. doi: 10.1038/s41598-025-08166-0.


DOI:10.1038/s41598-025-08166-0
PMID:40594680
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12216743/
Abstract

Cervical cancer (CC) is the fourth most common cancer among women globally. The key to preventing and treating CC is early detection, diagnosis, and treatment. This study aimed to develop an interpretable model for predicting CC risk using routine blood data. The primary endpoint variable is the occurrence of CC, as confirmed by histopathological diagnosis. We used the SHapley Additive exPlanations (SHAP) method to provide interpretability and identify key factors associated with CC. In this retrospective study, medical records of patients from 2013 to 2023 were collected. A total of 2,503 patients diagnosed with CC were included in the case group, while the control group comprised 3,794 patients without apparent signs of the disease, including women with other gynecological conditions as well as healthy individuals undergoing routine check-ups. Age, clinical diagnosis information, and 22 blood cell analysis results were considered. Four different algorithms were applied to construct models for estimating the likelihood of CC occurrence. Using the least absolute shrinkage and selection operator (LASSO) and the random forest (RF) method, 15 key routine blood features were ultimately selected from an initial set of 23 features for model training. These features include age, red blood cell count (RBC), platelet distribution width (PDW), white blood cell count (WBC), lymphocyte percentage (LYMPH%), basophil count (BASO), basophil percentage (BASO%), lymphocyte absolute value (LYMPH), neutrophil percentage (NEUT%), hemoglobin (HGB), mean corpuscular hemoglobin concentration (MCHC), red cell distribution width (R-CV), mean platelet volume (MPV), plateletcrit (PCT), and [...]. Among the four models, the extreme gradient boosting (XGBoost) model achieved the highest predictive performance, with an area under the curve (AUC) of 0.964. In contrast, the RF model exhibited the poorest generalization ability, with an AUC of 0.907. The SHAP method revealed the top 6 predictors of CC according to the importance ranking, and the average platelet distribution width (PDW) was recognized as the most important predictor variable for CC occurrence (the primary endpoint variable).
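The abstract relies on SHAP to rank predictors. As a rough illustration of the idea behind it, the sketch below computes exact Shapley values by brute force for a tiny hand-written scoring function. Everything in it is an assumption made for illustration: `toy_risk_score`, its coefficients, the input values, and the all-zero baseline are invented and are not the study's model or data. The SHAP library itself uses faster, model-specific algorithms and a background dataset rather than a single baseline vector.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to baseline.

    A feature outside the coalition is set to its baseline value.
    Cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


# Hypothetical toy model: invented coefficients, NOT the paper's XGBoost model.
FEATURES = ["PDW", "age", "WBC"]


def toy_risk_score(v):
    pdw, age, wbc = v
    return 0.5 * pdw + 2.0 * age + pdw * wbc  # includes one interaction term


x = [1.0, 2.0, 3.0]         # invented, standardized-looking inputs
baseline = [0.0, 0.0, 0.0]  # assumed reference point

phi = shapley_values(toy_risk_score, x, baseline)

# Rank features by absolute contribution, as a SHAP importance plot would.
ranking = sorted(zip(FEATURES, phi), key=lambda pair: abs(pair[1]), reverse=True)
for name, contribution in ranking:
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the score at `x` and the score at `baseline`, which is the additivity property that gives SHAP its name. The interaction term `pdw * wbc` is split equally between the two features involved.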

