Development, validation, and clinical application of a machine learning model for risk stratification and management of cervical cancer screening based on full-genotyping hrHPV test (SMART-HPV): a modelling study.

Author information

Dong Binhua, Lu Zhen, Yang Tianjie, Wang Junfeng, Zhang Yan, Tuo Xunyuan, Wang Juntao, Lin Shaomei, Cai Hongning, Cheng Huan, Cao Xiaoli, Huang Xinxin, Zheng Zheng, Miao Chong, Wang Yue, Xue Huifeng, Xu Shuxia, Liu Xianhua, Zou Huachun, Sun Pengming

Affiliations

Department of Gynecology, Fujian Key Laboratory of Women and Children's Critical Diseases Research, Fujian Maternity and Child Health Hospital, College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, Fujian, China.

Fujian Clinical Research Center for Gynecological Oncology, Fuzhou, Fujian, China.

Publication information

Lancet Reg Health West Pac. 2025 Jan 25;55:101480. doi: 10.1016/j.lanwpc.2025.101480. eCollection 2025 Feb.

Abstract

BACKGROUND

High-risk human papillomavirus (hrHPV) full genotyping improves risk stratification and efficiency in cervical cancer screening and has been widely verified and adopted in various screening settings. We aimed to develop a cervical cancer predictive model that uses hrHPV full genotyping data to guide referrals for colposcopy in settings where the screening rate is low.

METHODS

We developed, compared, and validated four machine learning models (eXtreme gradient boosting [XGBoost], support vector machine [SVM], random forest [RF], and naïve Bayes [NB]) for cervical cancer prediction, using data from a national cervical cancer screening project conducted in 267 healthcare centers in China. Cervical intraepithelial neoplasia grade 2 or worse (CIN2+) and CIN3+ were the primary and secondary outcomes, respectively. Discrimination was evaluated across various screening settings in China using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, the area under the precision-recall curve (AUPRC), and accuracy. Calibration and clinical utility were assessed with the Brier score, calibration curves, and decision curve analysis (DCA).
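As an illustration of the discrimination and calibration metrics named above, the following minimal sketch computes AUROC, sensitivity, specificity, and the Brier score in plain Python. It is not the authors' code: the function names, the 0.5 cut-off, and the six-row toy dataset are assumptions made up for this example, and AUROC is computed by direct pairwise comparison, which is only practical for small samples.

```python
def auroc(y_true, y_prob):
    """AUROC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome
    (0 is perfect; lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_specificity(y_true, y_prob, threshold):
    """Sensitivity and specificity when cases with p >= threshold are
    called positive."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = CIN2+ confirmed, 0 = not confirmed.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]

print("AUROC:", auroc(y_true, y_prob))
print("Brier score:", brier_score(y_true, y_prob))
print("Sensitivity, specificity at 0.5:",
      sensitivity_specificity(y_true, y_prob, 0.5))
```

On datasets of the size reported in this study, a rank-based library implementation would be the practical choice; the pairwise version is shown only because it makes the definition explicit.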

FINDINGS

A total of 1,112,846 women were recruited, of whom 599,043 were included in the analysis based on hrHPV full genotyping. Of these, 254,434 (age [years, median, IQR]: 48, 42-54), 297,479 (49, 43-55), 38,500 (37, 32-44), 1950 (38, 33-46), 1590 (53, 47-58), 779 (38, 31-49) and 4311 (40, 33-50) were in the development, temporal validation and external validation 1-5 datasets, respectively. The final simplified clinical risk prediction model includes hrHPV, number of HPV genotypes, cervical cytology, HPV16, HPV18, age, HPV52, HPV39 and gynecological examination. The final optimal XGBoost model for predicting CIN2+ showed good discrimination (AUROC, maximum 0.989 [0.987-0.992]; minimum 0.781 [0.74-0.819]) and calibration (Brier score, maximum 0.118 [0.099-0.137]) in the five external validation sets. DCA showed that the optimal XGBoost model for predicting CIN2+ provided a superior standardized net benefit when the clinical decision threshold probability was less than 0.80. The optimal XGBoost model obtained similar results in predicting CIN3+.
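The decision curve result can be read through the usual net benefit formula, NB = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability; dividing by the outcome prevalence gives the standardized net benefit. The sketch below applies that textbook definition to a hypothetical six-row dataset. The function names, the toy data, and the 0.30 threshold are assumptions for illustration only and do not come from the study.

```python
def net_benefit(y_true, y_prob, pt):
    """Net benefit of referring everyone whose predicted risk is >= pt."""
    if not 0 < pt < 1:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= pt)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= pt)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * pt / (1 - pt)


def net_benefit_refer_all(y_true, pt):
    """Net benefit of the reference strategy that refers every woman."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * pt / (1 - pt)


def standardized_net_benefit(y_true, y_prob, pt):
    """Net benefit divided by prevalence, so a perfect model scores 1."""
    prevalence = sum(y_true) / len(y_true)
    return net_benefit(y_true, y_prob, pt) / prevalence


# Hypothetical toy data: 1 = CIN2+ confirmed, 0 = not confirmed.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]
pt = 0.30  # assumed threshold probability for the example

print("Model net benefit:", net_benefit(y_true, y_prob, pt))
print("Refer-all net benefit:", net_benefit_refer_all(y_true, pt))
print("Standardized net benefit:", standardized_net_benefit(y_true, y_prob, pt))
```

A decision curve repeats this calculation over a grid of thresholds and plots the model against the refer-all and refer-none (net benefit 0) strategies; a model is useful at a given threshold when its curve lies above both.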

INTERPRETATION

We developed a cervical cancer screening risk prediction model that employs hrHPV full genotyping and simple test results to achieve risk prediction and stratified management for colposcopy referrals. This predictive tool is particularly suitable for settings with low screening rates.

FUNDING

National Natural Science Foundation of China; Major Scientific Research Program for Young and Middle-aged Health Professionals of Fujian Province, China; Fujian Province Central Government-Guided Local Science and Technology Development Project; Fujian Province's Third Batch of Flexible Introduction of High-Level Medical Talent Teams; Fujian Provincial Natural Science Foundation of China; Fujian Provincial Science and Technology Innovation Joint Fund.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad29/11802380/2c30b012bd28/gr1.jpg
