Towards a data-driven system for personalized cervical cancer risk stratification.

Author information

Department of Research, Cancer Registry of Norway (CRN), Oslo, 0379, Norway.

Department of Registry Informatics, CRN, Oslo, 0379, Norway.

Publication information

Sci Rep. 2022 Jul 15;12(1):12083. doi: 10.1038/s41598-022-16361-6.

Abstract

Mass-screening programs for cervical cancer prevention in the Nordic countries have been effective in reducing cancer incidence and mortality at the population level. Women who have regularly received normal screening results represent a sub-population with a low risk of disease. Distinctive screening strategies that avoid over-screening while still identifying women with high-grade lesions are needed to improve on the existing one-size-fits-all approach. Machine learning methods for more personalized cervical cancer risk estimation may be of great utility to screening programs shifting to more targeted screening. However, deriving personalized risk prediction models is challenging, as effective screening has made cervical cancer rare and the exam results are strongly skewed towards normal. Moreover, changes in female lifestyle and screening habits over time can cause a non-stationary data distribution. In this paper, we treat cervical cancer risk prediction as a longitudinal forecasting problem. We define risk estimators by extending existing frameworks developed on cervical cancer screening data to incremental learning for longitudinal risk predictions, and we compare these estimators to machine learning methods popular in biomedical applications. As input to the prediction models, we utilize all the available data from the individual screening histories.

Using data from the Cancer Registry of Norway, we find in numerical experiments that the models are strongly biased towards normal results due to imbalanced data. To identify females at risk of cancer development, we adapt an imbalanced classification strategy to non-stationary data. Using this strategy, we estimate the absolute risk from longitudinal model predictions and from a hold-out set of screening data. A comparison of the absolute risk curves demonstrates that prediction models can closely reflect the absolute risk observed in the hold-out set. Such models have great potential for improving cervical cancer risk stratification for more personalized screening recommendations.
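The comparison of absolute risk curves mentioned in the abstract can be illustrated with a toy calculation. The abstract does not specify the paper's risk estimator, so the sketch below is only an assumption: it treats absolute risk as the fraction of exams with a high-grade result within each age bin, computed once from hold-out labels and once from model predictions. The function name `absolute_risk_by_age`, the 5-year bin width, and the data are all hypothetical and made up for illustration.

```python
from collections import defaultdict


def absolute_risk_by_age(records, bin_width=5):
    """Fraction of exams with a high-grade result in each age bin.

    records: iterable of (age_in_years, is_high_grade) pairs.
    Returns a dict mapping the start age of each bin to the risk in that bin.
    This is an illustrative definition, not the estimator used in the paper.
    """
    totals = defaultdict(int)
    events = defaultdict(int)
    for age, is_high_grade in records:
        start = int(age // bin_width) * bin_width
        totals[start] += 1
        events[start] += int(is_high_grade)
    return {start: events[start] / totals[start] for start in sorted(totals)}


# Made-up toy data: ages at exam, hold-out labels, and model predictions
# (1 = high-grade result, 0 = normal).
ages = [22, 23, 27, 28, 31, 33, 36, 38]
observed = [0, 0, 0, 1, 0, 1, 0, 0]
predicted = [0, 0, 1, 1, 0, 1, 0, 0]

observed_risk = absolute_risk_by_age(zip(ages, observed))
predicted_risk = absolute_risk_by_age(zip(ages, predicted))

# Largest per-bin disagreement between the two curves.
max_gap = max(abs(observed_risk[b] - predicted_risk[b]) for b in observed_risk)
```

In this toy setup, a small `max_gap` would indicate that the predicted curve tracks the hold-out curve closely; how the paper actually quantifies agreement between the curves is not stated in the abstract.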

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/69b1/9287371/d18ee694a265/41598_2022_16361_Fig1_HTML.jpg
