Challenges in risk estimation using routinely collected clinical data: The example of estimating cervical cancer risks from electronic health-records.

Affiliations

Centre for Cancer Prevention, Wolfson Institute of Preventive Medicine, Barts and the London School of Medicine and Dentistry, Queen Mary, University of London, Charterhouse Square, London EC1M 6BQ, UK.

Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, DHHS, Bethesda, MD, USA.

Publication information

Prev Med. 2018 Jun;111:429-435. doi: 10.1016/j.ypmed.2017.12.004. Epub 2017 Dec 6.

DOI:10.1016/j.ypmed.2017.12.004

PMID:29222045

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5930038/

Abstract

Electronic health-records (EHR) are increasingly used by epidemiologists studying disease following surveillance testing to provide evidence for screening intervals and referral guidelines. Although cost-effective, undiagnosed prevalent disease and interval censoring (in which asymptomatic disease is only observed at the time of testing) raise substantial analytic issues when estimating risk that cannot be addressed using Kaplan-Meier methods. Based on our experience analysing EHR from cervical cancer screening, we previously proposed the logistic-Weibull model to address these issues. Here we demonstrate how the choice of statistical method can impact risk estimates. We use observed data on 41,067 women in the cervical cancer screening program at Kaiser Permanente Northern California, 2003-2013, as well as simulations to evaluate the ability of different methods (Kaplan-Meier, Turnbull, Weibull and logistic-Weibull) to accurately estimate risk within a screening program. Cumulative risk estimates from the statistical methods varied considerably, with the largest differences occurring for prevalent disease risk when baseline disease ascertainment was random but incomplete. Kaplan-Meier underestimated risk at earlier times and overestimated risk at later times in the presence of interval censoring or undiagnosed prevalent disease. Turnbull performed well, though was inefficient and not smooth. The logistic-Weibull model performed well, except when event times didn't follow a Weibull distribution. We have demonstrated that methods for right-censored data, such as Kaplan-Meier, result in biased estimates of disease risks when applied to interval-censored data, such as screening programs using EHR data. The logistic-Weibull model is attractive, but the model fit must be checked against Turnbull non-parametric risk estimates.
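The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes the logistic-Weibull cumulative risk takes the usual prevalence-incidence mixture form, risk(t) = p + (1 - p)(1 - exp(-(t/scale)^shape)), where p is the probability of prevalent disease at baseline. It also assumes a simplified screening schedule: equally spaced visits, no baseline ascertainment, and no loss to follow-up, so the Kaplan-Meier estimate reduces to the empirical fraction detected by time t. All function names and parameter values are hypothetical.

```python
import math
import random


def logistic_weibull_risk(t, prevalence, shape, scale):
    """Cumulative risk at time t under an assumed prevalence-incidence mixture."""
    incident = 1.0 - math.exp(-((t / scale) ** shape))
    return prevalence + (1.0 - prevalence) * incident


def simulate_detection_times(n, prevalence, shape, scale, interval, n_visits, seed=0):
    """Simulate the visit time at which disease is first seen (None if never seen).

    Prevalent disease has onset 0 but is assumed to be missed at baseline,
    so it can only be detected at the first follow-up visit.
    """
    rng = random.Random(seed)
    detections = []
    for _ in range(n):
        if rng.random() < prevalence:
            onset = 0.0
        else:
            # random.weibullvariate takes (scale, shape) in that order.
            onset = rng.weibullvariate(scale, shape)
        visit = max(1, math.ceil(onset / interval))
        detections.append(visit * interval if visit <= n_visits else None)
    return detections


def naive_risk(detections, t):
    """Fraction detected by time t, treating detection time as the event time.

    With no loss to follow-up this equals the Kaplan-Meier estimate.
    """
    return sum(1 for d in detections if d is not None and d <= t) / len(detections)


# Hypothetical parameter values chosen only for illustration.
P, SHAPE, SCALE = 0.02, 1.5, 50.0
detections = simulate_detection_times(20000, P, SHAPE, SCALE, interval=1.0, n_visits=5)

for t in (0.5, 1.0, 3.0):
    true_risk = logistic_weibull_risk(t, P, SHAPE, SCALE)
    print(f"t={t}: true risk {true_risk:.4f}, naive estimate {naive_risk(detections, t):.4f}")
```

Between visits the naive estimate stays flat, so before the first follow-up visit it reports zero risk even though the assumed true risk is already at least the baseline prevalence. This reproduces only the early-time underestimation; the late-time overestimation reported in the abstract is not modelled here.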

Similar articles

Mixture models for undiagnosed prevalent disease and interval-censored incident disease: applications to a cohort assembled from electronic health records.

Stat Med. 2017 Sep 30;36(22):3583-3595. doi: 10.1002/sim.7380. Epub 2017 Jun 28.

Cervical cancer risk for women undergoing concurrent testing for human papillomavirus and cervical cytology: a population-based study in routine clinical practice.

Lancet Oncol. 2011 Jul;12(7):663-72. doi: 10.1016/S1470-2045(11)70145-0. Epub 2011 Jun 16.

Obstet Gynecol. 2016 Dec;128(6):1248-1257. doi: 10.1097/AOG.0000000000001721.

FLEXIBLE RISK PREDICTION MODELS FOR LEFT OR INTERVAL-CENSORED DATA FROM ELECTRONIC HEALTH RECORDS.

Ann Appl Stat. 2017 Jun;11(2):1063-1084. doi: 10.1214/17-AOAS1036. Epub 2017 Jul 20.

Five-Year Risk of Cervical Precancer Following p16/Ki-67 Dual-Stain Triage of HPV-Positive Women.

JAMA Oncol. 2019 Feb 1;5(2):181-186. doi: 10.1001/jamaoncol.2018.4270.

Sample-weighted semiparametric estimation of cause-specific cumulative risk and incidence using left- or interval-censored data from electronic health records.

Stat Med. 2020 Aug 15;39(18):2387-2402. doi: 10.1002/sim.8544. Epub 2020 May 10.

Estimated Cancer Risk in Females Who Meet the Criteria to Exit Cervical Cancer Screening.

JAMA Netw Open. 2025 Mar 3;8(3):e250479. doi: 10.1001/jamanetworkopen.2025.0479.

Adherence to Cervical Cancer Screening Guidelines Among Women Aged 66-68 Years in a Large Community-Based Practice.

Am J Prev Med. 2019 Dec;57(6):757-764. doi: 10.1016/j.amepre.2019.08.011.

Risk Stratification Using Human Papillomavirus Testing among Women with Equivocally Abnormal Cytology: Results from a State-Wide Surveillance Program.

Cancer Epidemiol Biomarkers Prev. 2016 Jan;25(1):36-42. doi: 10.1158/1055-9965.EPI-15-0669. Epub 2015 Oct 30.

Cited by

Risk assessment in a Chinese cohort of 96 318 females undergoing opportunistic cervical cancer screening.

Oncologist. 2025 Jul 4;30(7). doi: 10.1093/oncolo/oyaf197.

Untreated cervical intraepithelial neoplasia grade 2 and subsequent risk of cervical cancer: population based cohort study.

BMJ. 2023 Nov 29;383:e075925. doi: 10.1136/bmj-2023-075925.

The combined finding of HPV 16, 18, or 45 and cytologic Atypical Glandular Cells (AGC) indicates a greatly elevated risk of in situ and invasive cervical adenocarcinoma.

Gynecol Oncol. 2023 Jul;174:253-261. doi: 10.1016/j.ygyno.2023.05.011. Epub 2023 May 25.

Survival Modelling For Data From Combined Cohorts: Opening the Door to Meta Survival Analyses and Survival Analysis using Electronic Health Records.

Int Stat Rev. 2023 Apr;91(1):72-87. doi: 10.1111/insr.12510. Epub 2022 Jun 16.

The Improving Risk Informed HPV Screening (IRIS) Study: Design and Baseline Characteristics.

Cancer Epidemiol Biomarkers Prev. 2022 Feb;31(2):486-492. doi: 10.1158/1055-9965.EPI-21-0865. Epub 2021 Nov 17.

Influential Usage of Big Data and Artificial Intelligence in Healthcare.

Comput Math Methods Med. 2021 Sep 6;2021:5812499. doi: 10.1155/2021/5812499. eCollection 2021.

A study of type-specific HPV natural history and implications for contemporary cervical cancer screening programs.

EClinicalMedicine. 2020 Apr 25;22:100293. doi: 10.1016/j.eclinm.2020.100293. eCollection 2020 May.

A Study of Partial Human Papillomavirus Genotyping in Support of the 2019 ASCCP Risk-Based Management Consensus Guidelines.

J Low Genit Tract Dis. 2020 Apr;24(2):144-147. doi: 10.1097/LGT.0000000000000530.

Risk Estimates Supporting the 2019 ASCCP Risk-Based Management Consensus Guidelines.

J Low Genit Tract Dis. 2020 Apr;24(2):132-143. doi: 10.1097/LGT.0000000000000529.

2019 ASCCP Risk-Based Management Consensus Guidelines: Methods for Risk Estimation, Recommended Management, and Validation.

J Low Genit Tract Dis. 2020 Apr;24(2):90-101. doi: 10.1097/LGT.0000000000000528.

References

FLEXIBLE RISK PREDICTION MODELS FOR LEFT OR INTERVAL-CENSORED DATA FROM ELECTRONIC HEALTH RECORDS.

Ann Appl Stat. 2017 Jun;11(2):1063-1084. doi: 10.1214/17-AOAS1036. Epub 2017 Jul 20.

Mixture models for undiagnosed prevalent disease and interval-censored incident disease: applications to a cohort assembled from electronic health records.

Stat Med. 2017 Sep 30;36(22):3583-3595. doi: 10.1002/sim.7380. Epub 2017 Jun 28.

Efficacy of HPV-based screening for prevention of invasive cervical cancer: follow-up of four European randomised controlled trials.

Lancet. 2014 Feb 8;383(9916):524-32. doi: 10.1016/S0140-6736(13)62218-7. Epub 2013 Nov 3.

2012 updated consensus guidelines for the management of abnormal cervical cancer screening tests and cancer precursors.

Obstet Gynecol. 2013 Apr;121(4):829-846. doi: 10.1097/AOG.0b013e3182883a34.

Understanding sources of bias in diagnostic accuracy studies.

Arch Pathol Lab Med. 2013 Apr;137(4):558-65. doi: 10.5858/arpa.2012-0198-RA.

Five-year risks of CIN 3+ and cervical cancer among women who test Pap-negative but are HPV-positive.

J Low Genit Tract Dis. 2013 Apr;17(5 Suppl 1):S56-63. doi: 10.1097/LGT.0b013e318285437b.

Five-year risks of CIN 2+ and CIN 3+ among women with HPV-positive and HPV-negative LSIL Pap results.

J Low Genit Tract Dis. 2013 Apr;17(5 Suppl 1):S43-9. doi: 10.1097/LGT.0b013e3182854269.

Five-year risks of CIN 3+ and cervical cancer among women with HPV testing of ASC-US Pap results.

J Low Genit Tract Dis. 2013 Apr;17(5 Suppl 1):S36-42. doi: 10.1097/LGT.0b013e3182854253.

Benchmarking CIN 3+ risk as the basis for incorporating HPV and Pap cotesting into cervical screening and management guidelines.

J Low Genit Tract Dis. 2013 Apr;17(5 Suppl 1):S28-35. doi: 10.1097/LGT.0b013e318285423c.

Long term predictive values of cytology and human papillomavirus testing in cervical cancer screening: joint European cohort study.

BMJ. 2008 Oct 13;337:a1754. doi: 10.1136/bmj.a1754.
