Two-stage sampling designs for external validation of personal risk models.

Authors

Whittemore Alice S, Halpern Jerry

Affiliations

Department of Health Research and Policy, Stanford University School of Medicine, Stanford, CA, USA

Publication information

Stat Methods Med Res. 2016 Aug;25(4):1313-29. doi: 10.1177/0962280213480420. Epub 2013 Apr 16.

Abstract

We propose a cost-effective sampling design and estimating procedure for validating personal risk models using right-censored cohort data. Validation involves using each subject's covariates, as ascertained at cohort entry, in a risk model (specified independently of the data) to assign him/her a probability of an adverse outcome within a future time period. Subjects are then grouped according to the magnitudes of their assigned risks, and within each group, the mean assigned risk is compared with the probability of outcome occurrence as estimated using the follow-up data. Such validation presents two complications. First, in the presence of right-censoring, estimating the probability of developing the outcomes before death requires competing risk analysis. Second, for rare outcomes, validation using the full cohort requires assembling covariates and assigning risks to thousands of subjects. This can be costly when some covariates involve analyzing biological specimens. A two-stage sampling design addresses this problem by assembling covariates and assigning risks only to those subjects most informative for estimating key parameters. We use this design to estimate the outcome probabilities needed to evaluate model performance and we provide theoretical and bootstrap estimates of their variances. We also describe how to choose two-stage designs with minimal efficiency loss for a parameter of interest when the quantities determining optimality are unknown at the time of design. We illustrate these methods by using subjects in the California Teachers Study to validate ovarian cancer risk models. We find that a design with optimal efficiency for one performance parameter need not be so for others, and trade-offs will be required. A two-stage design that samples all outcome-positive subjects and more outcome-negative than censored subjects will perform well in most circumstances. 
The methods are implemented in Risk Model Assessment Program, an R program freely available at http://med.stanford.edu/epidemiology/two-stage.html.
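The closing recommendation of the abstract (sample every outcome-positive subject, and more outcome-negative than censored subjects) can be illustrated with a minimal sketch of stratified second-stage selection with inverse-probability weights. This is not the Risk Model Assessment Program and does not reproduce the paper's competing-risk estimators; the function name `two_stage_sample`, the cohort sizes, and the sampling fractions are all hypothetical and chosen only for illustration.

```python
import random


def two_stage_sample(status, fractions, seed=0):
    """Pick a second-stage subsample stratified by first-stage outcome status.

    status    -- one label per cohort member: "positive", "negative" or "censored"
    fractions -- dict mapping each label to its sampling fraction in (0, 1]
    Returns {subject index: inverse-probability weight} for the sampled subjects.
    """
    rng = random.Random(seed)

    # Stage 1: outcome status is known for the whole cohort, so group by it.
    strata = {}
    for index, label in enumerate(status):
        strata.setdefault(label, []).append(index)

    # Stage 2: draw a simple random sample within each stratum. Only these
    # subjects would have covariates assembled and risks assigned.
    weights = {}
    for label, members in strata.items():
        n_sampled = max(1, round(fractions[label] * len(members)))
        for index in rng.sample(members, n_sampled):
            # Each sampled subject stands in for len(members) / n_sampled people.
            weights[index] = len(members) / n_sampled
    return weights


# Hypothetical cohort: 20 outcome-positive, 600 outcome-negative, 380 censored.
status = ["positive"] * 20 + ["negative"] * 600 + ["censored"] * 380

# Assumed fractions: all positives, and a larger share of negatives than censored.
weights = two_stage_sample(
    status, {"positive": 1.0, "negative": 0.2, "censored": 0.1}
)
```

In this sketch the weights within each stratum sum to that stratum's size in the full cohort, which is the property a weighted analysis of the subsample relies on. Choosing the fractions to minimize efficiency loss for a given performance parameter is the design question the paper addresses and is not attempted here.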
