Adaptive sample size determination for the development of clinical prediction models.

Authors

Christodoulou Evangelia, van Smeden Maarten, Edlinger Michael, Timmerman Dirk, Wanitschek Maria, Steyerberg Ewout W, Van Calster Ben

Affiliations

Department of Development & Regeneration, KU Leuven, Leuven, Belgium.

Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, Netherlands.

Publication information

Diagn Progn Res. 2021 Mar 22;5(1):6. doi: 10.1186/s41512-021-00096-5.

DOI:10.1186/s41512-021-00096-5

PMID:33745449

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7983402/

Abstract

BACKGROUND

We suggest an adaptive sample size calculation method for developing clinical prediction models, in which model performance is monitored sequentially as new data comes in.

METHODS

We illustrate the approach using data for the diagnosis of ovarian cancer (n = 5914, 33% event fraction) and obstructive coronary artery disease (CAD; n = 4888, 44% event fraction). We used logistic regression to develop a prediction model consisting only of a priori selected predictors and assumed linear relations for continuous predictors. We mimicked prospective patient recruitment by developing the model on 100 randomly selected patients, and we used bootstrapping to internally validate the model. We sequentially added 50 random new patients until we reached a sample size of 3000 and re-estimated model performance at each step. We examined the required sample size for satisfying the following stopping rule: obtaining a calibration slope ≥ 0.9 and optimism in the c-statistic (or AUC) ≤ 0.02 at two consecutive sample sizes. This procedure was repeated 500 times. We also investigated the impact of alternative modeling strategies: modeling nonlinear relations for continuous predictors and correcting for bias in the model estimates (Firth's correction).
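The monitoring schedule and stopping rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `sample_size_schedule` and `first_stopping_size` are hypothetical, the bootstrap estimation of the calibration slope and c-statistic optimism is assumed to happen elsewhere, and returning the second of the two consecutive sample sizes (the point at which recruitment would actually stop) is an assumption about how the required sample size is reported.

```python
from typing import List, Optional, Sequence


def sample_size_schedule(start: int = 100, step: int = 50,
                         maximum: int = 3000) -> List[int]:
    """Sample sizes at which performance is re-estimated: 100, 150, ..., 3000."""
    return list(range(start, maximum + 1, step))


def first_stopping_size(sizes: Sequence[int],
                        slopes: Sequence[float],
                        optimisms: Sequence[float],
                        min_slope: float = 0.9,
                        max_optimism: float = 0.02,
                        consecutive: int = 2) -> Optional[int]:
    """Return the first sample size at which the rule has held at
    `consecutive` successive checks, or None if it is never met.

    slopes[i] and optimisms[i] are the bootstrap-estimated calibration
    slope and c-statistic optimism at sizes[i] (estimated elsewhere).
    """
    run = 0  # number of successive checks that satisfied both criteria
    for n, slope, optimism in zip(sizes, slopes, optimisms):
        if slope >= min_slope and optimism <= max_optimism:
            run += 1
            if run >= consecutive:
                return n
        else:
            run = 0  # a failed check resets the streak
    return None


# Made-up performance estimates, for illustration only.
sizes = [100, 150, 200, 250, 300]
slopes = [0.70, 0.91, 0.88, 0.92, 0.95]
optimisms = [0.05, 0.02, 0.03, 0.02, 0.01]
print(first_stopping_size(sizes, slopes, optimisms))  # 300
```

In this toy sequence the criteria are met at 150 patients but fail again at 200, so the streak resets and the rule is only satisfied once both 250 and 300 pass. Requiring two consecutive passes guards against stopping on a single favorable estimate.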

RESULTS

Better discrimination was achieved in the ovarian cancer data (c-statistic 0.9 with 7 predictors) than in the CAD data (c-statistic 0.7 with 11 predictors). Adequate calibration and limited optimism in discrimination were achieved after a median of 450 patients (interquartile range 450-500) for the ovarian cancer data (22 events per parameter [EPP], interquartile range 20-24) and 850 patients (750-900) for the CAD data (33 EPP, 30-35). A stricter criterion, requiring AUC optimism ≤ 0.01, was met with a median of 500 (23 EPP) and 1500 (59 EPP) patients, respectively. These sample sizes were much higher than those implied by the well-known 10 EPP rule of thumb and slightly higher than those given by a recently published fixed sample size calculation method by Riley et al. Higher sample sizes were required when nonlinear relationships were modeled, and lower sample sizes when Firth's correction was used.
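As a rough check on how these sample sizes relate to EPP, the arithmetic can be sketched as follows. The helper names are hypothetical, and the sketch assumes that the number of parameters equals the number of predictors (7 and 11) and that each sample has exactly the overall event fraction (33% and 44%). The abstract's EPP values are medians over 500 repetitions with varying event counts, so they need not match these figures exactly.

```python
def events_per_parameter(n: int, event_fraction: float,
                         n_parameters: int) -> float:
    """Expected EPP: expected number of events divided by parameters."""
    return n * event_fraction / n_parameters


def sample_size_for_epp(target_epp: float, event_fraction: float,
                        n_parameters: int) -> float:
    """Sample size at which the expected EPP equals target_epp."""
    return target_epp * n_parameters / event_fraction


# Ovarian cancer: 450 patients, 33% events, 7 parameters (assumed).
print(round(events_per_parameter(450, 0.33, 7), 1))   # about 21.2
# CAD: 850 patients, 44% events, 11 parameters (assumed).
print(round(events_per_parameter(850, 0.44, 11), 1))  # about 34.0

# Sample sizes implied by the 10 EPP rule of thumb, for comparison.
print(round(sample_size_for_epp(10, 0.33, 7)))        # about 212
print(round(sample_size_for_epp(10, 0.44, 11)))       # about 250
```

Under these assumptions the 10 EPP rule would suggest roughly 212 and 250 patients, well below the medians of 450 and 850 at which the stopping rule was met.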

CONCLUSIONS

Adaptive sample size determination can be a useful supplement to fixed a priori sample size calculations, because it allows the sample size to be tailored dynamically to the specific prediction modeling context.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a82f/7983402/00fcd84d46d3/41512_2021_96_Fig1_HTML.jpg

References

Regression shrinkage methods for clinical prediction models do not guarantee improved performance: Simulation study.

Stat Methods Med Res. 2020 Nov;29(11):3166-3178. doi: 10.1177/0962280220921415. Epub 2020 May 13.

Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal.

BMJ. 2020 Apr 7;369:m1328. doi: 10.1136/bmj.m1328.

Calculating the sample size required for developing a clinical prediction model.

BMJ. 2020 Mar 18;368:m441. doi: 10.1136/bmj.m441.

Tufts PACE Clinical Predictive Model Registry: update 1990 through 2015.

Diagn Progn Res. 2017 Dec 21;1:20. doi: 10.1186/s41512-017-0021-2. eCollection 2017.

Minimum sample size for developing a multivariable prediction model: PART II - binary and time-to-event outcomes.

Stat Med. 2019 Mar 30;38(7):1276-1296. doi: 10.1002/sim.7992. Epub 2018 Oct 24.

Minimum sample size for developing a multivariable prediction model: Part I - Continuous outcomes.

Stat Med. 2019 Mar 30;38(7):1262-1275. doi: 10.1002/sim.7993. Epub 2018 Oct 22.

Sample size for binary logistic prediction models: Beyond events per variable criteria.

Stat Methods Med Res. 2019 Aug;28(8):2455-2474. doi: 10.1177/0962280218784726. Epub 2018 Jul 3.

Poor performance of clinical prediction models: the harm of commonly applied methods.

J Clin Epidemiol. 2018 Jun;98:133-143. doi: 10.1016/j.jclinepi.2017.11.013. Epub 2017 Nov 24.

External validation and extension of a diagnostic model for obstructive coronary artery disease: a cross-sectional predictive evaluation in 4888 patients of the Austrian Coronary Artery disease Risk Determination In Innsbruck by diaGnostic ANgiography (CARDIIGAN) cohort.

BMJ Open. 2017 Apr 7;7(4):e014467. doi: 10.1136/bmjopen-2016-014467.

No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.

BMC Med Res Methodol. 2016 Nov 24;16(1):163. doi: 10.1186/s12874-016-0267-3.
