Sample size requirements are not being considered in studies developing prediction models for binary outcomes: a systematic review.

Author affiliations

Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK.

Population Data Science, Faculty of Medicine, Health and Life Science, Swansea University Medical School, Swansea University, Singleton Park, Swansea, SA2 8PP, UK.

Publication information

BMC Med Res Methodol. 2023 Aug 19;23(1):188. doi: 10.1186/s12874-023-02008-1.

DOI:10.1186/s12874-023-02008-1

PMID:37598153

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10439652/

Abstract

BACKGROUND

Having an appropriate sample size is important when developing a clinical prediction model. We aimed to review how sample size is considered in studies developing a prediction model for a binary outcome.

METHODS

We searched PubMed for studies published between 1 July 2020 and 30 July 2020 and reviewed the sample size calculations used to develop the prediction models. Using the available information, we calculated the minimum sample size that would be needed to estimate overall risk and minimise overfitting in each study, and summarised the difference between the calculated and used sample sizes.
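As an illustration of what such a calculation can look like, the sketch below implements two closed-form criteria from Riley et al. (Stat Med 2019, "Minimum sample size for developing a multivariable prediction model: PART II"): one for estimating the overall outcome risk precisely and one for limiting overfitting through a target shrinkage factor. The abstract does not state which formulas the review applied, so treating these as the relevant criteria is an assumption. The function names, the 0.05 margin of error, the 0.9 shrinkage target, and the example inputs are illustrative choices, not values taken from the review.

```python
import math


def n_for_overall_risk(prevalence, margin=0.05, z=1.96):
    """Sample size so the confidence interval for the overall outcome
    proportion has half-width <= margin (z=1.96 gives a 95% interval)."""
    return math.ceil((z / margin) ** 2 * prevalence * (1 - prevalence))


def n_for_shrinkage(n_params, r2_cs, shrinkage=0.9):
    """Sample size targeting a uniform shrinkage factor of `shrinkage`,
    given the number of candidate predictor parameters and an
    anticipated Cox-Snell R-squared for the model."""
    return math.ceil(
        n_params / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage))
    )


def minimum_sample_size(prevalence, n_params, r2_cs):
    """Take the larger of the two criteria and report the implied
    number of events and events per predictor parameter."""
    n = max(n_for_overall_risk(prevalence), n_for_shrinkage(n_params, r2_cs))
    events = n * prevalence
    return {"n": n, "events": events, "events_per_parameter": events / n_params}


# Hypothetical study: 20% outcome prevalence, 10 candidate parameters,
# anticipated Cox-Snell R-squared of 0.1.
example = minimum_sample_size(prevalence=0.2, n_params=10, r2_cs=0.1)
print(example)
```

Taking the maximum across criteria means a study is only counted as adequately sized when it satisfies both requirements at once, which matches the abstract's distinction between studies that fall short on overall risk alone and those that fall short on either criterion.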

RESULTS

A total of 119 studies were included, of which nine (8%) provided a sample size justification. The recommended minimum sample size could be calculated for 94 studies: 73% (95% CI: 63-82%) used sample sizes lower than required to estimate overall risk and minimise overfitting, including 26% of studies that used sample sizes lower than required to estimate overall risk only. A similar proportion of studies did not meet the criterion of ≥ 10 events per variable (EPV) (75%, 95% CI: 66-84%). The median deficit in the number of events used to develop a model was 75 (IQR: 234 lower to 7 higher), which reduced to 63 if the total available data (before any data splitting) was used (IQR: 225 lower to 7 higher). Studies that met the minimum required sample size had a median c-statistic of 0.84 (IQR: 0.80 to 0.9) and studies where the minimum sample size was not met had a median c-statistic of 0.83 (IQR: 0.75 to 0.9). Studies that met the criterion of ≥ 10 events per predictor parameter (EPP) had a median c-statistic of 0.80 (IQR: 0.73 to 0.84).
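The summary statistics above (median difference in events with an interquartile range, and the share of studies below the requirement) can be sketched as follows. The study values are hypothetical placeholders invented for illustration and are not data from the review; the function names are likewise assumptions. Negative differences mean a study used fewer events than required.

```python
from statistics import median, quantiles

# Hypothetical (required_events, used_events) pairs, NOT data from the review.
studies = [(120, 40), (200, 210), (90, 30), (300, 66), (150, 160)]


def events_per_parameter(events, n_params):
    """Events per candidate predictor parameter (EPP)."""
    return events / n_params


def event_differences(pairs):
    """Used minus required events; negative values indicate a deficit."""
    return [used - required for required, used in pairs]


diffs = event_differences(studies)
median_diff = median(diffs)

# Quartiles of the differences (default 'exclusive' method of statistics.quantiles).
q1, _, q3 = quantiles(diffs, n=4)

# Proportion of studies that used fewer events than required.
share_below = sum(d < 0 for d in diffs) / len(diffs)

print(median_diff, (q1, q3), share_below)
print(events_per_parameter(events=50, n_params=10) >= 10)  # simple >= 10 EPP check
```

Reporting the signed difference rather than its absolute value keeps the direction visible, which is what allows an interval such as "234 lower to 7 higher" to span both under-sized and adequately sized studies.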

CONCLUSIONS

Prediction models are often developed with no sample size calculation; as a consequence, many use samples too small to precisely estimate the overall risk. We encourage researchers to justify, perform and report sample size calculations when developing a prediction model.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/10439652/d6c98607c768/12874_2023_2008_Fig1_HTML.jpg
