Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, OX3 7LD, UK.
Population Data Science, Faculty of Medicine, Health and Life Science, Swansea University Medical School, Swansea University, Singleton Park, Swansea, SA2 8PP, UK.
BMC Med Res Methodol. 2023 Aug 19;23(1):188. doi: 10.1186/s12874-023-02008-1.
Having an appropriate sample size is important when developing a clinical prediction model. We aimed to review how sample size is considered in studies developing a prediction model for a binary outcome.
We searched PubMed for studies published between 1 July 2020 and 30 July 2020 and reviewed the sample size calculations used to develop the prediction models. Using the available information, we calculated the minimum sample size that would be needed to estimate overall risk and minimise overfitting in each study, and summarised the difference between the calculated and the used sample size.
A total of 119 studies were included, of which nine (8%) provided a sample size justification. The recommended minimum sample size could be calculated for 94 studies: 73% (95% CI: 63-82%) used sample sizes lower than required to estimate overall risk and minimise overfitting, including 26% of studies that used sample sizes lower than required to estimate overall risk only. A similar proportion of studies did not meet the ≥ 10 EPV criterion (75%, 95% CI: 66-84%). The median deficit in the number of events used to develop a model was 75 (IQR: 234 lower to 7 higher), which reduced to 63 (IQR: 225 lower to 7 higher) if the total available data (before any data splitting) was used. Studies that met the minimum required sample size had a median c-statistic of 0.84 (IQR: 0.80 to 0.9), and studies where the minimum sample size was not met had a median c-statistic of 0.83 (IQR: 0.75 to 0.9). Studies that met the ≥ 10 EPP criterion had a median c-statistic of 0.80 (IQR: 0.73 to 0.84).
Prediction models are often developed with no sample size calculation; as a consequence, many are developed on samples too small to precisely estimate the overall risk. We encourage researchers to justify, perform and report sample size calculations when developing a prediction model.
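The abstract does not state which formulas were used for the minimum sample size. As a rough illustration only, the sketch below assumes the widely cited closed-form criteria for binary outcomes proposed by Riley and colleagues: one for estimating the overall risk within a margin of error, and one for targeting an expected shrinkage factor to limit overfitting. All function names, default values (margin 0.05, shrinkage 0.9) and the example inputs are assumptions for illustration, not values taken from the reviewed studies.

```python
import math


def n_for_overall_risk(prevalence, margin=0.05, z=1.96):
    """Smallest n that estimates the overall outcome risk to within
    +/- margin (assumed criterion: n = (z / margin)^2 * p * (1 - p))."""
    return math.ceil((z / margin) ** 2 * prevalence * (1 - prevalence))


def n_for_shrinkage(n_params, r2_cs, shrinkage=0.9):
    """Smallest n targeting an expected uniform shrinkage factor
    (assumed criterion: n = P / ((S - 1) * ln(1 - R2_CS / S)),
    where R2_CS is the anticipated Cox-Snell R-squared)."""
    return math.ceil(n_params / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage)))


def minimum_sample_size(prevalence, n_params, r2_cs):
    """Larger of the two criteria above."""
    return max(n_for_overall_risk(prevalence), n_for_shrinkage(n_params, r2_cs))


def events_per_parameter(n, prevalence, n_params):
    """Expected events per candidate predictor parameter (rule-of-thumb check)."""
    return n * prevalence / n_params


if __name__ == "__main__":
    # Hypothetical study: 10% outcome prevalence, 10 candidate predictor
    # parameters, anticipated Cox-Snell R-squared of 0.1, 400 participants used.
    required = minimum_sample_size(prevalence=0.1, n_params=10, r2_cs=0.1)
    used = 400
    print("required n:", required)
    print("deficit in participants:", required - used)
    print("events per parameter at n used:", events_per_parameter(used, 0.1, 10))
```

Comparing the required and used sample sizes in this way mirrors the deficit summarised in the results, although the review itself reports the deficit in numbers of events rather than participants.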