1 Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.
2 Centre for Statistics in Medicine, Botnar Research Centre, University of Oxford, Oxford, UK.
Stat Methods Med Res. 2019 Aug;28(8):2455-2474. doi: 10.1177/0962280218784726. Epub 2018 Jul 3.
Binary logistic regression is one of the most frequently applied statistical approaches for developing clinical prediction models. Developers of such models often rely on an Events Per Variable criterion (EPV), notably EPV ≥10, to determine the minimal sample size required and the maximum number of candidate predictors that can be examined. We present an extensive simulation study in which we studied the influence of EPV, events fraction, number of candidate predictors, the correlations and distributions of candidate predictor variables, area under the ROC curve, and predictor effects on out-of-sample predictive performance of prediction models. The out-of-sample performance (calibration, discrimination and probability prediction error) of developed prediction models was studied before and after regression shrinkage and variable selection. The results indicate that EPV does not have a strong relation with metrics of predictive performance, and is not an appropriate criterion for (binary) prediction model development studies. We show that out-of-sample predictive performance can better be approximated by considering the number of predictors, the total sample size and the events fraction. We propose that the development of new sample size criteria for prediction models should be based on these three parameters, and provide suggestions for improving sample size determination.
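As a minimal sketch of the point that EPV alone does not pin down a study design, the snippet below computes EPV for several hypothetical designs that share the same EPV but differ in total sample size and events fraction, the quantities the abstract proposes considering instead. The function names and the example numbers are illustrative assumptions, not taken from the paper, and EPV is assumed here to be the number of events divided by the number of candidate predictors.

```python
def events_per_variable(n_events, n_predictors):
    """EPV, assumed here to be events divided by candidate predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


def max_predictors_under_epv(n_events, min_epv=10):
    """Largest number of candidate predictors allowed by an EPV >= min_epv rule."""
    return n_events // min_epv


# Hypothetical designs (illustrative numbers only):
# (total sample size, number of events, number of candidate predictors)
designs = [
    (200, 100, 10),
    (1000, 100, 10),
    (2000, 100, 10),
]

summary = []
for n_total, n_events, n_predictors in designs:
    summary.append({
        "n_total": n_total,
        "events_fraction": n_events / n_total,
        "n_predictors": n_predictors,
        "epv": events_per_variable(n_events, n_predictors),
    })

for row in summary:
    print(row)
```

All three designs satisfy EPV = 10, yet their sample sizes range from 200 to 2000 and their events fractions from 0.5 to 0.05, so a rule based on EPV alone treats them as equivalent.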