Wahl Simone, Boulesteix Anne-Laure, Zierer Astrid, Thorand Barbara, van de Wiel Mark A
Research Unit of Molecular Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Ingolstädter Landstrasse 1, 85764 Neuherberg, Germany.
Institute of Epidemiology II, Helmholtz Zentrum München - German Research Center for Environmental Health, Ingolstädter Landstrasse 1, 85764 Neuherberg, Germany.
BMC Med Res Methodol. 2016 Oct 26;16(1):144. doi: 10.1186/s12874-016-0239-7.
Missing values are a frequent issue in human studies. In many situations, multiple imputation (MI) is an appropriate missing data handling strategy, whereby missing values are imputed multiple times, the analysis is performed in every imputed data set, and the obtained estimates are pooled. If the aim is to estimate (added) predictive performance measures, such as (change in) the area under the receiver-operating characteristic curve (AUC), internal validation strategies become desirable in order to correct for optimism. It is not fully understood how internal validation should be combined with multiple imputation.
In a comprehensive simulation study and in a real data set based on blood markers as predictors for mortality, we compare three combination strategies: Val-MI, internal validation followed by MI on the training and test parts separately; MI-Val, MI on the full data set followed by internal validation; and MI(-y)-Val, MI on the full data set omitting the outcome, followed by internal validation. Different validation strategies, including bootstrap and cross-validation, different (added) performance measures, and various data characteristics are considered, and the strategies are evaluated with regard to bias and mean squared error of the obtained performance estimates. In addition, we elaborate on the number of resamples and imputations to be used, and adapt a strategy for confidence interval construction to incomplete data.
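The Val-MI ordering described above (resample first, then impute the training and test parts separately) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names (`auc`, `fit_logistic`, `impute_once`, `val_mi_oob_auc`) are hypothetical, the imputation step is a deliberately crude placeholder (a random draw from the observed values of each column) standing in for a proper MI model, pairing one training imputation with one test imputation is an assumption, and the simulated data exist only to make the sketch runnable.

```python
import numpy as np

def auc(y, score):
    """AUC as the Mann-Whitney statistic; ties count one half."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def fit_logistic(X, y, n_iter=25, ridge=1e-3):
    """Logistic regression via Newton steps with a small ridge penalty."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xd @ beta, -30, 30)))
        hess = Xd.T @ (Xd * (p * (1 - p))[:, None]) + ridge * np.eye(len(beta))
        beta += np.linalg.solve(hess, Xd.T @ (y - p) - ridge * beta)
    return beta

def impute_once(X, rng):
    """PLACEHOLDER imputation (assumption): fill each missing entry with a
    random draw from the observed values of the same column."""
    X = X.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        if miss.any():
            X[miss, j] = rng.choice(X[~miss, j], size=miss.sum())
    return X

def val_mi_oob_auc(X, y, n_boot=20, n_imp=3, seed=0):
    """Val-MI with bootstrap: resample, then impute train and test separately."""
    rng = np.random.default_rng(seed)
    n = len(y)
    per_draw = []
    for _ in range(n_boot):
        train = rng.integers(0, n, size=n)             # bootstrap sample
        test = np.setdiff1d(np.arange(n), train)       # out-of-bag rows
        if len(np.unique(y[train])) < 2 or len(np.unique(y[test])) < 2:
            continue                                   # need both classes
        aucs = []
        for _ in range(n_imp):
            X_tr = impute_once(X[train], rng)          # imputed on training part only
            X_te = impute_once(X[test], rng)           # imputed on test part only
            beta = fit_logistic(X_tr, y[train])
            score = np.column_stack([np.ones(len(test)), X_te]) @ beta
            aucs.append(auc(y[test], score))
        per_draw.append(np.mean(aucs))                 # pool over imputations
    return float(np.mean(per_draw))                    # pool over bootstrap draws

# Simulated demo data (assumption): one informative covariate, 20% missing values.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(200, 3))
y_demo = (demo_rng.random(200) < 1 / (1 + np.exp(-X_demo[:, 0]))).astype(int)
X_demo[demo_rng.random(X_demo.shape) < 0.2] = np.nan
oob_estimate = val_mi_oob_auc(X_demo, y_demo)
print(oob_estimate)
```

In this sketch, MI-Val would correspond to calling `impute_once` on the full `X` before the bootstrap loop, which lets information from the test rows leak into the imputed training data.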
Internal validation is essential in order to avoid optimism, with the bootstrap 0.632+ estimate representing a reliable method to correct for optimism. While estimates obtained by MI-Val are optimistically biased, those obtained by MI(-y)-Val tend to be pessimistic in the presence of a true underlying effect. Val-MI provides largely unbiased estimates, with a slight pessimistic bias that grows with increasing true effect size, increasing number of covariates, and decreasing sample size. In Val-MI, the accuracy of the estimate is improved more by increasing the number of bootstrap draws than by increasing the number of imputations. With a simple integrated approach, valid confidence intervals for performance estimates can be obtained.
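For orientation, the bootstrap 0.632+ idea can be written as a weighted average of the apparent (resubstitution) performance and the out-of-bag performance, with a weight that moves toward the out-of-bag value as relative overfitting increases. The sketch below adapts the commonly used weighted form to the AUC. The function name `auc_632plus`, the no-information value of 0.5, and the handling of truncation are assumptions for illustration, not details taken from the abstract.

```python
def auc_632plus(auc_apparent, auc_oob, no_info=0.5):
    """Weighted 0.632+ style combination for a 'higher is better' measure.

    auc_apparent: AUC of the model evaluated on its own training data.
    auc_oob:      average out-of-bag AUC over bootstrap draws.
    no_info:      AUC expected from a non-informative model (assumed 0.5).
    """
    # Do not let the out-of-bag value fall below the no-information level.
    oob = max(auc_oob, no_info)
    # Relative overfitting rate in [0, 1]; 0 when there is no apparent optimism.
    if auc_apparent > oob and auc_apparent > no_info:
        r = (auc_apparent - oob) / (auc_apparent - no_info)
    else:
        r = 0.0
    # Weight ranges from 0.632 (no overfitting) to 1 (maximal overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * auc_apparent + w * oob
```

With an apparent AUC of 0.8 and an out-of-bag AUC of 0.7, the result lies between the two values and closer to the out-of-bag value; when the out-of-bag AUC drops to the no-information level, the estimate equals that level.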
When prognostic models are developed on incomplete data, Val-MI represents a valid strategy to obtain estimates of predictive performance measures.