Centre for Infectious Disease Epidemiology & Research, University of Cape Town, Falmouth Building, Observatory, Cape Town, 7925, South Africa.
Christian Heumann, Institut für Statistik, Ludwig-Maximilians Universität München, München, Germany.
Stat Med. 2018 Jun 30;37(14):2252-2266. doi: 10.1002/sim.7654. Epub 2018 Apr 16.
Many modern estimators require bootstrapping to calculate confidence intervals because either no analytic standard error is available or the distribution of the parameter of interest is nonsymmetric. However, it remains unclear how to obtain valid bootstrap inference when multiple imputation is used to address missing data. We present 4 methods that are intuitively appealing, easy to implement, and combine bootstrap estimation with multiple imputation. We show that 3 of the 4 approaches yield valid inference, but that the performance of the methods varies with the number of imputed data sets and the extent of missingness. Simulation studies reveal the behavior of our approaches in finite samples. A topical analysis from HIV treatment research, which determines the optimal timing of antiretroviral treatment initiation in young children, demonstrates the practical implications of the 4 methods in a sophisticated and realistic setting. This analysis suffers from missing data and uses the g-formula for inference, a method for which no analytic standard errors are available.
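The abstract does not spell out the 4 methods, so the following is only a minimal sketch of one generic way to combine the two ideas: draw a bootstrap sample first, multiply impute within it, average the estimates over the imputations, and take percentile limits over the bootstrap replicates. It is not claimed to be one of the paper's 4 methods. The function names `impute_once` and `boot_mi_percentile_ci`, the simple stochastic regression imputation, the simulated data, and all default settings are assumptions made for illustration.

```python
import numpy as np


def impute_once(x, y, rng):
    """Fill missing y by stochastic regression on x.

    Hypothetical stand-in for a real imputation model: it fits a straight
    line to the observed pairs and adds normal residual noise.
    """
    obs = ~np.isnan(y)
    slope, intercept = np.polyfit(x[obs], y[obs], 1)
    resid = y[obs] - (intercept + slope * x[obs])
    resid_sd = np.std(resid, ddof=2)
    y_imp = y.copy()
    n_mis = int((~obs).sum())
    y_imp[~obs] = intercept + slope * x[~obs] + rng.normal(0.0, resid_sd, n_mis)
    return y_imp


def boot_mi_percentile_ci(x, y, estimator, n_boot=200, n_imp=5,
                          alpha=0.05, seed=0):
    """Percentile interval from 'bootstrap, then impute' (illustrative only)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    boot_estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Resample rows of the incomplete data with replacement.
        idx = rng.integers(0, n, size=n)
        xb, yb = x[idx], y[idx]
        # Impute the bootstrap sample n_imp times and average the estimates.
        ests = [estimator(impute_once(xb, yb, rng)) for _ in range(n_imp)]
        boot_estimates[b] = np.mean(ests)
    lower, upper = np.quantile(boot_estimates, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


# Simulated example (assumed data): y depends on x, about 30% of y is missing.
sim = np.random.default_rng(1)
x = sim.normal(size=300)
y = 2.0 + 1.5 * x + sim.normal(size=300)
y[sim.random(300) < 0.3] = np.nan

lo, hi = boot_mi_percentile_ci(x, y, np.mean)
print(f"95% percentile interval for the mean of y: ({lo:.3f}, {hi:.3f})")
```

Because the interval is read directly off the bootstrap distribution, the sketch needs no analytic standard error and does not assume a symmetric sampling distribution, which are the two motivations named above. Swapping the loop order (impute first, then bootstrap each imputed data set) gives a different scheme with different computational cost.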