Section for Clinical Biometrics, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Spitalgasse 23, A-1090 Vienna, Austria.
Stat Med. 2013 Dec 20;32(29):5062-76. doi: 10.1002/sim.5899. Epub 2013 Jul 19.
In the logistic regression analysis of a small case-control study on Alzheimer's disease, some of the risk factors had missing values, which motivated the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients are normally distributed. Yet this assumption is rarely tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CIs may not be reliable. Alternatives to RR have therefore been considered, for example, Bayesian sampling methods, but not yet alternatives that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first-order biases and guarantee convergence of parameter estimation. To fill this gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. The CDFs from the multiple imputations can then easily be averaged into a combined CDF, CDF_c, allowing confidence limits for a parameter β at level 1 - α to be identified as those values β* and β** that satisfy CDF_c(β*) = α/2 and CDF_c(β**) = 1 - α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP with that of Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf.
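The averaging step described above can be illustrated with a minimal Python sketch. It is not the logistf implementation. It assumes that each imputed data set supplies a (penalized) profile log-likelihood for β together with its maximizer, and that the chi-squared approximation is turned into a CDF through the signed likelihood ratio statistic; that signed-root form is an assumption of this sketch, since the abstract does not spell it out. The quadratic profiles and all function names are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import math

def profile_cdf(profile_loglik, beta_hat, gamma):
    """CDF value at gamma derived from one imputed data set's profile.

    Uses the 1-df chi-squared approximation to the likelihood ratio
    statistic; the sign is set by the side of beta_hat that gamma is on.
    """
    lr = 2.0 * (profile_loglik(beta_hat) - profile_loglik(gamma))
    lr = max(lr, 0.0)  # guard against tiny negative values from rounding
    # For 1 degree of freedom, P(chi2 <= lr) = erf(sqrt(lr / 2)).
    p = math.erf(math.sqrt(lr / 2.0))
    return 0.5 * (1.0 + p) if gamma >= beta_hat else 0.5 * (1.0 - p)

def combined_cdf(profiles, gamma):
    """Average the per-imputation CDFs at gamma."""
    return sum(profile_cdf(f, b, gamma) for f, b in profiles) / len(profiles)

def solve_quantile(profiles, target, lo, hi, tol=1e-10):
    """Bisection for combined_cdf(gamma) = target on [lo, hi].

    Assumes each profile is unimodal, so the combined CDF is
    nondecreasing in gamma.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if combined_cdf(profiles, mid) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def clip_interval(profiles, alpha=0.05, lo=-50.0, hi=50.0):
    """Confidence limits where the combined CDF equals alpha/2 and 1 - alpha/2."""
    return (solve_quantile(profiles, alpha / 2.0, lo, hi),
            solve_quantile(profiles, 1.0 - alpha / 2.0, lo, hi))

def quadratic_profile(b, s):
    """Hypothetical stand-in for a profile log-likelihood (normal-shaped)."""
    return lambda gamma: -0.5 * ((gamma - b) / s) ** 2

# Three made-up imputations: (profile function, maximizer) pairs.
profiles = [(quadratic_profile(b, s), b)
            for b, s in [(0.8, 0.5), (1.0, 0.6), (1.2, 0.55)]]
lower, upper = clip_interval(profiles)
print(lower, upper)
```

In a real analysis the quadratic stand-ins would be replaced by penalized profile log-likelihoods evaluated on each imputed data set, which are generally asymmetric in small or nearly separated samples. That asymmetry is what allows the resulting limits to differ from the symmetric intervals produced by RR.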