Ma Linquan, Liu Lan, Yang Wei
Department of Statistics, University of Wisconsin - Madison, Madison, Wisconsin, USA.
School of Statistics, University of Minnesota at Twin Cities, Minneapolis, Minnesota, USA.
Electron J Stat. 2021;15(2):4420-4461. doi: 10.1214/21-ejs1881. Epub 2021 Sep 14.
The envelope method was recently proposed as a way to reduce the dimension of the responses in multivariate regression. However, when data are missing, applying the envelope method to the complete-case observations may lead to biased and inefficient results. In this paper, we generalize envelope estimation to settings where the predictors and/or the responses are missing at random. Specifically, we incorporate the envelope structure into the expectation-maximization (EM) algorithm. Because the parameters under the envelope method are not pointwise identifiable, the EM algorithm for the envelope method is not straightforward and requires a special decomposition. Our method is guaranteed to be at least as efficient as the standard EM algorithm. Moreover, our method has the potential to outperform the full data MLE. We give asymptotic properties of our method in both the normal and non-normal cases. The efficiency gain over the standard EM is confirmed in simulation studies and in an application to the Chronic Renal Insufficiency Cohort (CRIC) study.
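For readers unfamiliar with the envelope structure, the following sketch writes out the usual response envelope model from the general envelope literature and shows why its parameters are not pointwise identifiable. The notation ($\Gamma$, $\Gamma_0$, $\eta$, $\Omega$, $\Omega_0$, $u$, $O$) is assumed here for illustration and is not taken from this abstract; the paper's own parameterization and decomposition may differ.

```latex
% Multivariate linear regression with r responses and p predictors
\[
  Y_i = \alpha + \beta X_i + \varepsilon_i, \qquad
  \varepsilon_i \sim N(0, \Sigma), \qquad
  \beta \in \mathbb{R}^{r \times p}.
\]

% Envelope structure of dimension u <= r:
% \Gamma is an r x u semi-orthogonal basis, \Gamma_0 its r x (r-u) orthogonal completion
\[
  \beta = \Gamma \eta, \qquad
  \Sigma = \Gamma \Omega \Gamma^{\top} + \Gamma_0 \Omega_0 \Gamma_0^{\top},
\]
\[
  \eta \in \mathbb{R}^{u \times p}, \qquad
  \Omega \in \mathbb{R}^{u \times u} \succ 0, \qquad
  \Omega_0 \in \mathbb{R}^{(r-u) \times (r-u)} \succ 0.
\]

% Non-identifiability: for any u x u orthogonal matrix O,
% the rotated parameters give the same \beta and the same material part of \Sigma
\[
  (\Gamma O)(O^{\top} \eta) = \Gamma \eta, \qquad
  (\Gamma O)(O^{\top} \Omega O)(\Gamma O)^{\top} = \Gamma \Omega \Gamma^{\top}.
\]
```

Under this formulation only the subspace $\operatorname{span}(\Gamma)$ is identifiable, not $\Gamma$ itself, which is consistent with the abstract's remark that an EM algorithm for the envelope parameters needs special handling.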