Kang Joonsung
Department of Information Statistics, Gangneung-Wonju National University, Gangneung, Republic of Korea.
J Appl Stat. 2019 Jul 1;47(1):150-159. doi: 10.1080/02664763.2019.1635573. eCollection 2020.
Generalized linear mixed models have been widely used to analyze correlated data in many research areas. The linear mixed model with normal errors is a popular model for repeated measures and longitudinal data. Outliers, however, can severely distort the linear mixed model, which does not fully account for them. The M-estimator, one of the popular robust estimation methods, attains robustness at the expense of first-order or second-order efficiency, whereas the minimum Hellinger distance estimator is both efficient and robust. In this paper, we propose a more robust Bayesian version of parameter estimation via a pseudo posterior distribution based on the minimum Hellinger distance. It accommodates an appropriate nonparametric kernel density estimation for longitudinal data, requiring the proposed cross-validation estimator. We conduct a simulation study and real data analyses with the orthodontic study data and the Alzheimer's disease (AD) study data. In the simulation study, the proposed method shows smaller biases, mean squared errors, and standard errors than the (residual) maximum likelihood (REML) method in the presence of outliers or missing values. In the real data analyses, the standard errors and variance-covariance components for the proposed method are lower than those for the REML method in both data sets.
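To make the minimum Hellinger distance idea concrete, the following is a minimal sketch and not the method of the paper. It assumes a much simpler setting: i.i.d. observations from a normal model with known scale, a Gaussian kernel with a fixed bandwidth (no cross-validation), and a plain grid search for the mean rather than a Bayesian pseudo posterior or a mixed model. All function names, the bandwidth, and the contamination scheme are hypothetical choices made for illustration. The estimate is the mean that maximizes the affinity $\int \sqrt{f_\mu(x)\,\hat g(x)}\,dx$ between the model density $f_\mu$ and the kernel density estimate $\hat g$, which is equivalent to minimizing the Hellinger distance.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def kernel_density(data, grid, bandwidth):
    """Gaussian kernel density estimate evaluated at each grid point."""
    n = len(data)
    return [sum(normal_pdf(x, xi, bandwidth) for xi in data) / n for x in grid]


def mhd_location(data, sigma=1.0, bandwidth=0.5, grid_points=400, step_mu=0.05):
    """Grid-search minimum Hellinger distance estimate of a normal mean.

    Assumptions (illustrative only): sigma is known, the bandwidth is fixed,
    and the integral is approximated by a Riemann sum on an even grid.
    """
    lo, hi = min(data) - 3.0, max(data) + 3.0
    dx = (hi - lo) / (grid_points - 1)
    grid = [lo + i * dx for i in range(grid_points)]
    root_dens = [math.sqrt(g) for g in kernel_density(data, grid, bandwidth)]

    best_mu, best_affinity = None, -1.0
    n_candidates = int((max(data) - min(data)) / step_mu) + 1
    for k in range(n_candidates):
        mu = min(data) + k * step_mu
        # Affinity = integral of sqrt(f_mu * g_hat); maximizing it
        # minimizes the Hellinger distance between f_mu and g_hat.
        affinity = dx * sum(
            math.sqrt(normal_pdf(x, mu, sigma)) * r
            for x, r in zip(grid, root_dens)
        )
        if affinity > best_affinity:
            best_mu, best_affinity = mu, affinity
    return best_mu


def contaminated_sample(seed=0, n_clean=200, n_outliers=20, outlier_value=8.0):
    """Hypothetical data: N(0, 1) draws plus a cluster of gross outliers."""
    rng = random.Random(seed)
    clean = [rng.gauss(0.0, 1.0) for _ in range(n_clean)]
    return clean + [outlier_value] * n_outliers


if __name__ == "__main__":
    data = contaminated_sample()
    print("sample mean :", sum(data) / len(data))
    print("MHD estimate:", mhd_location(data))
```

In this toy setting the sample mean, which is the maximum likelihood estimate of the mean, is pulled toward the outlier cluster, while the Hellinger-based estimate stays close to the center of the clean data. That is the kind of robustness the abstract attributes to the minimum Hellinger distance approach.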