Aitkin M
Department of Statistics, University of Newcastle, UK.
Biometrics. 1999 Mar;55(1):117-28. doi: 10.1111/j.0006-341x.1999.00117.x.
This paper describes an EM algorithm for nonparametric maximum likelihood (ML) estimation in generalized linear models with variance component structure. The algorithm provides an alternative analysis to approximate MQL and PQL analyses (McGilchrist and Aisbett, 1991, Biometrical Journal 33, 131-141; Breslow and Clayton, 1993, Journal of the American Statistical Association 88, 9-25; McGilchrist, 1994, Journal of the Royal Statistical Society, Series B 56, 61-69; Goldstein, 1995, Multilevel Statistical Models) and to GEE analyses (Liang and Zeger, 1986, Biometrika 73, 13-22). The algorithm, first given by Hinde and Wood (1987, in Longitudinal Data Analysis, 110-126), is a generalization of that for random effect models for overdispersion in generalized linear models, described in Aitkin (1996, Statistics and Computing 6, 251-262). The algorithm is initially derived as a form of Gaussian quadrature assuming a normal mixing distribution, but with only slight variation it can be used for a completely unknown mixing distribution, giving a straightforward method for the fully nonparametric ML estimation of this distribution. This is of value because the ML estimates of the GLM parameters can be sensitive to the specification of a parametric form for the mixing distribution. The nonparametric analysis can be extended straightforwardly to general random parameter models, with full NPML estimation of the joint distribution of the random parameters. This can produce substantial computational savings compared with full numerical integration over a specified parametric distribution for the random parameters. A simple method is described for obtaining correct standard errors for parameter estimates when using the EM algorithm. Several examples are discussed involving simple variance component and longitudinal models, and small-area estimation.
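As an illustration of the kind of EM iteration the abstract describes, the following is a minimal sketch, not the paper's implementation. It assumes the simplest possible case: an intercept-only Poisson model in which every observation in a cluster shares one rate, and the unknown mixing distribution is estimated as K mass points (rates) with K masses (pi). The function name `npml_poisson_em` and all variable names are hypothetical. With covariates, the closed-form M-step used here would no longer apply and a weighted GLM fit would be needed instead.

```python
import numpy as np

def npml_poisson_em(y, cluster, K=3, n_iter=500, tol=1e-8):
    """EM for a K-point mixing distribution on cluster-level Poisson rates.

    Illustrative assumption: intercept-only model, so each mass point is a
    single Poisson rate shared by all observations in a cluster.
    """
    y = np.asarray(y, dtype=float)
    ids, idx = np.unique(np.asarray(cluster), return_inverse=True)
    m = len(ids)
    s = np.bincount(idx, weights=y, minlength=m)      # cluster totals
    n = np.bincount(idx, minlength=m).astype(float)   # cluster sizes

    # Starting values: mass points spread over quantiles of the cluster means.
    rates = np.quantile(s / n, (np.arange(K) + 0.5) / K) + 1e-3
    pi = np.full(K, 1.0 / K)

    history = []          # log-likelihood (up to the log(y!) constant)
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability that cluster i sits at mass point k.
        # The whole cluster enters as one product of Poisson densities.
        logp = (np.log(np.clip(pi, 1e-300, None))
                + s[:, None] * np.log(rates)
                - n[:, None] * rates)
        mx = logp.max(axis=1, keepdims=True)          # log-sum-exp stabilizer
        lik = np.exp(logp - mx)
        denom = lik.sum(axis=1, keepdims=True)
        w = lik / denom
        loglik = float((mx + np.log(denom)).sum())    # at current parameters
        history.append(loglik)

        # M-step: masses are average posterior weights; each mass point is a
        # weighted Poisson mean (closed form only because there are no covariates).
        pi = w.mean(axis=0)
        rates = (w * s[:, None]).sum(axis=0) / (w * n[:, None]).sum(axis=0)
        rates = np.clip(rates, 1e-12, None)           # guard against log(0)

        if loglik - prev < tol:
            break
        prev = loglik
    return rates, pi, w, history
```

The returned weights `w` are the posterior probabilities of each cluster belonging to each mass point at the last E-step. Standard errors are not computed in this sketch.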