Condon J, Kelly G, Bradshaw B, Leonard N
Department of Applied Mathematics and Theoretical Physics, The Queen's University of Belfast, Belfast BT7 1NN, Northern Ireland, UK.
Prev Vet Med. 2004 Jun 10;64(1):1-14. doi: 10.1016/j.prevetmed.2004.03.003.
Infection prevalence in a population is often estimated from grouped binary data expressed as proportions. The groups can be families, herds, flocks, farms, etc. The observed number of cases is generally assumed to have a Binomial distribution, and the estimate of prevalence is then the sample proportion of cases. However, the individual binary observations might not be independent, which leads to overdispersion. The goal of this paper was to demonstrate random-effects models for estimating infection prevalence from correlated data and, in particular, to illustrate a nonparametric random-effects model for this purpose. The nonparametric approach is a relatively recent addition to the random-effects class of models and does not appear to have been discussed previously in the veterinary epidemiology literature. The assumptions for a logistic-regression model with a nonparametric random effect were outlined. In a demonstration of the method on data relating to Salmonella infection in Irish pig herds, the nonparametric method classified herds into a small number of distinct prevalence groups (i.e. low, medium and high prevalence) and also estimated the relative frequency of each prevalence category in the population. We compared the estimates from a logistic model with a nonparametric distribution for the random effects with those from four alternative models: a logistic-regression model with no random effects, a marginal model using a generalised estimating equation (GEE) and two methods of fitting a Normally distributed random effect (the GLIMMIX macro and the NLMIXED procedure, both in SAS). Parameter estimates from random-effects models are not readily interpretable in terms of prevalences. Therefore, we outlined two methods for calculating population-averaged estimates of prevalence from random-effects models: one using numerical integration and the other using Monte Carlo simulation.
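The two population-averaging approaches named above can be illustrated with a minimal Python sketch. It is not the authors' SAS code. It assumes a random-intercept logistic model, logit(p) = beta0 + u, with u Normally distributed with mean 0 and standard deviation sigma, and it uses Gauss-Hermite quadrature as one possible form of numerical integration. The function names and the values of `beta0`, `sigma`, the mass points and their weights are hypothetical and are not taken from the paper. The last function covers the nonparametric case under the assumption that the random effect is a discrete distribution over a few mass points.

```python
import numpy as np


def expit(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


def prevalence_quadrature(beta0, sigma, n_nodes=40):
    """Population-averaged prevalence by Gauss-Hermite quadrature.

    Approximates E[expit(beta0 + u)] for u ~ Normal(0, sigma^2).
    hermgauss uses the weight exp(-x^2), hence the sqrt(2) rescaling
    of the nodes and the division by sqrt(pi).
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    values = expit(beta0 + np.sqrt(2.0) * sigma * nodes)
    return float(np.sum(weights * values) / np.sqrt(np.pi))


def prevalence_monte_carlo(beta0, sigma, n_draws=200_000, seed=0):
    """Population-averaged prevalence by Monte Carlo simulation.

    Draws random effects, converts each to a group-level prevalence,
    and averages over the draws.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(0.0, sigma, size=n_draws)
    return float(expit(beta0 + u).mean())


def prevalence_discrete(logits, masses):
    """Population-averaged prevalence for a discrete random effect.

    `logits` are the log-odds at each mass point (e.g. low, medium and
    high prevalence groups); `masses` are their relative frequencies.
    """
    masses = np.asarray(masses, dtype=float)
    masses = masses / masses.sum()  # normalise in case of rounding
    return float(np.sum(masses * expit(logits)))


if __name__ == "__main__":
    beta0, sigma = -2.0, 1.2  # hypothetical values, for illustration only
    print("conditional (u = 0):", float(expit(beta0)))
    print("quadrature:         ", prevalence_quadrature(beta0, sigma))
    print("Monte Carlo:        ", prevalence_monte_carlo(beta0, sigma))
    # Hypothetical three-group nonparametric fit.
    print("discrete mixture:   ",
          prevalence_discrete([-3.0, -1.0, 0.5], [0.6, 0.3, 0.1]))
```

In this sketch the conditional prevalence at u = 0 differs from the population-averaged value whenever sigma is greater than zero, which is why the fitted intercept of a random-effects model cannot be read directly as a prevalence.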