Westgate Philip M, West Brady T
Department of Biostatistics, College of Public Health, University of Kentucky, Lexington, KY 40536, U.S.A.
Survey Research Center, Institute for Social Research, University of Michigan-Ann Arbor, Ann Arbor, MI 48109, U.S.A.
J Surv Stat Methodol. 2021 Feb;9(1):141-158. doi: 10.1093/jssam/smz048. Epub 2020 Feb 17.
Weighted generalized estimating equations (GEE) are popular for the marginal analysis of longitudinal survey data because they provide consistent regression parameter estimates and corresponding standard error estimates as long as the model for the population mean and the survey weights are correctly specified. Although the data analyst must incorporate a working correlation structure within the weighted GEE, this structure need not be correctly specified. However, accurately modeling this structure can improve regression parameter estimation, that is, reduce standard errors, and so the selection of a working correlation structure for use within GEE has received considerable attention in standard longitudinal data analysis settings. In this manuscript, we describe how correlation selection criteria can be extended for use with weighted GEE when analyzing longitudinal survey data. Importantly, we provide and demonstrate an R function that we have created for such analyses. Furthermore, we discuss correlation selection when using existing software that does not have this explicit capability. The methods are demonstrated using data from a real survey in which we are interested in the mean number of falls that elderly individuals in a specific subpopulation experience over time.
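For orientation, the weighted GEE referred to in the abstract can be written in one common textbook form. This is a sketch under stated assumptions, not necessarily the notation used in the article: it assumes a single survey weight w_i per sampled individual i, and the symbols D_i, V_i, A_i, R_i(\alpha), and \phi are introduced here for illustration only.

```latex
% Weighted GEE for the regression parameter \beta (illustrative notation)
\sum_{i=1}^{n} w_i \, D_i^{\top} V_i^{-1} \bigl( Y_i - \mu_i(\beta) \bigr) = 0,
\qquad
V_i = \phi \, A_i^{1/2} \, R_i(\alpha) \, A_i^{1/2},
\qquad
D_i = \frac{\partial \mu_i(\beta)}{\partial \beta^{\top}}
```

Here Y_i is the vector of repeated outcomes for individual i, \mu_i(\beta) is its marginal mean, A_i is the diagonal matrix of variance functions, \phi is a dispersion parameter, and R_i(\alpha) is the working correlation matrix. Consistency of the estimate of \beta rests on \mu_i(\beta) and w_i, which is why R_i(\alpha) may be misspecified. The choice of R_i(\alpha), for example independence, exchangeable, or AR-1, instead affects efficiency, and this choice is what a correlation selection criterion is meant to guide.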