Wang Ming, Kong Lan, Li Zheng, Zhang Lijun
Division of Biostatistics and Bioinformatics, Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, U.S.A.
Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Hershey, PA, U.S.A.
Stat Med. 2016 May 10;35(10):1706-21. doi: 10.1002/sim.6817. Epub 2015 Nov 19.
Generalized estimating equations (GEE) are a general statistical method for fitting marginal models to longitudinal data in biomedical studies. The variance-covariance matrix of the estimated regression coefficients is usually estimated by a robust "sandwich" variance estimator, which does not perform satisfactorily when the sample size is small. Several modified variance estimators have been proposed to reduce its downward bias or to improve its efficiency. In this paper, we provide a comprehensive review of recent developments in modified variance estimators and compare their small-sample performance theoretically and numerically through simulation and real data examples. In particular, Wald tests and t-tests based on the different variance estimators are used for hypothesis testing, and, based on the numerical results, we provide guidelines on the sample size each estimator needs in order to preserve the type I error rate in general cases. Moreover, we develop a user-friendly R package, "geesmv", that incorporates all of these variance estimators for use in practice.
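To make the robust "sandwich" estimator and the idea of a small-sample correction concrete, the following is a minimal sketch in plain Python. It rests on several assumptions that are not taken from the paper: a linear marginal model y = b0 + b1*t with identity link, an independence working correlation (so the GEE point estimate coincides with ordinary least squares), and a simple degrees-of-freedom inflation K/(K - p) as the correction, where K is the number of subjects and p the number of coefficients. The function name `gee_sandwich` and the data layout are hypothetical, and the code does not reproduce the interface of the "geesmv" package.

```python
from typing import List, Sequence, Tuple

Vec = List[float]
Mat = List[List[float]]
Cluster = Tuple[Sequence[float], Sequence[float]]  # (times, responses) for one subject


def inv2(m: Mat) -> Mat:
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("design matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul2(a: Mat, b: Mat) -> Mat:
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def gee_sandwich(clusters: Sequence[Cluster]) -> Tuple[Vec, Mat, Mat]:
    """Return (beta, robust_cov, df_corrected_cov) for y = b0 + b1 * t.

    Assumes identity link and independence working correlation
    (illustrative assumptions, not the paper's general setting).
    """
    p = 2
    k = len(clusters)
    if k <= p:
        raise ValueError("need more subjects than coefficients")

    # "Bread": sum over subjects of X_i^T X_i, plus X^T y for the point estimate.
    bread = [[0.0, 0.0], [0.0, 0.0]]
    xty = [0.0, 0.0]
    for times, ys in clusters:
        for t, y in zip(times, ys):
            x = (1.0, float(t))
            for i in range(2):
                xty[i] += x[i] * y
                for j in range(2):
                    bread[i][j] += x[i] * x[j]
    bread_inv = inv2(bread)
    beta = [sum(bread_inv[i][j] * xty[j] for j in range(2)) for i in range(2)]

    # "Meat": sum over subjects of (X_i^T r_i)(X_i^T r_i)^T, with r_i the residuals.
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for times, ys in clusters:
        score = [0.0, 0.0]
        for t, y in zip(times, ys):
            resid = y - (beta[0] + beta[1] * t)
            score[0] += resid
            score[1] += t * resid
        for i in range(2):
            for j in range(2):
                meat[i][j] += score[i] * score[j]

    # Sandwich: bread^{-1} * meat * bread^{-1}.
    robust = matmul2(matmul2(bread_inv, meat), bread_inv)

    # Degrees-of-freedom inflation K / (K - p); an assumed, simple correction.
    scale = k / (k - p)
    corrected = [[scale * v for v in row] for row in robust]
    return beta, robust, corrected


if __name__ == "__main__":
    # Made-up toy data: four subjects, three visits each.
    data = [
        ([0, 1, 2], [1.0, 2.1, 2.9]),
        ([0, 1, 2], [0.5, 1.4, 2.6]),
        ([0, 1, 2], [1.2, 2.5, 3.1]),
        ([0, 1, 2], [0.9, 1.8, 3.3]),
    ]
    beta, robust, corrected = gee_sandwich(data)
    print("beta:", beta)
    print("robust variances:", robust[0][0], robust[1][1])
    print("corrected variances:", corrected[0][0], corrected[1][1])
```

With only four subjects the inflation factor is 4 / (4 - 2) = 2, which shows how strongly a correction can matter when K is small; as K grows the factor tends to 1 and the corrected and uncorrected estimators agree.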