Salazar Alejandro, Ojeda Begoña, Dueñas María, Fernández Fernando, Failde Inmaculada
Department of Biomedicine, Biotechnology and Public Health, University of Cádiz, Cádiz, Spain.
Department of Statistics and Operational Research, University of Cádiz, Cádiz, Spain.
Stat Med. 2016 Aug 30;35(19):3424-48. doi: 10.1002/sim.6947. Epub 2016 Apr 5.
Missing data are a common problem in clinical and epidemiological research, especially in longitudinal studies. Despite many methodological advances in recent decades, many papers on clinical trials and epidemiological studies either do not report using principled statistical methods to accommodate missing data or use ineffective or inappropriate techniques. Two refined techniques are presented here: generalized estimating equations (GEEs) and weighted generalized estimating equations (WGEEs). These techniques extend generalized linear models to longitudinal or clustered data, where observations are no longer independent. They can appropriately handle missing data when the data are missing completely at random (GEE and WGEE) or missing at random (WGEE), and they do not require the outcome to be normally distributed. Our aim is to describe these techniques for handling missing data in longitudinal studies subject to dropout in a simple way that is accessible to researchers, to illustrate them with a real example, and to show how to implement them in R. We apply them to assess the evolution of health-related quality of life in coronary patients in a data set subject to dropout. Copyright © 2016 John Wiley & Sons, Ltd.
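For orientation, the two estimators named in the abstract can be written in their standard textbook form. The notation below is an assumption of this note and is not taken from the paper: subject $i$ has outcome vector $y_i$ with mean $\mu_i(\beta)$, $D_i = \partial \mu_i / \partial \beta^{\top}$, $A_i$ is the diagonal matrix of variance functions, $C(\alpha)$ is the working correlation matrix, $\phi$ is the dispersion parameter, $r_{ij}$ indicates whether subject $i$ is observed at occasion $j$, and $\pi_{ij}$ is the (typically modelled) probability of that event given the observed history.

```latex
% GEE: solve for beta, with a working covariance V_i
\sum_{i=1}^{N} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) = 0,
\qquad
V_i = \phi \, A_i^{1/2} \, C(\alpha) \, A_i^{1/2}

% WGEE: the same equation with inverse-probability weights
\sum_{i=1}^{N} D_i^{\top} V_i^{-1} W_i \bigl( y_i - \mu_i(\beta) \bigr) = 0,
\qquad
W_i = \operatorname{diag}\!\left( \frac{r_{i1}}{\pi_{i1}}, \dots, \frac{r_{in_i}}{\pi_{in_i}} \right)
```

Under this formulation, each observed response is weighted by the inverse of its estimated probability of being observed, which is what allows the weighted version to remain valid when dropout depends on previously observed data. Other placements of the weights appear in the literature, so this should be read as one common variant and not necessarily the exact form used by the authors.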