[Analysis of longitudinal Gaussian data with missing data on the response variable].

Author information

Jacqmin-Gadda H, Commenges D, Dartigues J

Affiliation

INSERM U330, 146 rue Léo Saignat, 33076 Bordeaux Cedex.

Publication information

Rev Epidemiol Sante Publique. 1999 Dec;47(6):525-34.

Abstract

BACKGROUND

Using an application and a simulation study, we show the bias induced by missing outcome data in longitudinal studies and discuss which statistical methods suit each type of missing response when the variable under study is Gaussian.

METHOD

Gaussian longitudinal data are analyzed with the linear mixed effects model. When the probability of response depends neither on the missing values of the outcome nor on the parameters of the linear model, the missing data are ignorable, and the parameters of the linear mixed effects model can be estimated by maximum likelihood with standard software. When the missing data are non-ignorable, several methods have been proposed. We describe the method proposed by Diggle and Kenward (1994) (the DK method), for which software is available. This model combines a linear mixed effects model for the outcome variable with a logistic model for the probability of response, which depends on the outcome variable.
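
As a rough sketch of the two components described above, the combined model can be written as follows. The notation is an assumption for illustration and does not come from the abstract: $D_i$ denotes the first occasion at which subject $i$ has a missing response, and the dropout model is written with dependence on the current and previous outcome only, which is one common form of the Diggle and Kenward specification.

```latex
% Outcome model: linear mixed effects model for subject i
Y_i = X_i \beta + Z_i b_i + \varepsilon_i, \qquad
b_i \sim \mathcal{N}(0, G), \quad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2 I_{n_i})

% Dropout model: logistic regression on the current (possibly unobserved)
% outcome y_{ij} and the previous (observed) outcome y_{i,j-1}
\operatorname{logit} P(D_i = j \mid D_i \ge j,\, y_i)
  = \psi_0 + \psi_1 y_{ij} + \psi_2 y_{i,j-1}
```

In this parameterization, $\psi_1 = 0$ means that dropout depends only on values already observed, so the missing data are ignorable provided the dropout parameters are distinct from those of the outcome model. When $\psi_1 \neq 0$, dropout depends on the unobserved value itself and the missing data are non-ignorable.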

RESULTS

A simulation study shows the effectiveness of this method and its limits when the data are not normal. In that case, estimators obtained with the DK approach may be more biased than estimators obtained under the hypothesis of ignorable missing data, even when the missing data are non-ignorable. Data from the Paquid cohort on the evolution of scores on a neuropsychological test among elderly subjects show the bias of a naive analysis using all available data. Although missing responses are not ignorable in this study, estimates of the linear mixed effects model differ little between the DK approach and the hypothesis of ignorable missing data.
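
The bias of a naive analysis of all available data can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the function name `simulate_dropout_bias`, the random-intercept model, and every parameter value (`beta0`, `beta1`, `psi0`, `psi1`, the variances) are hypothetical choices made only for illustration. Subjects drop out with a probability that increases as their current score decreases, so the subjects who remain are a selected, higher-scoring group.

```python
import numpy as np


def simulate_dropout_bias(n_subjects=5000, n_visits=4, seed=0):
    """Return (bias of naive visit means, naive pooled slope).

    All parameter values are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    times = np.arange(n_visits)

    # Random-intercept linear mixed model:
    # y_ij = beta0 + a_i + beta1 * t_j + e_ij
    beta0, beta1 = 25.0, -1.0
    a = rng.normal(0.0, 3.0, size=(n_subjects, 1))         # subject effects
    e = rng.normal(0.0, 2.0, size=(n_subjects, n_visits))  # residual errors
    y = beta0 + a + beta1 * times + e

    # Non-ignorable dropout: the probability of dropping out at visit j
    # depends on the current score y_ij, which is then never observed.
    psi0, psi1 = 5.0, -0.3
    observed = np.ones(y.shape, dtype=bool)
    still_in = np.ones(n_subjects, dtype=bool)
    for j in range(1, n_visits):
        p_drop = 1.0 / (1.0 + np.exp(-(psi0 + psi1 * y[:, j])))
        drops = still_in & (rng.random(n_subjects) < p_drop)
        still_in &= ~drops
        observed[:, j] = still_in

    # Naive analysis: visit-wise means and a pooled least-squares slope
    # computed from the available data only.
    full_means = y.mean(axis=0)
    naive_means = np.array(
        [y[observed[:, j], j].mean() for j in range(n_visits)]
    )
    t_obs = np.broadcast_to(times, y.shape)[observed]
    naive_slope = np.polyfit(t_obs, y[observed], 1)[0]
    return naive_means - full_means, naive_slope


bias, naive_slope = simulate_dropout_bias()
print("bias of naive visit means:", np.round(bias, 2))
print("naive slope (true slope is -1.0):", round(float(naive_slope), 2))
```

Under these assumptions the naive visit means drift above the full-data means as follow-up proceeds, and the pooled slope understates the true decline. Fitting the DK model itself requires a joint likelihood for outcome and dropout and is not attempted here.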

CONCLUSION

Statistical methods for longitudinal data with non-ignorable missing responses are sensitive to hypotheses that are difficult to verify. In practical applications it is therefore preferable to perform an analysis under the hypothesis of ignorable missing responses and to compare the results with those of several approaches for non-ignorable missing data. However, such a strategy requires the development of new software.
