Wijesuriya Rushani, Moreno-Betancur Margarita, Carlin John B, White Ian R, Quartagno Matteo, Lee Katherine J
Clinical Epidemiology & Biostatistics (CEBU), Murdoch Children's Research Institute, Parkville, Australia.
Department of Paediatrics, University of Melbourne, Melbourne, Australia.
Stat Med. 2025 Feb 10;44(3-4):e10274. doi: 10.1002/sim.10274.
Longitudinal studies are frequently used in medical research and involve collecting repeated measures on individuals over time. Observations from the same individual are invariably correlated and thus an analytic approach that accounts for this clustering by individual is required. While almost all research suffers from missing data, this can be particularly problematic in longitudinal studies as participation often becomes harder to maintain over time. Multiple imputation (MI) is widely used to handle missing data in such studies. When using MI, it is important that the imputation model is compatible with the proposed analysis model. In a longitudinal analysis, this implies that the clustering considered in the analysis model should be reflected in the imputation process. Several MI approaches have been proposed to impute incomplete longitudinal data, such as treating repeated measurements of the same variable as distinct variables or using generalized linear mixed imputation models. However, the uptake of these methods has been limited, as they require additional data manipulation and use of advanced imputation procedures. In this tutorial, we review the available MI approaches that can be used for handling incomplete longitudinal data, including where individuals are clustered within higher-level clusters. We illustrate implementation with replicable R and Stata code using a case study from the Childhood to Adolescence Transition Study.
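One of the approaches named above, treating repeated measurements of the same variable as distinct variables, depends on reshaping the data from long format (one row per individual per wave) to wide format (one row per individual). The following is a minimal Python sketch of that reshaping step only. It is not the tutorial's R or Stata code, and the names `long_rows`, `long_to_wide`, `id`, `wave` and `score` and all data values are hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical long-format data: one row per individual per wave.
# None marks a missing measurement; names and values are illustrative only.
long_rows = [
    {"id": 1, "wave": 1, "score": 10.0},
    {"id": 1, "wave": 2, "score": None},
    {"id": 1, "wave": 3, "score": 12.5},
    {"id": 2, "wave": 1, "score": 8.0},
    {"id": 2, "wave": 3, "score": None},
]


def long_to_wide(rows, id_key="id", time_key="wave", value_key="score"):
    """Return one record per individual with one column per wave."""
    waves = sorted({r[time_key] for r in rows})
    by_id = defaultdict(dict)
    for r in rows:
        by_id[r[id_key]][r[time_key]] = r[value_key]

    wide = []
    for ident in sorted(by_id):
        record = {id_key: ident}
        for w in waves:
            # A wave with no row at all (here id 2, wave 2) is also treated as missing.
            record[f"{value_key}_w{w}"] = by_id[ident].get(w)
        wide.append(record)
    return wide


wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

In the wide layout each wave becomes its own column (`score_w1`, `score_w2`, `score_w3`), so a standard single-level imputation routine could in principle use an individual's other waves as predictors, which is how the within-individual correlation is carried into the imputation model. The imputation itself is not shown here.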