1 School of Mathematical Sciences, The University of Adelaide, Adelaide, Australia.
2 School of Public Health, The University of Adelaide, Adelaide, Australia.
Stat Methods Med Res. 2018 Oct;27(10):2918-2932. doi: 10.1177/0962280216689580. Epub 2017 Jan 18.
One purpose of a longitudinal study is to gain insight into how characteristics at earlier points in time affect subsequent outcomes. Typically, the outcome variable varies over time, and the data for each individual can be used to form a discrete path of measurements, that is, a trajectory. Group-based trajectory modelling methods seek to identify subgroups of individuals within a population whose trajectories are more similar to each other than to trajectories in other groups. One approach to modelling the influence of covariates measured at earlier time points in the group-based setting is to consider models in which these covariates affect the group membership probabilities. Models in which prior covariates affect the trajectories directly are also possible but are not considered here. In the present study, we compared six methods for estimating the effect of covariates on the group membership probabilities; the methods differ in how they account for the uncertainty in the group membership assignment. We found that when investigating the effect of one or several covariates on a group-based trajectory model, the full likelihood approach minimized the bias in the estimate of the covariate effect. In this '1-step' approach, the effect of the covariates and the trajectory model are estimated simultaneously. Of the '3-step' approaches, in which the effect of the covariates is assessed after the group-based trajectory model has been estimated, only Vermunt's improved 3-step approach resulted in estimated bias similar in size to that of the full likelihood approach. The remaining methods resulted in considerably higher bias in the covariate effect estimates and should not be used. In addition to demonstrating empirically that the probability regression approach is biased, we have shown analytically that it is biased in general.
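The contrast between ignoring and accounting for assignment uncertainty can be illustrated with a small simulation. The following is a minimal sketch under strong simplifying assumptions, not the simulation design of the study: there are two groups with flat trajectories at 0 and 1, unit-variance normal errors, a single covariate, and the trajectory parameters are treated as known, so that only the covariate effect is estimated. All function names, sample sizes and parameter values are hypothetical. The "modal assignment" estimator shown here is a generic classify-then-analyse shortcut and is not claimed to be one of the six methods compared; the "full likelihood" estimator maximizes the mixture likelihood by EM with the trajectory parameters held fixed, whereas a genuine 1-step approach would estimate them jointly.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(x, target, n_iter=15):
    """Newton-Raphson logistic regression of target on (1, x).

    target may hold 0/1 labels or fractional membership weights.
    Returns the array [intercept, slope].
    """
    design = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = sigmoid(design @ beta)
        gradient = design.T @ (target - p)
        information = (design * (p * (1.0 - p))[:, None]).T @ design
        beta = beta + np.linalg.solve(information, gradient)
    return beta


def simulate(n=20000, n_times=4, true_beta=(0.0, 1.0), seed=0):
    """Assumed data-generating model (hypothetical values throughout)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    # Group membership depends on the covariate through a logistic model.
    group = rng.random(n) < sigmoid(true_beta[0] + true_beta[1] * x)
    # Flat trajectories: mean 1 in group 1, mean 0 in group 0, unit variance.
    means = np.where(group[:, None], 1.0, 0.0)
    y = means + rng.normal(size=(n, n_times))
    return x, group.astype(float), y


def log_lik_ratio(y):
    """log f1(y) - log f0(y) for unit-variance normals with means 1 and 0."""
    return np.sum(y - 0.5, axis=1)


def fit_full_likelihood(x, llr, n_iter=100):
    """EM for the covariate effect, trajectory parameters held fixed."""
    beta = np.zeros(2)
    for _ in range(n_iter):
        # E-step: posterior membership probability given x and y.
        weights = sigmoid(beta[0] + beta[1] * x + llr)
        # M-step: logistic regression with fractional responses.
        beta = fit_logistic(x, weights)
    return beta


def compare_methods(n=20000, seed=0):
    x, group, y = simulate(n=n, seed=seed)
    llr = log_lik_ratio(y)
    # Benchmark that uses the (normally unobserved) true group labels.
    oracle = fit_logistic(x, group)
    # Classify-then-analyse: posterior from a covariate-free model with
    # equal prior weights (the marginal split is 50/50 by symmetry here).
    modal = (sigmoid(llr) > 0.5).astype(float)
    naive = fit_logistic(x, modal)
    full = fit_full_likelihood(x, llr)
    return {
        "oracle": float(oracle[1]),
        "modal_assignment": float(naive[1]),
        "full_likelihood": float(full[1]),
    }


if __name__ == "__main__":
    for name, slope in compare_methods().items():
        print(f"{name}: estimated covariate effect = {slope:.3f}")
```

Under these assumptions the true covariate effect is 1. Because roughly one in six individuals is misclassified by modal assignment in this setup, the classify-then-analyse estimate is expected to be attenuated towards zero, while the likelihood-based estimate should stay close to the true value.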