Biostatistics and Epidemiology/Research Design Component, Division of Clinical and Translational Sciences, Department of Internal Medicine, University of Texas McGovern Medical School, Houston, Texas, USA.
Division of Biostatistics and Epidemiology, Cincinnati Children's Hospital, Cincinnati, Ohio, USA.
Stat Med. 2021 Mar 30;40(7):1845-1858. doi: 10.1002/sim.8875. Epub 2021 Jan 10.
A frequent problem in longitudinal studies is that data may be assessed at subject-selected, irregularly spaced time points. This produces highly unbalanced outcome data and can induce bias, especially if the availability of data is directly related to the outcome. Our aim was to develop a multivariate joint model in a mixed outcomes framework to minimize irregular sampling bias. We demonstrate the approach using blood glucose monitoring throughout pregnancy and the risk of preterm birth among women with type 1 diabetes mellitus. Blood glucose measurements were unequally spaced, and the intensity of sampling varied between and within individuals over time. We specified a multivariate linear mixed effects submodel for the longitudinal outcome (blood glucose), a Poisson model for the intensity of glucose sampling, and a logistic regression model for the binary process (preterm birth). Association between the submodels is captured through shared random effects. Markov chain Monte Carlo methods were used to fit the model. The multivariate joint model provided better prediction than either a joint model with a multivariate linear mixed effects submodel that ignores the intensity of glucose sampling or a two-stage model. Most association parameters in the preterm birth outcome model were significant, indicating that sharing random effects between glucose monitoring and preterm birth improves prediction of the binary endpoint. A simulation study is presented to illustrate the effectiveness of the multivariate joint modeling approach.
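To make the shared random effects structure concrete, the following is a minimal simulation sketch, not the authors' model. It assumes a single subject-level random intercept `b` that enters all three submodels: a linear mixed model for glucose, a Poisson model for the number of glucose measurements, and a logistic model for preterm birth. The function name `simulate_joint_data`, every coefficient value, and the 0 to 40 week time scale are hypothetical choices made only for illustration; the abstract does not report the actual specification.

```python
import numpy as np


def simulate_joint_data(n_subjects=200, seed=0):
    """Simulate from a toy shared-random-intercept joint model.

    All coefficients below are illustrative assumptions, not estimates
    from the paper.
    """
    rng = np.random.default_rng(seed)

    # Shared subject-level random intercept linking the three submodels.
    b = rng.normal(0.0, 1.0, size=n_subjects)

    # Poisson submodel: sampling intensity depends on b, so subjects with
    # higher latent glucose levels are measured more often.
    rate = np.exp(1.5 + 0.4 * b)
    n_obs = rng.poisson(rate)

    # Irregular, subject-specific measurement times (assumed weeks 0 to 40).
    times = [np.sort(rng.uniform(0.0, 40.0, size=k)) for k in n_obs]

    # Linear mixed submodel: glucose = fixed trend + random intercept + noise.
    glucose = [
        120.0 + 0.2 * t + 15.0 * b_i + rng.normal(0.0, 10.0, size=t.size)
        for t, b_i in zip(times, b)
    ]

    # Logistic submodel: preterm birth probability also depends on b.
    p_preterm = 1.0 / (1.0 + np.exp(-(-1.5 + 0.8 * b)))
    preterm = rng.binomial(1, p_preterm)

    return {
        "b": b,
        "n_obs": n_obs,
        "times": times,
        "glucose": glucose,
        "preterm": preterm,
    }


data = simulate_joint_data(n_subjects=50, seed=1)
```

Because `b` drives both how often glucose is measured and the outcome, an analysis that ignores the sampling process sees more data from higher-risk subjects, which is the kind of irregular sampling bias the abstract describes. Fitting such a model would typically be done with MCMC software, which this sketch does not attempt.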