Lin H, McCulloch C E, Turnbull B W, Slate E H, Clark L C
Department of Statistical Science, Cornell University, Ithaca, NY 14853, USA.
Stat Med. 2000 May 30;19(10):1303-18. doi: 10.1002/(sici)1097-0258(20000530)19:10<1303::aid-sim424>3.0.co;2-e.
This paper considers a latent class model to uncover subpopulation structure for both biomarker trajectories and the probability of disease outcome in highly unbalanced longitudinal data. A specific pattern of trajectories can be viewed as a latent class in a finite mixture, where membership in latent classes is modelled with a polychotomous logistic regression. The biomarker trajectories within a latent class are described by a linear mixed model with possibly time-dependent covariates, and the probabilities of disease outcome are estimated via a class-specific model. Thus the method characterizes biomarker trajectory patterns to unveil the relationship between trajectories and outcomes of disease. The coefficients for the model are estimated via a generalized EM (GEM) algorithm, a natural tool to use when latent classes and random coefficients are present. Standard errors of the coefficients are calculated using a parametric bootstrap. The model fitting procedure is illustrated with data from the Nutritional Prevention of Cancer trials; we use prostate-specific antigen (PSA) as the biomarker for prostate cancer, and the goal is to examine trajectories of PSA serial readings in individual subjects in connection with incidence of prostate cancer.
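To make the structure described in the abstract concrete, the following is a minimal sketch of how posterior latent class probabilities for one subject might be computed, as in the E-step of an EM-type algorithm. It is an illustration under simplifying assumptions, not the authors' implementation: each class is assumed to have a linear trajectory with a single random intercept (the paper allows more general linear mixed models with time-dependent covariates), the disease outcome is treated as a Bernoulli variable with a class-specific probability, and all function names, parameter names, and numeric values are hypothetical.

```python
import math


def class_priors(covariates, gammas):
    """Polychotomous (multinomial) logistic class-membership probabilities.

    gammas[k] is the coefficient vector for class k; setting gammas[0] to
    zeros makes class 0 the reference class.
    """
    scores = [sum(g * x for g, x in zip(gamma, covariates)) for gamma in gammas]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def trajectory_loglik(times, values, intercept, slope, sigma2, tau2):
    """Marginal log-likelihood of one subject's readings (assumed model).

    y_j = intercept + slope * t_j + b + e_j, with b ~ N(0, tau2) and
    e_j ~ N(0, sigma2). The marginal covariance is sigma2 * I + tau2 * J,
    whose inverse and determinant have closed forms, so subjects with
    different numbers of readings (unbalanced data) need no special handling.
    """
    n = len(times)
    resid = [y - (intercept + slope * t) for t, y in zip(times, values)]
    sum_sq = sum(r * r for r in resid)
    sum_r = sum(resid)
    quad = (sum_sq - tau2 * sum_r * sum_r / (sigma2 + n * tau2)) / sigma2
    logdet = (n - 1) * math.log(sigma2) + math.log(sigma2 + n * tau2)
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + quad)


def outcome_loglik(disease, prob):
    """Bernoulli log-likelihood of the disease outcome within a class."""
    return math.log(prob) if disease else math.log(1.0 - prob)


def posterior_class_probs(covariates, times, values, disease, classes, gammas):
    """Posterior class probabilities: prior x trajectory x outcome, normalized."""
    priors = class_priors(covariates, gammas)
    log_joint = [
        math.log(p)
        + trajectory_loglik(times, values, c["intercept"], c["slope"],
                            c["sigma2"], c["tau2"])
        + outcome_loglik(disease, c["p_disease"])
        for p, c in zip(priors, classes)
    ]
    top = max(log_joint)
    weights = [math.exp(v - top) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical two-class setup: a "flat" and a "rising" biomarker pattern.
classes = [
    {"intercept": 1.0, "slope": 0.0, "sigma2": 0.25, "tau2": 0.5, "p_disease": 0.05},
    {"intercept": 1.0, "slope": 0.5, "sigma2": 0.25, "tau2": 0.5, "p_disease": 0.40},
]
gammas = [[0.0, 0.0], [-1.0, 0.02]]  # covariates: [1 (intercept), age]

# One subject with four irregularly spaced readings and a positive outcome.
post = posterior_class_probs(
    covariates=[1.0, 65.0],
    times=[0.0, 1.0, 2.5, 4.0],
    values=[1.1, 1.6, 2.2, 3.1],
    disease=True,
    classes=classes,
    gammas=gammas,
)
print(post)
```

In a full GEM fit, these posterior probabilities would serve as weights when updating the class-membership, trajectory, and outcome parameters, and the parametric bootstrap would repeat the whole fit on data simulated from the estimated model. Neither step is shown here.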