Schildcrout Jonathan S, Heagerty Patrick J
Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, TN 37232-2158, USA.
Biostatistics. 2008 Oct;9(4):735-49. doi: 10.1093/biostatistics/kxn006. Epub 2008 Mar 27.
A typical longitudinal study prospectively collects both repeated measures of a health status outcome and covariates that serve either as the primary predictors of interest or as important adjustment factors. In many situations, all covariates are measured on the entire study cohort. However, in some scenarios the primary covariates are time dependent yet can be ascertained retrospectively after completion of the study. A common example is covariate measurement based on stored biological specimens such as blood plasma. Although authors have previously proposed generalizations of the standard case-control design in which the clustered outcome measurements are used to selectively ascertain covariates (Neuhaus and Jewell, 1990), thereby providing resource-efficient collection of information, these designs do not appear to be commonly used. One potential barrier to the use of longitudinal outcome-dependent sampling designs has been the lack of a flexible class of likelihood-based analysis methods. With the relatively recent development of flexible and practical methods such as generalized linear mixed models (Breslow and Clayton, 1993) and marginalized models for categorical longitudinal data (see Heagerty and Zeger, 2000, for an overview), the class of likelihood-based methods is now sufficiently well developed to capture the major forms of longitudinal correlation found in biomedical repeated measures data. The goal of this manuscript is therefore to promote the consideration of outcome-dependent longitudinal sampling designs and to outline and evaluate the basic conditional likelihood analysis that allows for valid statistical inference.