Epidemiology. 2018 Jan;29(1):58-66. doi: 10.1097/EDE.0000000000000765.
Epidemiologists have long used case-control and related study designs to increase response variability and the information available for estimating exposure-disease associations. Comparable designs for longitudinal data have received less attention.
We discuss an epidemiological study design and analysis approach for longitudinal binary response data. We seek to gain statistical efficiency by oversampling relatively informative subjects for inclusion into the sample. In this methodological demonstration, we develop this concept by sampling repeatedly from an existing cohort study to estimate the relationship of chronic obstructive pulmonary disease to past-year smoking in a panel of baseline smokers. To account for oversampling, we describe a sequential offsetted regressions approach for valid inferences in this setting.
Targeted sampling can increase statistical efficiency when combined with sequential offsetted regressions. Efficiency gains diminish as the prevalence of the disease response increases and as the association between the sampling variable and the response weakens, and they also depend on other design and analysis parameters. These findings offer guidance to those wishing to use such designs in the future.
These designs hold promise for efficient use of resources in longitudinal cohort studies.
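The role of an offset under oversampling can be illustrated with a deliberately simplified sketch. It is not the sequential offsetted regressions approach described above, which handles longitudinal binary responses. It is only the familiar single-time-point analog: when subjects are sampled with probability pi1 if the response is 1 and pi0 if it is 0, a logistic regression fitted to the sample with the fixed offset log(pi1/pi0) targets the population coefficients. The data are simulated, and every name and number here (`fit_logistic`, `pi1`, `pi0`, `true_beta`, the sample size) is an assumption made for illustration.

```python
import numpy as np


def fit_logistic(X, y, offset=None, n_iter=50, tol=1e-10):
    """Fit logit P(y=1) = offset + X @ beta by Newton-Raphson with step-halving."""
    offset = np.zeros(len(y)) if offset is None else offset

    def loglik(b):
        eta = offset + X @ b
        return np.sum(y * eta - np.logaddexp(0.0, eta))

    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = offset + X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        # Halve the step until the log-likelihood does not decrease.
        t = 1.0
        while loglik(beta + t * step) < loglik(beta) and t > 1e-8:
            t /= 2.0
        beta = beta + t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return beta


# Simulated population (hypothetical values, not taken from the study).
rng = np.random.default_rng(0)
n = 200_000
x = rng.binomial(1, 0.5, size=n)              # binary exposure
X = np.column_stack([np.ones(n), x])          # intercept + exposure
true_beta = np.array([-3.0, 0.7])             # assumed population coefficients
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ true_beta))))

# Outcome-dependent sampling: keep every y=1 subject, 10% of y=0 subjects.
pi1, pi0 = 1.0, 0.1
keep = rng.random(n) < np.where(y == 1, pi1, pi0)
Xs, ys = X[keep], y[keep]

# Ignoring the design shifts the intercept by log(pi1/pi0).
naive = fit_logistic(Xs, ys)

# The fixed offset log(pi1/pi0) accounts for the unequal sampling.
offset = np.full(ys.shape, np.log(pi1 / pi0))
corrected = fit_logistic(Xs, ys, offset)

print("naive:    ", naive)
print("corrected:", corrected)
```

In this sketch the exposure coefficient is essentially the same in both fits, while the intercept of the naive fit is shifted by log(pi1/pi0). Because the sampling here depends only on the response, the offset is a single constant. In a longitudinal design where sampling depends on a summary of the response series, the offset would generally vary across observations, which is the harder problem the sequential approach is meant to address.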