Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, Tennessee, USA.
Stat Med. 2021 Apr 15;40(8):1863-1876. doi: 10.1002/sim.8876. Epub 2021 Jan 13.
Two-phase outcome-dependent sampling (ODS) designs are useful when resource constraints prohibit expensive exposure ascertainment on all study subjects. One class of ODS designs for longitudinal binary data stratifies subjects into three strata according to those who experience the event at none, some, or all follow-up times. For time-varying covariate effects, exclusively selecting subjects with response variation can yield highly efficient estimates. However, if interest lies in the association of a time-invariant covariate, or the joint associations of time-varying and time-invariant covariates with the outcome, then the optimal design is unknown. Therefore, we propose a class of two-wave two-phase ODS designs for longitudinal binary data. We split the second-phase sample selection into two waves, between which an interim design evaluation analysis is conducted. The interim design evaluation analysis uses first-wave data to conduct a simulation-based search for the optimal second-wave design that will improve the likelihood of study success. Although we focus on longitudinal binary response data, the proposed design is general and can be applied to other response distributions. We believe that the proposed designs can be useful in settings where (1) the expected second-phase sample size is fixed and one must tailor stratum-specific sampling probabilities to maximize estimation efficiency, or (2) relative sampling probabilities are fixed across sampling strata and one must tailor sample size to achieve a desired precision. We describe the class of designs, examine finite sampling operating characteristics, and apply the designs to an exemplar longitudinal cohort study, the Lung Health Study.
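As a rough illustration of setting (1) above, the following sketch stratifies subjects by whether the event occurred at none, some, or all follow-up times, then rescales relative stratum weights so that the expected second-phase sample size matches a target. It is not the authors' procedure: the function names, the relative weights, and the toy counts are all hypothetical, and the simulation-based interim search that would choose the weights is not implemented here.

```python
import random


def stratum(y):
    """Classify a subject's binary response vector as 'none', 'some', or 'all'."""
    total = sum(y)
    if total == 0:
        return "none"
    if total == len(y):
        return "all"
    return "some"


def sampling_probs(counts, rel_weights, expected_n):
    """Scale relative stratum weights (assumed inputs) into sampling probabilities
    whose expected sample size equals expected_n."""
    denom = sum(counts[s] * rel_weights[s] for s in counts)
    scale = expected_n / denom
    probs = {s: scale * rel_weights[s] for s in counts}
    if any(p > 1 for p in probs.values()):
        raise ValueError("expected_n is too large for these relative weights")
    return probs


def draw_sample(responses, probs, rng):
    """Independently select each subject with its stratum's sampling probability."""
    return [i for i, y in enumerate(responses) if rng.random() < probs[stratum(y)]]


# Toy first-phase cohort (hypothetical): 60 'none', 30 'some', 10 'all' subjects.
responses = [[0, 0, 0]] * 60 + [[0, 1, 0]] * 30 + [[1, 1, 1]] * 10
counts = {"none": 60, "some": 30, "all": 10}

# Oversample subjects with response variation by a hypothetical factor of 4.
probs = sampling_probs(counts, {"none": 1, "some": 4, "all": 1}, expected_n=38)
selected = draw_sample(responses, probs, random.Random(0))
```

In a two-wave version, one would presumably run this selection once for the first wave, use the resulting data to compare candidate weight vectors by simulation, and then rerun it on the remaining subjects with the chosen second-wave weights.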