Department of Medicine, University of Washington, Seattle, WA 98195, USA.
Department of Biostatistics, Vanderbilt School of Medicine, Nashville, TN 37203, USA.
Stat Med. 2018 Jun 15;37(13):2120-2133. doi: 10.1002/sim.7633. Epub 2018 Mar 15.
The use of outcome-dependent sampling with longitudinal data analysis has previously been shown to improve efficiency in the estimation of regression parameters. The motivating scenario is when outcome data exist for all cohort members but key exposure variables will be gathered only on a subset. Inference with outcome-dependent sampling designs that also incorporates incomplete information from those individuals who did not have their exposure ascertained has been investigated for univariate but not longitudinal outcomes. Therefore, with a continuous longitudinal outcome, we explore the relative contributions of various sources of information toward the estimation of key regression parameters using a likelihood framework. We evaluate the efficiency gains that alternative estimators might offer over random sampling, and we offer insight into their relative merits in select practical scenarios. Finally, we illustrate the potential impact of design and analysis choices using data from the Cystic Fibrosis Foundation Patient Registry.
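As a rough illustration of the design setting described above, the following sketch simulates a cohort in which the longitudinal outcome is observed for everyone, then selects a subset for exposure ascertainment with higher probability in the tails of a subject-level outcome summary. Everything specific here is an assumption made for illustration, not the authors' design: the data-generating model, the choice of the per-subject least-squares slope as the sampling summary, the 10th/90th percentile cutoffs, the 20% central allocation, and the helper names `simulate_cohort`, `ols_slope`, and `ods_sample` are all hypothetical.

```python
import random
import statistics


def simulate_cohort(n_subjects=500, n_visits=5, seed=0):
    """Simulate a continuous longitudinal outcome for every cohort member.

    Assumed (hypothetical) model: random intercept and slope, plus a binary
    exposure that steepens the decline. The exposure is stored here, but in
    the design it would be unknown until a subject is sampled.
    """
    rng = random.Random(seed)
    cohort = []
    for i in range(n_subjects):
        exposure = 1 if rng.random() < 0.3 else 0
        b0 = rng.gauss(0, 1.0)   # subject-specific intercept deviation
        b1 = rng.gauss(0, 0.3)   # subject-specific slope deviation
        times = list(range(n_visits))
        y = [(2.0 + b0) + (-0.5 - 0.4 * exposure + b1) * t + rng.gauss(0, 0.5)
             for t in times]
        cohort.append({"id": i, "exposure": exposure, "times": times, "y": y})
    return cohort


def ols_slope(times, y):
    """Least-squares slope of y on time for one subject."""
    t_bar = statistics.fmean(times)
    y_bar = statistics.fmean(y)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - y_bar) for t, v in zip(times, y))
    return sxy / sxx


def ods_sample(cohort, n_sample=100, low_q=0.1, high_q=0.9,
               central_frac=0.2, seed=1):
    """Select subjects for exposure ascertainment based on outcome slopes.

    Subjects are split into low / middle / high strata by their estimated
    slope, and the tails are oversampled. Returns the sampled records and
    the per-stratum sampling probabilities, which a design-aware analysis
    would need.
    """
    rng = random.Random(seed)
    ordered = sorted(cohort, key=lambda s: ols_slope(s["times"], s["y"]))
    n = len(ordered)
    lo_cut, hi_cut = int(low_q * n), int(high_q * n)
    strata = {
        "low": ordered[:lo_cut],
        "middle": ordered[lo_cut:hi_cut],
        "high": ordered[hi_cut:],
    }
    n_mid = int(central_frac * n_sample)
    n_tail = (n_sample - n_mid) // 2
    targets = {"low": n_tail, "middle": n_mid, "high": n_tail}

    sampled, probs = [], {}
    for name, members in strata.items():
        k = min(targets[name], len(members))
        probs[name] = k / len(members)
        for s in rng.sample(members, k):
            # Exposure is ascertained only for the sampled subjects.
            sampled.append({"id": s["id"], "stratum": name,
                            "exposure": s["exposure"]})
    return sampled, probs


cohort = simulate_cohort()
sampled, probs = ods_sample(cohort)
print(len(sampled), probs)
```

The returned stratum sampling probabilities matter because the selected subjects are not a random subset of the cohort; an analysis would need to account for the design, for example through a likelihood that conditions on selection, and could additionally draw on the outcome data of unsampled subjects as the abstract describes.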