Mark S D, Katki H
Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA. sm7v@.nih.gov
Lifetime Data Anal. 2001 Dec;7(4):331-44. doi: 10.1023/a:1012533130596.
Recognizing that the efficiency in relative risk estimation for the Cox proportional hazards model is largely constrained by the total number of cases, Prentice (1986) proposed the case-cohort design in which covariates are measured on all cases and on a random sample of the cohort. Subsequent to Prentice, other methods of estimation and sampling have been proposed for these designs. We formalize an approach to variance estimation suggested by Barlow (1994), and derive a robust variance estimator based on the influence function. We consider the applicability of the variance estimator to all the proposed case-cohort estimators, and derive the influence function when known sampling probabilities in the estimators are replaced by observed sampling fractions. We discuss the modifications required when cases are missing covariate information. The missingness may occur by chance, and be completely at random; or may occur as part of the sampling design, and depend upon other observed covariates. We provide an adaptation of S-plus code that allows estimating influence function variances in the presence of such missing covariates. Using examples from our current case-cohort studies on esophageal and gastric cancer, we illustrate how our results are useful in solving design and analytic issues that arise in practice.
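As a rough illustration of the idea behind an influence-function variance, the sketch below sums the outer products of per-subject influence values. It is not the authors' S-plus code and does not fit a Cox model. It uses the sample mean, whose influence values are known in closed form, as a stand-in. The names `influence_variance` and `mean_influence` and the toy data are hypothetical. In a case-cohort analysis the rows would instead be each subject's estimated influence on the log relative risk estimates, which this sketch does not compute.

```python
from typing import List, Sequence


def influence_variance(influence: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return sum_i d_i d_i^T, where row d_i is subject i's influence on the estimate."""
    p = len(influence[0])
    var = [[0.0] * p for _ in range(p)]
    for d in influence:
        for j in range(p):
            for k in range(p):
                var[j][k] += d[j] * d[k]
    return var


def mean_influence(x: Sequence[float]) -> List[List[float]]:
    """Influence of each observation on the sample mean: (x_i - mean) / n."""
    n = len(x)
    mean = sum(x) / n
    return [[(xi - mean) / n] for xi in x]


# Toy data, made up for illustration only (assumption, not from the paper).
x = [2.0, 4.0, 4.0, 6.0]
v = influence_variance(mean_influence(x))[0][0]
print(v)
```

For the sample mean this reduces to the population variance of the data divided by n, which gives a simple check that the outer-product sum behaves as expected.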