Zhou Qingning, Cai Jianwen, Zhou Haibo
Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A.
Biometrics. 2018 Mar;74(1):58-67. doi: 10.1111/biom.12744. Epub 2017 Aug 3.
Epidemiologic studies and disease prevention trials often seek to relate an exposure variable to a failure time that is subject to interval censoring. When the failure rate is low and the time intervals are wide, a large cohort is often required to estimate the relationship between exposure and failure time with adequate precision. However, large cohort studies with simple random sampling can be prohibitively expensive for investigators with a limited budget, especially when the exposure variables are costly to obtain. Alternative cost-effective sampling designs and inference procedures are therefore desirable. We propose an outcome-dependent sampling (ODS) design for interval-censored failure time data, in which we enrich the observed sample by selectively including certain more informative failure subjects. We develop a novel sieve semiparametric maximum empirical likelihood approach for fitting the proportional hazards model to data from the proposed interval-censoring ODS design. This approach employs the empirical likelihood and sieve methods to deal with the infinite-dimensional nuisance parameters, which greatly reduces the dimensionality of the estimation problem and eases the computational difficulty. The consistency and asymptotic normality of the resulting regression parameter estimator are established. The results from our extensive simulation study show that the proposed design and method work well in practical situations and are more efficient than the alternative designs and competing approaches. An example from the Atherosclerosis Risk in Communities (ARIC) study is provided for illustration.
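The sampling idea described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes a constant baseline hazard, a single standard normal exposure, fixed examination times, and a supplemental sample drawn at random from subjects whose failure was observed within a finite interval. The function names `simulate_cohort` and `draw_ods_sample`, and all parameter values, are hypothetical choices made for illustration. Estimation by the sieve maximum empirical likelihood approach is not shown.

```python
import math
import random


def simulate_cohort(n, beta=0.5, base_rate=0.05,
                    visits=(2.0, 4.0, 6.0, 8.0), seed=0):
    """Simulate a cohort with interval-censored failure times.

    Assumption for illustration: proportional hazards with a constant
    baseline hazard, so the hazard is base_rate * exp(beta * x).
    Subjects are examined only at the fixed visit times, so each failure
    time is known only to lie in an interval (left, right].
    """
    rng = random.Random(seed)
    cohort = []
    for i in range(n):
        x = rng.gauss(0.0, 1.0)                 # exposure (costly to measure in practice)
        rate = base_rate * math.exp(beta * x)   # subject-specific hazard
        t = rng.expovariate(rate)               # latent failure time, never observed exactly
        left, right = 0.0, math.inf             # right = inf means no failure seen by last visit
        for v in visits:
            if t > v:
                left = v                        # still failure-free at this visit
            else:
                right = v                       # failure first detected at this visit
                break
        cohort.append({"id": i, "x": x, "left": left, "right": right})
    return cohort


def draw_ods_sample(cohort, n_srs, n_supp, seed=1):
    """Draw a simple random sample plus a supplemental sample of failures.

    Assumption for illustration: the 'more informative' subjects are those
    outside the simple random sample whose failure was observed, i.e. whose
    interval has a finite right endpoint.
    """
    rng = random.Random(seed)
    srs = rng.sample(cohort, n_srs)
    srs_ids = {s["id"] for s in srs}
    pool = [s for s in cohort
            if s["id"] not in srs_ids and math.isfinite(s["right"])]
    supp = rng.sample(pool, min(n_supp, len(pool)))
    return srs, supp


cohort = simulate_cohort(2000)
srs, supp = draw_ods_sample(cohort, n_srs=300, n_supp=100)

# Compare the share of observed failures in the SRS with the enriched sample.
frac_srs = sum(math.isfinite(s["right"]) for s in srs) / len(srs)
combined = srs + supp
frac_ods = sum(math.isfinite(s["right"]) for s in combined) / len(combined)
print(f"observed failures: SRS {frac_srs:.2f}, SRS + supplement {frac_ods:.2f}")
```

In a real study the exposure would be measured only for the selected subjects, which is where the cost saving comes from. Because the combined sample over-represents failures, a naive analysis that treats it as a simple random sample would be biased; the likelihood has to account for the selection, which is the role of the estimation approach described in the abstract.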