Wei Jiawei, Carroll Raymond J, Müller Ursula U, Van Keilegom Ingrid, Chatterjee Nilanjan
Texas A&M University, College Station, USA.
J R Stat Soc Series B Stat Methodol. 2013 Jan 1;75(1):185-206. doi: 10.1111/j.1467-9868.2012.01052.x.
Primary analysis of case-control studies focuses on the relationship between disease D and a set of covariates of interest (Y, X). A secondary application of the case-control study, which is often invoked in modern genetic epidemiologic association studies, is to investigate the interrelationship between the covariates themselves. The task is complicated by the case-control sampling, under which the regression of Y on X differs from what it is in the population. Previous work has assumed a parametric distribution for Y given X and derived semiparametric efficient estimation and inference without any distributional assumptions about X. We take up the issue of estimating a regression function when Y given X follows a homoscedastic regression model, but the distribution of Y is otherwise unspecified. The semiparametric efficient approaches can be used to construct semiparametric efficient estimates, but they suffer from a lack of robustness to the assumed model for Y given X. We take an entirely different approach. We show how to estimate the regression parameters consistently even if the assumed model for Y given X is incorrect, and thus the estimates are model robust. For this we assume that the disease rate is known or well estimated. The assumption can be dropped when the disease is rare, which is typically so for most case-control studies, and the estimation algorithm then simplifies. Simulations and empirical examples are used to illustrate the approach.
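The sampling problem described in the abstract can be illustrated with a small simulation. The sketch below is not the estimator proposed in the paper. It only shows that a naive least-squares fit of one covariate on the other in a case-control sample can drift away from the population slope, and that a standard inverse-probability-weighted fit, which relies on a known disease rate, moves it back. The data-generating model, the parameter values, and all function names are assumptions made for illustration.

```python
import numpy as np


def simulate_case_control(n_pop=200_000, n_cases=5_000, n_controls=5_000,
                          beta=1.0, seed=0):
    """Draw a population, then sample cases and controls from it."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_pop)
    y = beta * x + rng.normal(size=n_pop)  # homoscedastic: Y = beta * X + error
    # Hypothetical logistic disease model in which D depends on both Y and X.
    p = 1.0 / (1.0 + np.exp(-(-3.0 + 1.0 * y + 0.5 * x)))
    d = rng.random(n_pop) < p
    cases = rng.choice(np.flatnonzero(d), size=n_cases, replace=False)
    controls = rng.choice(np.flatnonzero(~d), size=n_controls, replace=False)
    idx = np.concatenate([cases, controls])
    # d.mean() stands in for the "known" population disease rate.
    return x[idx], y[idx], d[idx], d.mean()


def wls_slope(x, y, w):
    """Slope of a weighted least-squares fit of y on (1, x)."""
    design = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
    return coef[1]


def naive_and_ipw_slopes(x, y, d, disease_rate):
    """Return (unweighted slope, inverse-probability-weighted slope)."""
    naive = wls_slope(x, y, np.ones_like(x))
    frac_cases = d.mean()  # share of cases in the case-control sample
    # Reweight cases and controls to their population proportions.
    w = np.where(d, disease_rate / frac_cases,
                 (1.0 - disease_rate) / (1.0 - frac_cases))
    return naive, wls_slope(x, y, w)


if __name__ == "__main__":
    x, y, d, rate = simulate_case_control()
    naive, ipw = naive_and_ipw_slopes(x, y, d, rate)
    print(f"true slope 1.0, naive {naive:.3f}, weighted {ipw:.3f}")
```

Weighting is used here only because it is simple and makes the role of the disease rate explicit. It is generally less efficient than model-based approaches, which is part of the motivation for the kind of estimator the abstract describes.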