Umbach D M, Weinberg C R
National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA.
Stat Med. 1997 Aug 15;16(15):1731-43. doi: 10.1002/(sici)1097-0258(19970815)16:15<1731::aid-sim595>3.0.co;2-s.
Genetic susceptibility and environmental exposures play a synergistic role in the aetiology of many diseases. We consider a case-control study of a rare disease in relation to a categorical exposure and a genetic factor under the assumption that the genotype and the exposure occur independently in the population under study. Using a logistic model for risk, we describe maximum likelihood methods based on log-linear models that explicitly impose the independence assumption, something the usual logistic regression analyses cannot do. The estimator of the genotype-exposure interaction effect depends only on data from cases. Estimators for genotype and for exposure effects depend also on data from controls, but only through their respective marginal totals. All three estimators have smaller variance than they would were independence not enforced. These results have important implications for design: (i) Case-only studies can efficiently estimate gene-by-environment interactions. (ii) Studies where controls are genotyped anonymously can estimate genotype, exposure, and interaction effects as efficiently as designs where genotype and exposure data are linked. This feature addresses a growing concern of human subjects review boards. (iii) Exposure and interaction effects, but not genotype effects, can be estimated from studies where genetic information is only collected from cases (although one can recover the genotype effect if external gene prevalence data exist). Such designs have the compensatory benefit that the response rate (hence, validity) is higher when controls are not subjected to intrusive tissue sampling. However, the independence assumption can be checked only with linked genotype and exposure data for some controls. We illustrate the methods by applying them to a recent study of cleft palate in relation to maternal cigarette smoking and to a variant of the transforming growth factor alpha gene in the child.
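As a rough illustration of implication (i), the sketch below computes a case-only interaction odds ratio for the simplest setting: a binary genotype and a binary exposure. It assumes gene-exposure independence in the source population and a rare disease, and it uses the standard cross-product odds ratio with a Wald interval on the log scale, not the log-linear maximum likelihood machinery the abstract describes. The function name and the counts are hypothetical and are not taken from the cleft palate study.

```python
import math

def case_only_interaction(cases, z=1.959963984540054):
    """Case-only estimate of the multiplicative gene-exposure interaction.

    cases[g][e] is the number of cases with genotype g (0 = non-carrier,
    1 = carrier) and exposure e (0 = unexposed, 1 = exposed).
    Assumes genotype and exposure are independent in the population and
    that the disease is rare. Returns (odds_ratio, (ci_low, ci_high)).
    """
    n00, n01 = cases[0]
    n10, n11 = cases[1]
    # Cross-product odds ratio among cases only; no control data are used.
    log_or = math.log((n11 * n00) / (n10 * n01))
    # Large-sample standard error of the log odds ratio.
    se = math.sqrt(1 / n00 + 1 / n01 + 1 / n10 + 1 / n11)
    ci = (math.exp(log_or - z * se), math.exp(log_or + z * se))
    return math.exp(log_or), ci

# Hypothetical counts, for illustration only.
hypothetical_cases = [[40, 20],   # non-carriers: unexposed, exposed
                      [10, 20]]   # carriers:     unexposed, exposed
or_hat, (ci_low, ci_high) = case_only_interaction(hypothetical_cases)
print(f"interaction OR = {or_hat:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An estimate above 1 would suggest that the joint effect of genotype and exposure exceeds the product of their separate effects. The estimate is only as trustworthy as the independence assumption, which, as the abstract notes, can be checked only with linked genotype and exposure data on some controls.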