Chatterjee Nilanjan, Chen Yi-Hau, Luo Sheng, Carroll Raymond J
Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, DHHS. Rockville MD 20852, U.S.A.
Stat Sci. 2009 Nov 1;24(4):489-502. doi: 10.1214/09-sts297.
Although prospective logistic regression is the standard method of analysis for case-control data, it has recently been noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain substantial power by incorporating population genetics model assumptions such as Hardy-Weinberg equilibrium (HWE), gene-gene independence, and gene-environment independence. In this article, we review these modern methods and contrast them with the more classical approaches through two types of applications: (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights into existing methods by constructing various score tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for the analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation, followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data.
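As a rough illustration of what a "retrospective" likelihood with an HWE constraint can look like for a single typed SNP, the following minimal sketch models the genotype distribution conditional on disease status, P(G = g | D = d), by combining a logistic disease model with HWE genotype frequencies through Bayes' rule. It is not the authors' implementation: the additive coding of the genotype, the function names, and the parameters `q` (minor allele frequency), `alpha` (intercept) and `beta` (log odds ratio per allele) are assumptions made here for illustration only.

```python
import math


def hwe_genotype_probs(q):
    """P(G = 0, 1, 2 minor-allele copies) under HWE with allele frequency q."""
    return [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]


def disease_prob(g, alpha, beta):
    """Assumed logistic model: P(D = 1 | G = g), additive on the log-odds scale."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * g)))


def retrospective_genotype_probs(d, q, alpha, beta):
    """P(G = g | D = d) via Bayes' rule, with the HWE frequencies as P(G = g)."""
    joint = []
    for g, p_g in enumerate(hwe_genotype_probs(q)):
        p_d1 = disease_prob(g, alpha, beta)
        joint.append(p_g * (p_d1 if d == 1 else 1.0 - p_d1))
    total = sum(joint)
    return [j / total for j in joint]


def retrospective_loglik(case_counts, control_counts, q, alpha, beta):
    """Log-likelihood of genotype counts [n0, n1, n2] in cases and controls."""
    p_case = retrospective_genotype_probs(1, q, alpha, beta)
    p_ctrl = retrospective_genotype_probs(0, q, alpha, beta)
    ll = sum(n * math.log(p) for n, p in zip(case_counts, p_case))
    ll += sum(n * math.log(p) for n, p in zip(control_counts, p_ctrl))
    return ll
```

The function only evaluates the log-likelihood at given parameter values; maximizing it, or deriving a score test from it, would need a separate optimizer or analytic derivatives, which are omitted here. Setting `beta = 0` makes the case and control genotype distributions both equal to the HWE frequencies, which corresponds to the null hypothesis of no association.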