Shi Min, Umbach David M, Weinberg Clarice R
Biostatistics Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, NC 27709, USA.
Am J Hum Genet. 2007 Jul;81(1):53-66. doi: 10.1086/518670. Epub 2007 May 15.
Family-based association studies offer robustness to population stratification and can provide insight into maternally mediated and parent-of-origin effects. Usually, such studies investigate multiple markers covering a gene or chromosomal region of interest. We propose a simple and general method to test the association of a disease trait with multiple, possibly linked SNP markers and, subsequently, to nominate a set of "risk-haplotype-tagging alleles." Our test, the max_Z² test, uses only the genotypes of affected individuals and their parents without requiring the user to either know or assign haplotypes and their phases. It also accommodates sporadically missing SNP data. In the spirit of the pedigree disequilibrium test, our procedure requires only a vector of differences with expected value 0 under the null hypothesis. To enhance power against a range of alternatives when genotype data are complete, we also consider a method for combining multiple tests; here, we combine max_Z² and Hotelling's T². To facilitate discovery of risk-related haplotypes, we develop a simple procedure for nominating risk-haplotype-tagging alleles. Our procedures can also be used to study maternally mediated genetic effects and to explore imprinting. We compare the statistical power of several competing testing procedures through simulation studies of case-parents triads, whose diplotypes are simulated on the basis of draws from the HapMap-based known haplotypes of four genes. In our simulations, the max_Z² test and the max_TDT (transmission/disequilibrium test) proposed by McIntyre et al. perform almost identically, but max_Z², unlike max_TDT, extends directly to the investigation of maternal effects. As an illustration, we reanalyze data from a previously reported orofacial cleft study, now investigating both fetal and maternal effects of the IRF6 gene.
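The idea of a per-family difference vector with expected value 0 under the null can be illustrated with a minimal Python sketch. The abstract does not give the exact construction, so everything here is an assumption for illustration: the difference is taken as the child's allele count minus the mid-parent count at each SNP (which has mean 0 under Mendelian transmission), each SNP's squared Z statistic is formed from those differences, the maximum over SNPs is the test statistic, and significance is assessed by randomly flipping the sign of each family's whole vector. The function names `difference_vectors`, `max_z2`, and `permutation_p_value` are hypothetical and are not taken from the paper.

```python
import math
import random


def difference_vectors(triads):
    """Build one difference vector per case-parents triad.

    triads: list of (mother, father, child); each member is a list of
    allele counts (0, 1, or 2) per SNP, with None for a missing genotype.
    Assumed difference: child count minus mid-parent count, which has
    expected value 0 when there is no association.
    """
    diffs = []
    for mother, father, child in triads:
        row = []
        for m, f, c in zip(mother, father, child):
            if m is None or f is None or c is None:
                row.append(None)  # sporadically missing SNP: skipped later
            else:
                row.append(c - (m + f) / 2.0)
        diffs.append(row)
    return diffs


def max_z2(diffs):
    """Largest squared Z statistic across SNPs.

    For SNP j, Z_j^2 = (sum of d_j)^2 / (sum of d_j^2), using only the
    families with an observed difference at that SNP.
    """
    best = 0.0
    for j in range(len(diffs[0])):
        col = [row[j] for row in diffs if row[j] is not None]
        total = sum(col)
        sum_sq = sum(d * d for d in col)
        if sum_sq > 0:
            best = max(best, total * total / sum_sq)
    return best


def permutation_p_value(diffs, n_perm=1000, seed=0):
    """Sign-flip permutation p-value (an assumed null-generation scheme).

    Flipping a family's entire vector keeps the correlation among linked
    SNPs intact while enforcing mean 0.
    """
    rng = random.Random(seed)
    observed = max_z2(diffs)
    hits = 0
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            sign = rng.choice((-1, 1))
            flipped.append([None if d is None else sign * d for d in row])
        if max_z2(flipped) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy data: four triads, two SNPs; both parents heterozygous at SNP 1.
    toy = [([1, 0], [1, 0], [2, 0])] * 4
    stat, p = permutation_p_value(difference_vectors(toy), n_perm=200)
    print("max Z^2 =", stat, "p =", p)
```

Because the maximum is taken over SNPs and the null distribution is generated by resampling whole families, no haplotype phase is needed in this sketch, and a missing genotype only removes that family from the affected SNP's sum.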