Wang Jian, Shete Sanjay
Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA.
Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA.
Methods Mol Biol. 2017;1666:83-115. doi: 10.1007/978-1-4939-7274-6_6.
The Hardy-Weinberg principle, one of the most important principles in population genetics, was originally developed to study how allele frequencies change in a population over generations. It is now widely used in studies of human disease to detect inbreeding, population stratification, and genotyping errors. The most popular approaches for assessing deviation from Hardy-Weinberg proportions are the asymptotic Pearson's chi-squared goodness-of-fit test and the exact test. Pearson's chi-squared test is simple and straightforward, but its asymptotic approximation can be unreliable when the sample size is small or an allele is rare; the exact test is preferable in these situations. The exact test can be performed either by complete enumeration of the possible heterozygote counts or by a Markov chain Monte Carlo procedure. In this chapter, we describe the Hardy-Weinberg principle, the commonly used tests of Hardy-Weinberg proportions, and their applications. Using numerical examples, we demonstrate step by step how to perform the chi-squared test and the exact test with SAS, R, and PLINK, software programs that are widely used in genetic association studies. We also discuss approaches for testing Hardy-Weinberg proportions in case-control study designs that are better than the traditional approach of testing in controls only. Finally, we note that deviation from Hardy-Weinberg proportions in affected individuals can provide evidence of an association between genetic variants and disease.
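As a rough illustration of the two tests named in the abstract, the following Python sketch computes Pearson's chi-squared test (1 degree of freedom) and an exact test by complete enumeration of heterozygote counts for a single biallelic marker. It is not the chapter's SAS, R, or PLINK code. The function names `hwe_chi_squared` and `hwe_exact` are hypothetical, and the two-sided exact p-value is assumed to be the sum of the probabilities of all heterozygote counts that are no more likely than the observed one, which is one common convention.

```python
import math


def hwe_chi_squared(n_aa, n_ab, n_bb):
    """Pearson chi-squared goodness-of-fit test of HW proportions (1 df).

    Hypothetical helper; returns (chi2 statistic, asymptotic p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic marker: test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-squared variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def hwe_exact(n_aa, n_ab, n_bb):
    """Exact test of HW proportions by complete enumeration.

    Conditions on the sample size and the allele counts, then sums the
    probabilities of every heterozygote count that is no more likely
    than the observed one (assumed two-sided convention).
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n - n_a      # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * math.log(2.0)
                + math.lgamma(n + 1) - math.lgamma(hom_a + 1)
                - math.lgamma(het + 1) - math.lgamma(hom_b + 1)
                + math.lgamma(n_a + 1) + math.lgamma(n_b + 1)
                - math.lgamma(2 * n + 1))

    # Valid heterozygote counts share the parity of the allele counts.
    het_counts = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {het: math.exp(log_prob(het)) for het in het_counts}
    total = sum(probs.values())  # renormalize against rounding error
    p_obs = probs[n_ab]
    tail = sum(pr for pr in probs.values() if pr <= p_obs * (1 + 1e-7))
    return min(1.0, tail / total)


# Illustrative genotype counts (made up for this sketch, not from the chapter).
print(hwe_chi_squared(10, 10, 10))
print(hwe_exact(10, 10, 10))
```

Complete enumeration is feasible here because, given the allele counts, the number of heterozygotes is the only free quantity for a biallelic marker. With more alleles the number of configurations grows quickly, which is where a Markov chain Monte Carlo procedure becomes useful.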