Department of Genetics, Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, USA.
Genet Epidemiol. 2010 Dec;34(8):816-34. doi: 10.1002/gepi.20533.
Genome-wide association studies (GWAS) can identify common alleles that contribute to complex disease susceptibility. Despite the large number of SNPs assessed in each study, the effects of most common SNPs must be evaluated indirectly, using either genotyped markers or haplotypes thereof as proxies. We have previously implemented a computationally efficient Markov chain framework for genotype imputation and haplotyping in the freely available MaCH software package. The approach describes sampled chromosomes as mosaics of each other and uses available genotype and shotgun sequence data to estimate unobserved genotypes and haplotypes, together with useful measures of the quality of these estimates. Our approach is already widely used to facilitate comparison of results across studies as well as meta-analyses of GWAS. Here, we use simulations and experimental genotypes to evaluate its accuracy and utility, considering choices of genotyping panels, reference panel configurations, and designs where genotyping is replaced with shotgun sequencing. Importantly, we show that genotype imputation not only facilitates cross-study analyses but also increases the power of genetic association studies. We show that genotype imputation of common variants using HapMap haplotypes as a reference is very accurate, whether it uses genome-wide SNP data or the smaller amounts of data typical of fine-mapping studies. Furthermore, we show that the approach is applicable in a variety of populations. Finally, we illustrate how association analyses of unobserved variants will benefit from ongoing advances such as larger HapMap reference panels and whole-genome shotgun sequencing technologies.
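To make the "mosaic" idea concrete, the following is a minimal sketch of a hidden Markov model in which one target haplotype copies from a small panel of reference haplotypes, and the forward-backward algorithm yields posterior allele probabilities at untyped markers. It is not the MaCH implementation: it assumes phased, haploid input, and the function name `impute_haploid` and the fixed parameters `theta` (per-interval switch probability) and `eps` (mismatch probability) are illustrative assumptions. The package itself works from genotype data and also estimates haplotypes, which this sketch does not attempt.

```python
def impute_haploid(ref_haps, target, theta=0.01, eps=0.01):
    """Return P(allele == 1) at each marker for one target haplotype.

    ref_haps: list of reference haplotypes, each a list of 0/1 alleles.
    target:   list of 0/1 alleles, with None where the marker is untyped.
    theta:    assumed probability of switching the copied haplotype
              between adjacent markers (hypothetical fixed value).
    eps:      assumed probability that the observed allele differs from
              the copied reference allele (hypothetical fixed value).
    """
    n_ref = len(ref_haps)
    n_markers = len(target)

    def emit(k, m):
        # Untyped markers carry no information about the hidden state.
        if target[m] is None:
            return 1.0
        return 1.0 - eps if ref_haps[k][m] == target[m] else eps

    def normalize(row):
        total = sum(row)
        return [x / total for x in row]

    # Forward pass: fwd[m][k] is proportional to
    # P(observations up to m, copying haplotype k at marker m).
    fwd = [normalize([emit(k, 0) / n_ref for k in range(n_ref)])]
    for m in range(1, n_markers):
        prev = fwd[-1]
        prev_total = sum(prev)
        row = [
            ((1.0 - theta) * prev[k] + theta * prev_total / n_ref) * emit(k, m)
            for k in range(n_ref)
        ]
        fwd.append(normalize(row))

    # Backward pass: bwd[m][k] is proportional to
    # P(observations after m | copying haplotype k at marker m).
    bwd = [[1.0] * n_ref for _ in range(n_markers)]
    for m in range(n_markers - 2, -1, -1):
        weighted = [bwd[m + 1][k] * emit(k, m + 1) for k in range(n_ref)]
        w_total = sum(weighted)
        row = [
            (1.0 - theta) * weighted[k] + theta * w_total / n_ref
            for k in range(n_ref)
        ]
        bwd[m] = normalize(row)

    # Posterior over copied haplotypes, collapsed to an allele probability.
    probs = []
    for m in range(n_markers):
        post = [fwd[m][k] * bwd[m][k] for k in range(n_ref)]
        z = sum(post)
        p_one = sum(p for p, hap in zip(post, ref_haps) if hap[m] == 1) / z
        probs.append(p_one)
    return probs


if __name__ == "__main__":
    reference = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [0, 0, 1, 1, 1]]
    observed = [1, 1, None, 1, 1]  # marker 2 is untyped
    print(impute_haploid(reference, observed))
```

Because each row is rescaled during the forward and backward passes, the posterior at a marker depends only on the relative weights of the reference haplotypes, which keeps the arithmetic stable over long marker runs. The values returned at untyped markers play the role of imputed allele probabilities in this simplified setting.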