Porcu Eleonora, Sanna Serena, Fuchsberger Christian, Fritsche Lars G
Department of Biostatistics, Center for Statistical Genetics, University of Michigan School of Public Health, Ann Arbor, Michigan, USA.
Curr Protoc Hum Genet. 2013 Jul;Chapter 1:Unit 1.25. doi: 10.1002/0471142905.hg0125s78.
Imputation is an in silico method that can increase the power of association studies by inferring missing genotypes, harmonizing data sets for meta-analyses, and increasing the overall number of markers available for association testing. This unit provides an introductory overview of the imputation method and describes a two-step imputation approach that consists of phasing the study genotypes and then imputing reference panel genotypes into the resulting study haplotypes. Detailed steps for data preparation and quality control illustrate how to run the computationally intensive two-step imputation with the high-density reference panels of the 1000 Genomes Project, which currently integrate more than 39 million variants. Additionally, the influence of reference panel selection, input marker density, and imputation settings on imputation quality is demonstrated with a simulated data set to give insight into crucial points of successful genotype imputation.
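The second step of the two-step approach, imputing reference panel genotypes into already phased study haplotypes, can be illustrated with a toy sketch. This is not the unit's protocol: it assumes alleles coded as 0/1, a study haplotype with `None` at untyped markers, and a deliberately simplified rule that copies the untyped alleles from the single closest reference haplotype. Production imputation software uses probabilistic haplotype models instead of a single best match. The names `mismatches` and `impute_haplotype` are hypothetical and chosen only for this illustration.

```python
from typing import List, Optional, Sequence

Allele = int  # assumption: 0 = reference allele, 1 = alternate allele


def mismatches(study: Sequence[Optional[Allele]], ref: Sequence[Allele]) -> int:
    """Count disagreements at the markers that are typed in the study haplotype."""
    return sum(1 for s, r in zip(study, ref) if s is not None and s != r)


def impute_haplotype(
    study: Sequence[Optional[Allele]],
    reference_panel: Sequence[Sequence[Allele]],
) -> List[Allele]:
    """Fill untyped markers (None) from the closest reference haplotype.

    Simplification: a single best-matching haplotype is copied, whereas
    real imputation tools average over many haplotypes probabilistically.
    """
    if not reference_panel:
        raise ValueError("reference panel must contain at least one haplotype")
    if any(len(ref) != len(study) for ref in reference_panel):
        raise ValueError("all haplotypes must cover the same markers")

    # Pick the reference haplotype with the fewest mismatches at typed markers.
    best = min(reference_panel, key=lambda ref: mismatches(study, ref))

    # Keep observed alleles; copy the reference allele where the study is untyped.
    return [r if s is None else s for s, r in zip(study, best)]


# Toy reference panel: three haplotypes over five markers (made-up values).
panel = [
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1],
]

# Phased study haplotype typed at markers 0, 2 and 4 only.
study = [1, None, 0, None, 0]

print(impute_haplotype(study, panel))  # [1, 1, 0, 1, 0]
```

In this toy example the second panel haplotype agrees with the study haplotype at every typed marker, so its alleles fill the two untyped positions. The sketch also hints at why reference panel choice and input marker density matter: with fewer typed markers or a poorly matched panel, the closest haplotype is less well determined.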