Genotype Imputation in Genome-Wide Association Studies.

Author information

Naj Adam C

Affiliations

Department of Biostatistics, Epidemiology, and Informatics and Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.

Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania.

Publication information

Curr Protoc Hum Genet. 2019 Jun;102(1):e84. doi: 10.1002/cphg.84.

DOI:10.1002/cphg.84

PMID:31216114

Abstract

Genotype imputation infers missing genotypes in silico using haplotype information from reference samples with genotypes from denser genotyping arrays or sequencing. This approach can confer a number of improvements on genome-wide association studies: it can improve statistical power to detect associations by reducing the number of missing genotypes; it can simplify data harmonization for meta-analyses by improving overlap of genomic variants between differently-genotyped sample sets; and it can increase the overall number and density of genomic variants available for association testing. This article reviews the general concepts behind imputation, describes imputation approaches and methods for various types of genotype data, including family-based data, and identifies web-based resources that can be used in different steps of the imputation process. For practical application, it provides a step-by-step guide to implementation of a two-step imputation process consisting of phasing of the study genotypes and the imputation of reference panel genotypes into the study haplotypes. In addition, this review describes recently developed haplotype reference panel resources and online imputation servers that are capable of remotely and securely implementing an imputation workflow on uploaded genotype array data. © 2019 by John Wiley & Sons, Inc.
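As a rough illustration of the core idea in the abstract (filling untyped sites in an already phased study haplotype from a denser reference panel), the toy sketch below copies missing alleles from the reference haplotype that best matches the observed sites. It is a simplification under stated assumptions, not the method described in the article: real imputation software uses probabilistic models rather than a single nearest match, and the names `count_mismatches`, `impute_haplotype`, `reference_panel`, and `study_haplotype`, as well as the 0/1 allele coding, are hypothetical choices made only for this example.

```python
from typing import List, Optional, Sequence

# 0/1 allele codes; None marks a site that was not genotyped in the study.
Haplotype = Sequence[Optional[int]]


def count_mismatches(study: Haplotype, reference: Sequence[int]) -> int:
    """Count observed study alleles that differ from the reference haplotype."""
    return sum(
        1
        for s, r in zip(study, reference)
        if s is not None and s != r
    )


def impute_haplotype(study: Haplotype, panel: List[Sequence[int]]) -> List[int]:
    """Fill untyped sites by copying alleles from the closest reference haplotype."""
    if not panel:
        raise ValueError("reference panel is empty")
    if any(len(ref) != len(study) for ref in panel):
        raise ValueError("all haplotypes must cover the same sites")
    # min() keeps the first best match, so ties resolve by panel order.
    best = min(panel, key=lambda ref: count_mismatches(study, ref))
    # Observed alleles are kept; only missing sites take the reference allele.
    return [r if s is None else s for s, r in zip(study, best)]


# Hypothetical reference panel: three haplotypes typed at six sites.
reference_panel = [
    [0, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 1, 1],
]

# Hypothetical phased study haplotype typed at sites 0, 2, and 4 only.
study_haplotype = [1, None, 0, None, 1, None]

imputed = impute_haplotype(study_haplotype, reference_panel)
print(imputed)  # [1, 0, 0, 1, 1, 0]
```

Here the second reference haplotype agrees with all three observed alleles, so its alleles are copied into the three untyped sites. The sketch covers only the second of the two steps named in the abstract; it assumes phasing has already produced the study haplotype.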
