Lee James J, Vattikuti Shashaank, Chow Carson C
Department of Psychology, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA.
Mathematical Biology Section, NIDDK/LBM, National Institutes of Health, Bethesda, MD 20892, USA.
Comput Struct Biotechnol J. 2015 Nov 23;14:28-34. doi: 10.1016/j.csbj.2015.10.002. eCollection 2016.
The aim of a genome-wide association study (GWAS) is to identify loci in the human genome affecting a phenotype of interest. This review summarizes some recent work on conceptual and methodological aspects of GWAS. The average effect of gene substitution at a given causal site in the genome is the key estimand in GWAS, and we argue for its fundamental importance. Implicit in the definition of average effect is a linear model relating genotype to phenotype. The fraction of the phenotypic variance ascribable to polymorphic sites with nonzero average effects in this linear model is called the heritability, and we describe methods for estimating this quantity from GWAS data. Finally, we show that the theory of compressed sensing can be used to provide a sharp estimate of the sample size required to identify essentially all sites contributing to the heritability of a given phenotype.
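As an illustration of the linear model and the definition of heritability described above, the following is a minimal simulation sketch and not the authors' method. It assumes independent sites (no linkage disequilibrium), additive effects, and normally distributed residuals. The function names (simulate_gwas, average_effect, heritability) and all parameter values are hypothetical choices made for this example. Under these assumptions, the slope from regressing phenotype on allele count at a single site approximates that site's average effect, and heritability is the share of phenotypic variance explained by the genetic term.

```python
import numpy as np


def simulate_gwas(n=2000, p=200, s=10, h2=0.5, seed=0):
    """Simulate genotypes and a phenotype under a sparse additive linear model.

    Assumptions (illustrative only): sites are independent, s of the p sites
    have nonzero effects, and residuals are Gaussian.
    """
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.1, 0.9, size=p)            # allele frequency per site
    X = rng.binomial(2, freqs, size=(n, p)).astype(float)  # allele counts 0/1/2
    beta = np.zeros(p)
    causal = rng.choice(p, size=s, replace=False)    # indices of causal sites
    beta[causal] = rng.normal(0.0, 1.0, size=s)      # nonzero average effects
    g = X @ beta                                     # genetic component
    # Residual variance chosen so that var(g) / (var(g) + var(e)) equals h2.
    var_e = g.var() * (1.0 - h2) / h2
    y = g + rng.normal(0.0, np.sqrt(var_e), size=n)
    return X, beta, y


def average_effect(x, y):
    """Least-squares slope of phenotype y on allele count x at one site."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def heritability(X, beta, y):
    """Fraction of phenotypic variance attributable to the genetic component."""
    return float((X @ beta).var() / y.var())


if __name__ == "__main__":
    X, beta, y = simulate_gwas()
    j = int(np.argmax(np.abs(beta)))                 # strongest causal site
    print("true effect:     ", beta[j])
    print("estimated effect:", average_effect(X[:, j], y))
    print("heritability:    ", heritability(X, beta, y))
```

In real data the true effects are unknown and sites are correlated, so the heritability function here only checks the simulation against its own ground truth; it does not stand in for the estimation methods the review describes.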