Department of Epidemiology and Biostatistics, Xuzhou Medical University, Xuzhou, Jiangsu, China.
Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
Bioinformatics. 2018 Aug 15;34(16):2797-2807. doi: 10.1093/bioinformatics/bty204.
Genome-wide association studies (GWASs) have identified many genetic loci associated with complex traits. A substantial fraction of these loci are associated with multiple traits, a phenomenon known as pleiotropy. Identifying pleiotropic associations can help characterize the genetic relationships among complex traits and can improve our understanding of disease etiology. Effective pleiotropic association mapping requires statistical methods that jointly model multiple traits together with genome-wide single nucleotide polymorphisms (SNPs).
We develop a joint modeling method, which we refer to as the integrative MApping of Pleiotropic association (iMAP). iMAP models summary statistics from GWASs, uses a multivariate Gaussian distribution to account for phenotypic correlation, simultaneously infers genome-wide SNP association patterns using mixture modeling, and has the potential to reveal causal relationships between traits. Importantly, iMAP integrates a large number of SNP functional annotations to substantially improve association mapping power and, with a sparsity-inducing penalty, is capable of selecting informative annotations from a large, potentially non-informative set. To scale iMAP inference to association studies with hundreds of thousands of individuals and millions of SNPs, we develop an efficient expectation-maximization algorithm based on an approximate penalized regression algorithm. With simulations and comparisons to existing methods, we illustrate the benefits of iMAP in terms of both high association mapping power and accurate estimation of genome-wide SNP association patterns. Finally, we apply iMAP to perform a joint analysis of 48 traits from 31 GWAS consortia together with 40 tissue-specific SNP annotations generated from the Roadmap Project.
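The mixture-modeling idea described above can be illustrated with a minimal sketch. This is not the iMAP implementation: it omits the functional annotations, the sparsity-inducing penalty, and the approximate penalized regression step. It assumes two traits, per-SNP z-scores as the summary statistics, a known phenotypic correlation `rho`, and a fixed extra variance `effect_var` for associated traits; all function and parameter names are hypothetical. Each SNP falls into one of four association patterns (neither trait, trait 1 only, trait 2 only, both), and a plain EM loop estimates the genome-wide proportion of each pattern.

```python
import numpy as np


def component_covariances(rho, effect_var):
    """Covariances of the four association patterns (assumed model, not iMAP's).

    Order: neither trait, trait 1 only, trait 2 only, both traits.
    """
    base = np.array([[1.0, rho], [rho, 1.0]])  # null: phenotypic correlation only
    extras = [
        np.diag([0.0, 0.0]),
        np.diag([effect_var, 0.0]),
        np.diag([0.0, effect_var]),
        np.diag([effect_var, effect_var]),
    ]
    return [base + extra for extra in extras]


def bivariate_normal_pdf(z, cov):
    """Density of N(0, cov) evaluated at each row of z (shape n x 2)."""
    inv = np.linalg.inv(cov)
    det = np.linalg.det(cov)
    quad = np.einsum("ni,ij,nj->n", z, inv, z)
    return np.exp(-0.5 * quad) / (2.0 * np.pi * np.sqrt(det))


def em_mixing_proportions(z, rho=0.0, effect_var=9.0, n_iter=200):
    """Estimate the proportions of the four patterns by EM, with covariances held fixed."""
    covs = component_covariances(rho, effect_var)
    dens = np.column_stack([bivariate_normal_pdf(z, cov) for cov in covs])
    pi = np.full(4, 0.25)
    for _ in range(n_iter):
        # E-step: posterior probability of each pattern for each SNP.
        weighted = dens * pi
        post = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: update the mixing proportions.
        pi = post.mean(axis=0)
    # Posteriors under the final proportions.
    weighted = dens * pi
    post = weighted / weighted.sum(axis=1, keepdims=True)
    return pi, post


def simulate_z(n, pi, rho, effect_var, seed=0):
    """Draw z-scores from the assumed four-component mixture (for illustration only)."""
    rng = np.random.default_rng(seed)
    covs = component_covariances(rho, effect_var)
    labels = rng.choice(4, size=n, p=pi)
    z = np.empty((n, 2))
    for k, cov in enumerate(covs):
        idx = labels == k
        z[idx] = rng.multivariate_normal(np.zeros(2), cov, size=int(idx.sum()))
    return z, labels
```

For example, `z, _ = simulate_z(20000, [0.85, 0.05, 0.05, 0.05], rho=0.2, effect_var=9.0)` followed by `pi_hat, post = em_mixing_proportions(z, rho=0.2, effect_var=9.0)` gives estimated pattern proportions in `pi_hat` and per-SNP posterior probabilities in `post`. Holding the covariances fixed keeps the sketch short; a fuller model would estimate them as well.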
iMAP is freely available at http://www.xzlab.org/software.html.
Supplementary data are available at Bioinformatics online.