Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education, Huazhong Agricultural University, Wuhan, Hubei, China.
Department of Biostatistics, University of Michigan, Ann Arbor, MI, United States of America.
PLoS Genet. 2018 Jan 29;14(1):e1007186. doi: 10.1371/journal.pgen.1007186. eCollection 2018 Jan.
Genome-wide association studies (GWASs) have identified many disease-associated loci, the majority of which have unknown biological functions. Understanding the mechanism underlying trait associations requires identifying trait-relevant tissues and investigating associations in a trait-specific fashion. Here, we extend the widely used linear mixed model to incorporate multiple SNP functional annotations from omics studies with GWAS summary statistics, both to facilitate the identification of trait-relevant tissues and to use those tissues to construct powerful association tests. Specifically, we rely on a generalized estimating equation-based algorithm for parameter inference, a mixture modeling framework for trait-tissue relevance classification, and a weighted sequence kernel association test, constructed from the identified trait-relevant tissues, for powerful association analysis. We refer to our analytic procedure as the Scalable Multiple Annotation integration for trait-Relevant Tissue identification and usage (SMART). With extensive simulations, we show how our method can make use of multiple complementary annotations to improve the accuracy of identifying trait-relevant tissues. In addition, our procedure allows us to make use of the inferred trait-relevant tissues, for the first time, to construct more powerful SNP set tests. We apply our method to an in-depth analysis of 43 traits from 28 GWASs using tissue-specific annotations in 105 tissues derived from ENCODE and Roadmap. Our results reveal new trait-tissue relevance, pinpoint important annotations that are informative of the trait-tissue relationship, and illustrate how we can use the inferred trait-relevant tissues to construct more powerful association tests in the Wellcome Trust Case Control Consortium study.
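To make the idea of an annotation-weighted SNP set test concrete, the following is a minimal sketch of a weighted score statistic of the sequence kernel association test type, Q = Σ_j w_j (Σ_i g_ij (y_i − ȳ))². It is not the SMART implementation: the function names `annotation_weights` and `weighted_skat_statistic`, the log-linear form of the weights, and the toy data are all assumptions made for illustration, and the sketch works on individual-level genotypes rather than summary statistics. It also omits the p-value computation.

```python
import math


def annotation_weights(annotations, coefficients):
    """Hypothetical weighting: w_j = exp(sum_k a_jk * c_k).

    `annotations` is a list of per-SNP annotation vectors and
    `coefficients` is one value per annotation. The log-linear form
    is an illustrative assumption, not the weighting used in the paper.
    """
    return [
        math.exp(sum(a * c for a, c in zip(snp_annotations, coefficients)))
        for snp_annotations in annotations
    ]


def weighted_skat_statistic(genotypes, phenotype, weights):
    """Weighted score statistic Q = sum_j w_j * S_j**2.

    S_j = sum_i g_ij * (y_i - mean(y)) is the score of SNP j under a
    null model with an intercept only. `genotypes` is a list of
    per-individual rows, one column per SNP.
    """
    n = len(phenotype)
    y_mean = sum(phenotype) / n
    residuals = [y - y_mean for y in phenotype]
    q = 0.0
    for j, w in enumerate(weights):
        score = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += w * score ** 2
    return q


if __name__ == "__main__":
    # Toy data: 4 individuals, 2 SNPs, 2 annotations per SNP (all made up).
    genotypes = [[0, 1], [1, 0], [2, 0], [1, 2]]
    phenotype = [1.0, 2.0, 3.0, 4.0]
    annotations = [[1, 0], [0, 1]]
    weights = annotation_weights(annotations, [0.5, -0.5])
    print(weighted_skat_statistic(genotypes, phenotype, weights))
```

In this sketch, SNPs whose annotations are favored by the coefficients contribute more to Q, which is the intuition behind building the test from trait-relevant tissue annotations.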