Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA.
Nat Genet. 2019 Mar;51(3):568-576. doi: 10.1038/s41588-019-0345-7. Epub 2019 Feb 25.
Transcriptome-wide association analysis is a powerful approach to studying the genetic architecture of complex traits. A key component of this approach is building a model that imputes gene expression levels from genotypes, using samples with matched genotype and gene expression data in a given tissue. However, it is challenging to develop robust and accurate imputation models with the limited sample size available for any single tissue. Here, we first introduce a multi-task learning method to jointly impute gene expression in 44 human tissues. Compared with single-tissue methods, our approach achieved an average improvement of 39% in imputation accuracy and generated effective imputation models for an average of 120% more genes. We then describe a summary-statistic-based testing framework that combines multiple single-tissue associations into a powerful metric to quantify the overall gene-trait association. We applied our method, called UTMOST (unified test for molecular signatures), to multiple genome-wide association results and demonstrated its advantages over single-tissue strategies.
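The summary-statistic-based idea described above can be illustrated with a minimal sketch. It assumes standardized genotypes, per-tissue imputation weights that are already estimated, and an LD (SNP correlation) matrix from a reference panel. The function names `gene_level_z` and `joint_statistic` and the toy inputs are hypothetical and are not taken from the UTMOST software. The cross-tissue combination shown here is a simple quadratic-form (Wald-type) chi-square statistic, used as a stand-in; it is not necessarily the combination statistic used by UTMOST.

```python
import numpy as np


def gene_level_z(weights, gwas_z, ld):
    """Single-tissue gene-trait z-scores from GWAS summary statistics.

    weights: (n_tissues, n_snps) imputation weights, one row per tissue
             (assumed to be on the standardized-genotype scale).
    gwas_z:  (n_snps,) SNP-trait z-scores from a GWAS.
    ld:      (n_snps, n_snps) SNP correlation matrix from a reference panel.
    Returns the per-tissue z-scores and their correlation under the null.
    """
    weights = np.asarray(weights, dtype=float)
    gwas_z = np.asarray(gwas_z, dtype=float)
    ld = np.asarray(ld, dtype=float)

    # Covariance of imputed expression across tissues: W R W^T.
    cov = weights @ ld @ weights.T
    # Assumes every tissue has non-zero predicted expression variance.
    sd = np.sqrt(np.diag(cov))

    z = (weights @ gwas_z) / sd       # one z-score per tissue
    corr = cov / np.outer(sd, sd)     # null correlation between tissue z-scores
    return z, corr


def joint_statistic(z, corr):
    """Quadratic-form statistic z^T corr^+ z and its degrees of freedom.

    Under the null the statistic is approximately chi-square distributed,
    with degrees of freedom equal to the rank of the correlation matrix.
    The pseudo-inverse guards against collinear tissues.
    """
    stat = float(z @ np.linalg.pinv(corr) @ z)
    df = int(np.linalg.matrix_rank(corr))
    return stat, df


# Toy inputs (made up for illustration only): 2 tissues, 3 SNPs.
toy_weights = [[0.5, 0.0, 0.2],
               [0.1, 0.4, 0.0]]
toy_gwas_z = [2.1, -0.3, 1.7]
toy_ld = [[1.0, 0.2, 0.0],
          [0.2, 1.0, 0.1],
          [0.0, 0.1, 1.0]]

toy_z, toy_corr = gene_level_z(toy_weights, toy_gwas_z, toy_ld)
toy_stat, toy_df = joint_statistic(toy_z, toy_corr)
print(toy_z, toy_stat, toy_df)
```

The sketch only needs GWAS z-scores, weights, and LD, which is what makes a summary-statistic framework usable without individual-level data. Converting the statistic to a p-value would require a chi-square survival function from a statistics library, which is left out here.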