Hore Victoria, Viñuela Ana, Buil Alfonso, Knight Julian, McCarthy Mark I, Small Kerrin, Marchini Jonathan
Department of Statistics, University of Oxford, Oxford, UK.
Department of Twin Research and Genetic Epidemiology, King's College London, London, UK.
Nat Genet. 2016 Sep;48(9):1094-100. doi: 10.1038/ng.3624. Epub 2016 Aug 1.
Genome-wide association studies of gene expression traits and other cellular phenotypes have successfully identified links between genetic variation and biological processes. The majority of discoveries have uncovered cis-expression quantitative trait locus (eQTL) effects via mass univariate testing of SNPs against gene expression in single tissues. Here we present a Bayesian method for multiple-tissue experiments focusing on uncovering gene networks linked to genetic variation. Our method decomposes the 3D array (or tensor) of gene expression measurements into a set of latent components. We identify sparse gene networks that can then be tested for association against genetic variation across the genome. We apply our method to a data set of 845 individuals from the TwinsUK cohort with gene expression measured via RNA-seq analysis in adipose, lymphoblastoid cell lines (LCLs) and skin. We uncover several gene networks with a genetic basis and clear biological and statistical significance. Extensions of this approach will allow integration of different omics, environmental and phenotypic data sets.
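The two steps the abstract describes, decomposing the individuals x tissues x genes array into latent components and then testing those components against genetic variation, can be illustrated with a minimal sketch. This is not the authors' method: the paper's model is Bayesian and produces sparse gene loadings, whereas the code below swaps in a plain CP (PARAFAC) decomposition fitted by alternating least squares, with no priors and no sparsity. The function names (`cp_als`, `component_genotype_correlation`) and the use of a per-component Pearson correlation with a single SNP's genotype dosage are assumptions made for illustration only.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product: (I x R) and (J x R) -> (I*J x R)."""
    r = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, r)


def unfold(x, mode):
    """Matricize a 3D array so that rows index the chosen mode."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def cp_als(x, rank, n_iter=300, seed=0):
    """Plain (non-Bayesian, non-sparse) CP decomposition of a 3D array.

    Returns three factor matrices: individual scores, tissue scores and
    gene loadings, each with `rank` columns (one per latent component).
    """
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Hold the other two factor matrices fixed and solve the
            # least-squares problem for the current mode.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(x, mode) @ kr @ np.linalg.pinv(gram)
    return factors


def component_genotype_correlation(scores, genotype):
    """Pearson correlation of each component's individual scores with a SNP."""
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    g = (genotype - genotype.mean()) / genotype.std()
    return z.T @ g / len(g)
```

In this sketch, a call such as `individuals, tissues, genes = cp_als(expression, rank=2)` would be followed by `component_genotype_correlation(individuals, dosage)` for each SNP of interest, where `expression` and `dosage` are hypothetical arrays of shape (individuals, tissues, genes) and (individuals,). A genome-wide scan would additionally need a proper test statistic and multiple-testing control, which are omitted here.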