Department of Public Health Sciences, The University of Chicago, Chicago, Illinois 60637, USA.
Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA.
Genome Res. 2017 Nov;27(11):1859-1871. doi: 10.1101/gr.216754.116. Epub 2017 Oct 11.
The impact of inherited genetic variation on gene expression in humans is well-established. The majority of known expression quantitative trait loci (eQTLs) impact expression of local genes (cis-eQTLs). More research is needed to identify effects of genetic variation on distant genes (trans-eQTLs) and understand their biological mechanisms. One common trans-eQTLs mechanism is "mediation" by a local (cis) transcript. Thus, mediation analysis can be applied to genome-wide SNP and expression data in order to identify transcripts that are "cis-mediators" of trans-eQTLs, including those "cis-hubs" involved in regulation of many trans-genes. Identifying such mediators helps us understand regulatory networks and suggests biological mechanisms underlying trans-eQTLs, both of which are relevant for understanding susceptibility to complex diseases. The multitissue expression data from the Genotype-Tissue Expression (GTEx) program provides a unique opportunity to study cis-mediation across human tissue types. However, the presence of complex hidden confounding effects in biological systems can make mediation analyses challenging and prone to confounding bias, particularly when conducted among diverse samples. To address this problem, we propose a new method: Genomic Mediation analysis with Adaptive Confounding adjustment (GMAC). It enables the search of a very large pool of variables, and adaptively selects potential confounding variables for each mediation test. Analyses of simulated data and GTEx data demonstrate that the adaptive selection of confounders by GMAC improves the power and precision of mediation analysis. Application of GMAC to GTEx data provides new insights into the observed patterns of cis-hubs and trans-eQTL regulation across tissue types.
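To make the idea of per-test confounder selection concrete, the following is a minimal Python sketch on simulated data. It is not the GMAC implementation: the selection rule (keep candidates correlated with both the cis and the trans transcript above a fixed threshold), the threshold value, the regression-based test statistic, and all function and variable names are illustrative assumptions. The simulation contains no true mediation, only a hidden variable that drives both transcripts, so the unadjusted statistic is inflated by confounding and the adjusted one is not.

```python
import numpy as np

def ols_tstat(y, X, j):
    """t-statistic of column j in an OLS fit of y on X (X includes an intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[j] / np.sqrt(cov[j, j])

def select_confounders(cis, trans, pool, min_abs_corr=0.2):
    """Hypothetical selection rule (an assumption, not the published one):
    keep candidate variables correlated with BOTH transcripts."""
    keep = []
    for k in range(pool.shape[1]):
        r_cis = np.corrcoef(pool[:, k], cis)[0, 1]
        r_trans = np.corrcoef(pool[:, k], trans)[0, 1]
        if abs(r_cis) > min_abs_corr and abs(r_trans) > min_abs_corr:
            keep.append(k)
    return keep

def mediation_tstat(snp, cis, trans, covariates=None):
    """t-statistic for the cis transcript in: trans ~ 1 + snp + cis (+ covariates)."""
    cols = [np.ones_like(cis), snp, cis]
    if covariates is not None and covariates.shape[1] > 0:
        cols.extend(covariates.T)
    X = np.column_stack(cols)
    return ols_tstat(trans, X, 2)

# Simulated example: the first pool variable confounds the cis-trans relation.
rng = np.random.default_rng(0)
n = 500
snp = rng.binomial(2, 0.3, n).astype(float)   # genotype coded 0/1/2
pool = rng.normal(size=(n, 20))               # large pool of candidate variables
hidden = pool[:, 0]                           # the one real confounder
cis = 0.8 * snp + hidden + rng.normal(size=n)
trans = hidden + rng.normal(size=n)           # no effect of cis on trans

selected = select_confounders(cis, trans, pool)
t_unadjusted = mediation_tstat(snp, cis, trans)
t_adjusted = mediation_tstat(snp, cis, trans, pool[:, selected])

print("selected candidate indices:", selected)
print("unadjusted t:", t_unadjusted)
print("adjusted t:", t_adjusted)
```

Selecting covariates separately for each (SNP, cis transcript, trans transcript) trio keeps each model small even when the candidate pool is very large, which is the motivation the abstract gives for adaptive selection.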