Yoon Jun Ho, Kim Seyoung
Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America.
bioRxiv. 2023 Oct 24:2023.10.23.563661. doi: 10.1101/2023.10.23.563661.
Allele-specific expression quantification from RNA-seq reads provides opportunities to study the control of gene regulatory networks by cis-acting and trans-acting genetic variants. Many existing methods perform single-gene, single-SNP association analysis to identify expression quantitative trait loci (eQTLs) and then place the eQTLs against known gene networks for functional interpretation. Instead, we view eQTL data as capturing the effects of perturbation of the gene regulatory system by a large number of genetic variants, and we reconstruct a gene network perturbed by eQTLs. We introduce a statistical framework called CiTruss for simultaneously learning a gene network and the cis-acting and trans-acting eQTLs that perturb this network, given population allele-specific expression and SNP data. CiTruss uses a multi-level conditional Gaussian graphical model in which trans-acting eQTLs perturb the expression of both alleles in the gene network at the top level and cis-acting eQTLs perturb the expression of each allele at the bottom level. We derive a transformation of this model that allows efficient learning for large-scale human data. Our analysis with CiTruss of the GTEx data and the LG×SM advanced intercross line mouse data for multiple tissue types provides new insights into the genetics of gene regulation. CiTruss revealed that gene networks consist of local subnetworks over proximally located genes and global subnetworks over genes scattered across the genome, and that several aspects of gene regulation by eQTLs, such as the impact of genetic diversity, pleiotropy, tissue-specific gene regulation, and local and long-range linkage disequilibrium among eQTLs, can be explained through these local and global subnetworks.
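To make the idea of eQTLs perturbing a gene network more concrete, the following is a minimal sketch of a generic single-level conditional Gaussian graphical model, not the multi-level CiTruss model itself. It assumes the standard parameterization p(y | x) ∝ exp(-½ yᵀΛy - xᵀΘy), where Λ is a gene-gene precision matrix (the network) and Θ holds direct SNP-to-gene perturbations. The function names, dimensions, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def cggm_conditional(Lambda, Theta, x):
    """Mean and covariance of expression y given genotypes x,
    under p(y | x) ∝ exp(-0.5 * y' Lambda y - x' Theta y)."""
    Sigma = np.linalg.inv(Lambda)      # conditional covariance of y given x
    mean = -Sigma @ Theta.T @ x        # conditional mean of y given x
    return mean, Sigma

def overall_snp_effects(Lambda, Theta):
    """B = -Theta Lambda^{-1}: effect of each SNP on each gene after the
    direct perturbation has propagated through the network."""
    return -Theta @ np.linalg.inv(Lambda)

# Hypothetical toy example: 3 genes connected in a chain, 2 SNPs.
Lambda = np.array([[ 2.0, -0.8,  0.0],
                   [-0.8,  2.0, -0.8],
                   [ 0.0, -0.8,  2.0]])
# SNP 0 directly perturbs gene 0 only; SNP 1 perturbs no gene.
Theta = np.array([[-1.0, 0.0, 0.0],
                  [ 0.0, 0.0, 0.0]])

x = np.array([1.0, 0.0])               # genotype vector for one individual
mean, Sigma = cggm_conditional(Lambda, Theta, x)
B = overall_snp_effects(Lambda, Theta)

print("conditional mean:", mean)
print("overall SNP effects:\n", B)
```

In this toy setting SNP 0 has a direct edge to gene 0 only, yet its row of B is nonzero for all three genes because the perturbation spreads along the chain. This is the sense in which a variant can influence genes it does not directly regulate.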