Zhang Lu, Feng Xi Kang, Ng Yen Kaow, Li Shuai Cheng
Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong.
Faculty of Information and Communication Technology, University Tunku Abdul Rahman, Kampar, Perak, Malaysia.
BMC Genomics. 2016 Aug 18;17 Suppl 4(Suppl 4):430. doi: 10.1186/s12864-016-2791-2.
Accurately identifying gene regulatory networks is an important task in understanding in vivo biological activities. Such networks are often inferred from gene expression data. Many methods have been developed to evaluate gene expression dependencies between a transcription factor and its target genes, and some also eliminate transitive interactions. However, the regulatory (or edge) direction remains undetermined if the target gene is itself a transcription factor. Some methods predict regulatory directions by locating eQTL single nucleotide polymorphisms, or by observing gene expression changes when candidate transcription factors are knocked out or knocked down; unfortunately, these additional data are usually unavailable, especially for samples derived from human tissues.
In this study, we propose the Context Based Dependency Network (CBDN), a method that infers gene regulatory networks, including their regulatory directions, from gene expression data alone. To determine the regulatory direction, CBDN computes the influence of a source gene on a target gene by evaluating how much the expression dependencies between the target gene and the other genes change in magnitude when conditioning on the source gene. CBDN extends the data processing inequality by incorporating the dependency direction, in order to distinguish direct from transitive relationships between genes. We also define two types of important regulators, which can influence a majority of the genes in the network directly or indirectly. CBDN detects both types by averaging the influence function of a candidate regulator over the other genes. In our experiments with simulated and real data, even with the regulatory direction taken into account, CBDN outperforms state-of-the-art approaches for inferring gene regulatory networks. CBDN also identifies important regulators in the predicted networks: (1) TYROBP influences a batch of genes related to Alzheimer's disease; (2) ZNF329 and RB1 significantly regulate the genes of the 'mesenchymal' gene expression signature in brain tumors.
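The influence idea described above can be illustrated with a minimal sketch. It is not the CBDN implementation: the abstract does not specify the dependency measure, so the sketch assumes Pearson correlation as the dependency and first-order partial correlation as the conditional dependency. The function names `partial_corr`, `influence`, `regulator_scores`, and `simulate_hub` are hypothetical.

```python
import numpy as np


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z (assumed dependency measure)."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))


def influence(expr, source, target):
    """Mean drop in |dependency| between `target` and every other gene
    once `source` is conditioned on. `expr` has shape (n_samples, n_genes)."""
    drops = []
    for k in range(expr.shape[1]):
        if k in (source, target):
            continue
        marginal = abs(np.corrcoef(expr[:, target], expr[:, k])[0, 1])
        conditional = abs(partial_corr(expr[:, target], expr[:, k], expr[:, source]))
        drops.append(marginal - conditional)
    return float(np.mean(drops))


def regulator_scores(expr):
    """Average each candidate regulator's influence over all other genes."""
    n_genes = expr.shape[1]
    return np.array([
        np.mean([influence(expr, s, t) for t in range(n_genes) if t != s])
        for s in range(n_genes)
    ])


def simulate_hub(n_samples=2000, n_targets=3, noise=0.5, seed=0):
    """Toy data (an assumption for illustration): gene 0 drives every other gene."""
    rng = np.random.default_rng(seed)
    hub = rng.normal(size=n_samples)
    targets = [hub + noise * rng.normal(size=n_samples) for _ in range(n_targets)]
    return np.column_stack([hub] + targets)
```

On the toy data from `simulate_hub`, conditioning on the hub gene removes most of the dependency between any two of its targets, so `influence(expr, 0, 1)` should exceed `influence(expr, 1, 0)`, and `regulator_scores(expr)` should rank gene 0 highest. This asymmetry is what allows a direction to be assigned without perturbation or eQTL data.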
Using gene expression data alone, CBDN can efficiently infer the existence of gene-gene interactions as well as their regulatory directions. The constructed networks are helpful for identifying important regulators of complex diseases.