Pirgazi Jamshid, Khanteymoori Ali Reza, Jalilkhani Maryam
1 Department of Computer Engineering, Engineering Faculty, University of Zanjan, Zanjan, Iran.
2 School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.
J Bioinform Comput Biol. 2019 Jun;17(3):1950018. doi: 10.1142/S0219720019500185.
In this study, to deal with noise and uncertainty in gene expression data, learning networks, in particular Bayesian networks, which can incorporate prior knowledge, were used to infer gene regulatory networks. Learning networks are methods that combine a network structure with a learning process for discovering relationships. Correlation metrics are one of the methods used to measure the relationship between genes, but a high correlation between genes does not necessarily mean that they have a causal effect on each other. Common methods for inferring gene regulatory networks have yet to take biological importance into account, and as a result their predictions are less accurate in terms of biological significance. Hence, in the proposed method, highly correlated genes were grouped into the same cluster, and edges between genes within a cluster were disallowed. Finally, after Bayesian network modeling, a refinement phase used the knowledge gained from clustering, together with biological correlation, to improve the regulatory interactions. To demonstrate its efficiency, the proposed method was compared with several common methods in this area, including GENIE3 and BMALR. The evaluation results indicate that the proposed method recognized regulatory relations well during the Bayesian modeling process, owing to its use of the biological knowledge hidden in the data, and that it is able to recognize gene regulatory networks at a level in line with important methods in this field.
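The clustering constraint described in the abstract can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm or correlation threshold was used, so everything here is an assumption made for illustration: the function names `correlation_clusters` and `forbidden_edges`, the 0.9 threshold, and the use of connected components of a thresholded Pearson correlation graph as a stand-in for the clustering step. The sketch only builds the set of disallowed edges; it does not perform Bayesian network learning or the refinement phase.

```python
import numpy as np


def correlation_clusters(expr, threshold=0.9):
    """Assign a cluster label to each gene (hypothetical helper).

    expr: array of shape (n_samples, n_genes).
    Genes are linked when |Pearson correlation| >= threshold, and clusters
    are the connected components of that graph. This is an assumed
    stand-in for the clustering step, which the abstract does not specify.
    """
    corr = np.corrcoef(expr, rowvar=False)
    n = corr.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(corr[i, j]) >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Keep the smaller index as the cluster representative.
                    parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(n)]


def forbidden_edges(labels):
    """Directed gene pairs inside the same cluster (hypothetical helper).

    These pairs would be excluded as candidate edges during structure search.
    """
    n = len(labels)
    return {
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and labels[i] == labels[j]
    }


# Toy data: gene 1 is a linear function of gene 0; gene 2 is weakly related.
x = np.arange(10.0)
expr = np.column_stack([x, 2.0 * x + 1.0, (-1.0) ** np.arange(10)])
labels = correlation_clusters(expr, threshold=0.9)
blacklist = forbidden_edges(labels)
```

In this toy example, genes 0 and 1 fall into one cluster, so both directed edges between them land in `blacklist`, while gene 2 remains free to connect to either. A structure-learning routine that accepts a list of excluded edges could then consume such a set.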