Jafari Mina, Ghavami Behnam, Sattari Vahid
Department of Computer Engineering, Shahid Bahonar University of Kerman, Kerman, Iran.
Artif Intell Med. 2017 Jun;79:15-27. doi: 10.1016/j.artmed.2017.05.004. Epub 2017 Jun 9.
Inferring Gene Regulatory Networks (GRNs) from gene expression data in order to detect basic cellular processes is a key issue in the study of biological systems. Inferring a GRN correctly requires inferring its predictor set accurately. This paper proposes a fast and accurate predictor set inference framework that linearly combines several inference methods in order to increase the accuracy of the inferred GRN. The framework forms a linear weighted combination of the Pearson Correlation Coefficient (PCC) and two feature selection approaches, Information Gain (IG) and ReliefF. A Genetic Algorithm (GA) is used to set the weights, with a similarity measure serving as the fitness function that guides it. Finally, based on the obtained weights, the best predictor set of the GRN is selected using the three aforementioned inference methods, and the network topology is formed. Because gene expression data sets are very large, GRN inference algorithms must infer a GRN in a reasonable runtime; hence, a novel criterion is provided to evaluate GRNs based on both runtime and accuracy. Simulation results on biological data indicate that the proposed framework is fast, and more reliable than other recent methods [1-7].
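The weighted combination and GA-based weight search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the PCC, IG, and ReliefF scores are already available as square gene-by-gene matrices (rows are target genes, columns are candidate predictors). The abstract does not specify the similarity measure, so `fitness` here is a hypothetical stand-in: the fraction of selected predictors that appear in a binary reference adjacency matrix. All function names, the min-max normalization, the fixed predictor count `k`, and the GA settings (truncation selection, uniform crossover, Gaussian mutation) are assumptions made for illustration.

```python
import numpy as np

def min_max(scores):
    """Rescale a score matrix to [0, 1] so different methods are comparable (assumed step)."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    return np.zeros_like(scores) if hi == lo else (scores - lo) / (hi - lo)

def combine(score_mats, weights):
    """Linear weighted combination of per-method score matrices; weights are normalized to sum to 1."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * min_max(m) for wi, m in zip(w, score_mats))

def top_k_predictors(combined, k):
    """For each target gene (row), return the indices of its k best-scoring predictors, excluding itself."""
    scores = np.array(combined, dtype=float)
    np.fill_diagonal(scores, -np.inf)
    return np.argsort(-scores, axis=1)[:, :k]

def fitness(weights, score_mats, reference, k):
    """Hypothetical similarity measure: share of selected predictors that are edges in `reference`."""
    preds = top_k_predictors(combine(score_mats, weights), k)
    rows = np.repeat(np.arange(preds.shape[0]), k)
    return float(np.asarray(reference)[rows, preds.ravel()].mean())

def genetic_search(score_mats, reference, k, pop_size=20, generations=30, seed=0):
    """Tiny GA over weight vectors: keep the best half, then uniform crossover plus Gaussian mutation."""
    rng = np.random.default_rng(seed)
    n_methods = len(score_mats)
    pop = rng.random((pop_size, n_methods)) + 1e-6
    for _ in range(generations):
        fit = np.array([fitness(w, score_mats, reference, k) for w in pop])
        parents = pop[np.argsort(-fit)[: pop_size // 2]]
        n_children = pop_size - len(parents)
        a = parents[rng.integers(len(parents), size=n_children)]
        b = parents[rng.integers(len(parents), size=n_children)]
        children = np.where(rng.random(a.shape) < 0.5, a, b)
        children = np.clip(children + rng.normal(0.0, 0.1, children.shape), 1e-6, None)
        pop = np.vstack([parents, children])
    fit = np.array([fitness(w, score_mats, reference, k) for w in pop])
    best = pop[np.argmax(fit)]
    return best / best.sum()
```

Given three score matrices `pcc`, `ig`, and `relief` and a reference matrix `ref`, a call such as `genetic_search([pcc, ig, relief], ref, k=2)` returns one normalized weight per method, and `top_k_predictors(combine([pcc, ig, relief], weights), k=2)` then yields the predictor set from which the network topology would be formed.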