Tianjin University, China.
Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab166.
Many biological processes, such as cell growth and differentiation and the occurrence and development of diseases, are controlled by gene regulatory networks (GRNs), so continued research on GRNs is important. Determining gene-gene relationships from gene expression data is a complex problem. Because it is difficult to efficiently uncover the regularities behind these relationships using biochemical experiments alone, various computational methods have been used to construct GRNs, with some success. In this paper, we propose a novel method, MMFGRN ("Multi-source Multi-model Fusion for Gene Regulatory Network reconstruction"), to reconstruct GRNs. To make full use of limited datasets and to explore the potential regulatory relationships contained in different data types, we construct the MMFGRN model from three perspectives: a single time-series data model, a single steady-state data model, and a joint time-series and steady-state data model. We then use a weighted fusion strategy to obtain the final global ranking of regulatory links. MMFGRN yields the best performance on the DREAM4 InSilico_Size10 data, outperforming other popular inference algorithms, with an overall area under the receiver operating characteristic curve (AUROC) of 0.909 and an overall area under the precision-recall curve (AUPR) of 0.770 on the 10-gene networks. As the network scale increases, our method retains certain advantages, with an overall AUPR of 0.335 on the DREAM4 InSilico_Size100 data. These results demonstrate the robustness of MMFGRN across networks of different scales. In addition, the integration strategy proposed in this paper offers a new approach to reconstructing biological network models without prior knowledge, which can help researchers decipher the elusive mechanisms of life.
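The weighted fusion step can be illustrated with a small sketch. The abstract does not say how the three models' scores are normalized or how the weights are chosen, so everything below is an assumption made for illustration: the function names `min_max_normalize` and `fuse_rankings`, the min-max rescaling, the example scores and the weights `[1, 1, 2]` are all hypothetical and are not taken from MMFGRN itself.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def min_max_normalize(scores: Dict[Edge, float]) -> Dict[Edge, float]:
    """Rescale one model's edge scores to [0, 1] so models are comparable.

    Min-max scaling is an assumption of this sketch, not the paper's method.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All edges scored equally: this model carries no ranking information.
        return {edge: 0.0 for edge in scores}
    return {edge: (s - lo) / (hi - lo) for edge, s in scores.items()}


def fuse_rankings(
    model_scores: List[Dict[Edge, float]],
    weights: List[float],
) -> List[Tuple[Edge, float]]:
    """Combine per-model edge scores into one global ranking.

    Each model's scores are normalized, multiplied by that model's
    weight (weights are rescaled to sum to 1), and summed per edge.
    Edges are returned sorted from most to least confident.
    """
    if len(model_scores) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    fused: Dict[Edge, float] = {}
    for scores, weight in zip(model_scores, weights):
        for edge, s in min_max_normalize(scores).items():
            fused[edge] = fused.get(edge, 0.0) + (weight / total) * s
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores from the three model types named in the abstract.
time_series = {("G1", "G2"): 0.9, ("G2", "G3"): 0.1, ("G3", "G1"): 0.5}
steady_state = {("G1", "G2"): 0.8, ("G2", "G3"): 0.6, ("G3", "G1"): 0.2}
joint = {("G1", "G2"): 0.7, ("G2", "G3"): 0.3, ("G3", "G1"): 0.4}

ranking = fuse_rankings([time_series, steady_state, joint], weights=[1, 1, 2])
for (regulator, target), score in ranking:
    print(f"{regulator} -> {target}: {score:.3f}")
```

A ranked list of this kind is what gets compared against the gold-standard network when computing AUROC and AUPR. In practice the weights would need to be tuned or justified, which this sketch does not attempt.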