Zheng Ruiqing, Li Min, Chen Xiang, Zhao Siyu, Wu Fang-Xiang, Pan Yi, Wang Jianxin
IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb;18(1):347-354. doi: 10.1109/TCBB.2019.2900614. Epub 2021 Feb 3.
Gene regulatory networks (GRNs) play a key role in biological processes, but they vary across biological conditions. Reconstructing GRNs from gene expression data has been both an important opportunity and a challenge over the past decades. Although many methods exist for inferring GRN topology, such as those based on mutual information, random forests, and partial least squares, their accuracy remains low because expression data are noisy and high-dimensional. In this paper, we introduce PBMarsNet, an ensemble method based on Multivariate Adaptive Regression Splines (MARS) that reconstructs directed GRNs from multifactorial gene expression data. PBMarsNet uses part mutual information (PMI) to pre-weight the candidate regulatory genes and then uses MARS to detect nonlinear regulatory links. Moreover, we apply bootstrapping to run MARS multiple times and average the outputs of the individual MARS runs to obtain the final score of each regulatory link. Results on the DREAM4 and DREAM5 challenge datasets show that PBMarsNet achieves better performance and generalization than other state-of-the-art methods.
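The pipeline described above (pre-weight the candidate regulators, fit a nonlinear regression per target gene, repeat on bootstrap resamples, average the scores) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the PBMarsNet implementation: absolute Pearson correlation stands in for PMI, a ridge regression on median-knot hinge functions stands in for MARS (which selects its knots adaptively), and applying the pre-weights by scaling the regulator columns is a guess at one plausible mechanism. The names `hinge_features`, `regulator_scores`, and `infer_network` are hypothetical.

```python
import numpy as np


def hinge_features(X):
    """Expand each column into a pair of hinge functions around its median.

    MARS also builds its basis from hinges, but it chooses knots adaptively;
    fixing the knot at the median is a simplification made for this sketch.
    """
    knots = np.median(X, axis=0)
    return np.concatenate(
        [np.maximum(0.0, X - knots), np.maximum(0.0, knots - X)], axis=1
    )


def regulator_scores(X, y, ridge=1e-3):
    """Score each candidate regulator (column of X) for one target gene y."""
    p = X.shape[1]
    H = hinge_features(X)
    H = H - H.mean(axis=0)
    yc = y - y.mean()
    # Ridge-regularised least squares on the hinge basis.
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(2 * p), H.T @ yc)
    # Score = variance of the fitted component attributed to each regulator.
    contrib = H[:, :p] * beta[:p] + H[:, p:] * beta[p:]
    return contrib.var(axis=0)


def infer_network(expr, n_boot=20, seed=0):
    """Return a (genes, genes) matrix; entry [i, j] scores the link i -> j.

    expr is a (samples, genes) expression matrix.
    """
    rng = np.random.default_rng(seed)
    n, g = expr.shape
    # Stand-in pre-weights (assumption): |Pearson correlation| instead of PMI.
    weights = np.abs(np.corrcoef(expr, rowvar=False))
    np.fill_diagonal(weights, 0.0)
    scores = np.zeros((g, g))
    for _ in range(n_boot):
        sample = expr[rng.integers(0, n, size=n)]  # bootstrap resample
        for j in range(g):
            regs = np.delete(np.arange(g), j)  # every gene except the target
            X = sample[:, regs] * weights[regs, j]  # pre-weight candidates
            scores[regs, j] += regulator_scores(X, sample[:, j])
    return scores / n_boot  # average over bootstrap runs
```

Calling `infer_network(expr)` on a samples-by-genes array returns a score matrix whose columns can be ranked to list the most likely regulators of each target gene. Averaging over resamples is what makes this an ensemble: a link that scores highly only in a few resamples ends up with a low final score.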