Health Innovation and Transformation Centre, Federation University, Churchill, Victoria, 3842, Australia.
BioThink Pty Ltd, Australia.
Biosystems. 2022 Oct;220:104736. doi: 10.1016/j.biosystems.2022.104736. Epub 2022 Jul 19.
S-System models, a class of non-linear differential equation models, are widely used for reconstructing gene regulatory networks from temporal gene expression data. An S-System model involves two states, generation and degeneration, and uses the kinetic parameters g and h to represent the direction, nature, and intensity of the genetic interactions. The need to learn a large number of model parameters increases the computational expense. Previously, we improved the performance of the algorithm using dynamic allocation of the maximum in-degree for each gene. While the method was effective for smaller networks, a large amount of computation was still needed for larger networks. This problem arose mainly from the increased occurrence of invalid networks during optimization, which happens because the two kinetic parameters (g and h) of the S-System model converge independently. Being independent, these two parameters can converge to values that indicate contradictory gene interactions, that is, both inhibition and activation for the same regulator. In this study, to address this major challenge in S-System modelling, we developed a novel method that includes two features: a penalty term that penalizes networks with invalid kinetic orders, and a parameter, w, derived by combining the kinetic parameters g and h. The novel penalty term was used for candidate selection during optimization in the DRNI (Dynamically Regulated Network Initialization) algorithm. The penalty is dynamic rather than constant, with its magnitude dependent on the number of invalid interactions in the given network. This approach encourages the generation of valid candidate solutions and eliminates invalid networks in a systematic manner.
The previous DRNI method, a two-stage approach that uses dynamic allocation of the maximum in-degree for each gene, was further improved by adding a third stage, which applies the proposed w to handle the invalid regulations that may still exist in the candidate solutions. The method was tested on different gene expression datasets, and was able to reduce the number of iterations and produce improved network accuracies. For a 20-gene network, the number of generations required for convergence was reduced by 300, and the F-score improved by 0.05 compared to our previously reported DRNI approach. For the well-known 10-gene networks of the DREAM4 challenge, our method improved the average area under the ROC curve.
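The model and the dynamic penalty described above can be illustrated with a minimal sketch. It uses the standard S-System form, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij. The abstract does not give the exact validity rule or penalty formula, so the following are assumptions made only for illustration: an interaction is treated as invalid when g_ij and h_ij are both non-zero with the same sign, and the penalty grows linearly with the number of such interactions. The names `s_system_rhs`, `count_invalid`, `penalized_fitness`, and `penalty_weight` are hypothetical and are not taken from the DRNI implementation.

```python
import numpy as np


def s_system_rhs(x, alpha, beta, g, h):
    """Standard S-System right-hand side for expression levels x (shape (n,)).

    g[i, j] and h[i, j] are the kinetic orders of gene j in the generation
    and degeneration terms of gene i.
    """
    # x ** g broadcasts x over the rows of g, giving x[j] ** g[i, j].
    generation = alpha * np.prod(x ** g, axis=1)
    degeneration = beta * np.prod(x ** h, axis=1)
    return generation - degeneration


def count_invalid(g, h, tol=1e-8):
    """Count interactions assumed to be contradictory.

    Assumption: (i, j) is invalid when g[i, j] and h[i, j] are both
    non-zero and share a sign, so that gene j would both activate and
    inhibit gene i.
    """
    both_nonzero = (np.abs(g) > tol) & (np.abs(h) > tol)
    same_sign = np.sign(g) == np.sign(h)
    return int(np.sum(both_nonzero & same_sign))


def penalized_fitness(error, g, h, penalty_weight=1.0):
    """Fitness to minimize: data-fit error plus a penalty that scales with
    the number of invalid interactions (assumed linear form)."""
    return error + penalty_weight * count_invalid(g, h)


# Toy two-gene example with made-up parameter values.
x = np.array([4.0, 2.0])
alpha = np.array([1.0, 1.0])
beta = np.array([1.0, 1.0])
g = np.array([[0.0, 1.0], [0.5, 0.0]])
h = np.array([[1.0, 0.0], [0.5, 1.0]])

derivatives = s_system_rhs(x, alpha, beta, g, h)
n_invalid = count_invalid(g, h)
fitness = penalized_fitness(0.3, g, h, penalty_weight=0.1)
```

Because the penalty depends on the count of invalid interactions rather than being a fixed constant, a candidate with many contradictory g/h pairs is ranked worse than one with few, which matches the dynamic behaviour the abstract describes.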