IEEE/ACM Trans Comput Biol Bioinform. 2020 May-Jun;17(3):999-1009. doi: 10.1109/TCBB.2018.2875692. Epub 2018 Oct 12.
Identifying master regulatory genes is one of the primary challenges in systems biology. The minimum dominating set problem is a powerful paradigm for analyzing gene regulatory networks, in which genes are modeled as nodes and their interactions as edges. In this setting, the members of a minimum dominating set can be regarded as master genes. Because a network may contain many distinct minimum dominating sets, it is difficult to identify which one represents the most appropriate set of master genes. In this paper, we formulate a weighted gene regulatory network problem with two objectives as a variant of the dominating set problem. The collective influence of each gene is taken as its weight. The first objective seeks a set of master regulatory genes with minimum cardinality, and the second seeks one with maximum weight. The two objectives are combined into a single objective using a parameter that varies between zero and one. The model is applied to three human networks, and the results are reported and compared with those of an existing weighted network model. Parametric programming in linear optimization and logistic regression are also applied to the resulting relaxed problem to provide a deeper understanding of the results. The parametric analysis shows that, for some ranges of objective priorities, the identified master regulatory genes are invariant, and that some genes are identified for all priorities. This indicates that such genes are more likely to be master regulatory genes, especially in noisy networks.
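To make the scalarized two-objective idea concrete, here is a minimal Python sketch. It is not the authors' formulation: the abstract does not give the exact objective, so the form `lam * |S| - (1 - lam) * weight(S)`, the undirected toy graph, the gene names, and the weight values are all assumptions chosen for illustration. The sketch enumerates every subset by brute force, which is only feasible for very small graphs; the paper itself works with an optimization model and its linear relaxation.

```python
from itertools import combinations


def is_dominating(adj, chosen):
    """Return True if every node is in `chosen` or adjacent to a chosen node."""
    covered = set(chosen)
    for v in chosen:
        covered |= adj[v]
    return len(covered) == len(adj)


def best_dominating_set(adj, weight, lam):
    """Brute-force the dominating set minimizing an ASSUMED scalarized objective.

    score(S) = lam * |S| - (1 - lam) * sum(weight[v] for v in S)

    lam = 1 reduces to the plain minimum dominating set problem; smaller lam
    gives more priority to total weight. Ties keep the first set found.
    """
    nodes = sorted(adj)
    best_score, best_set = None, None
    for k in range(1, len(nodes) + 1):
        for subset in combinations(nodes, k):
            if not is_dominating(adj, subset):
                continue
            score = lam * len(subset) - (1 - lam) * sum(weight[v] for v in subset)
            if best_score is None or score < best_score:
                best_score, best_set = score, set(subset)
    return best_set


# Hypothetical toy network: a path g1 - g2 - g3 - g4 (undirected for simplicity).
adj = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3"},
}
# Hypothetical weights standing in for normalized collective influence.
weight = {"g1": 0.1, "g2": 0.5, "g3": 0.4, "g4": 0.2}

for lam in (1.0, 0.9, 0.0):
    print(lam, sorted(best_dominating_set(adj, weight, lam)))
```

On this toy path there are four minimum dominating sets of size two, which mirrors the ambiguity the abstract describes; adding the weight term at `lam = 0.9` singles out the pair with the largest total weight. At `lam = 0` this particular scalarization ignores cardinality entirely and selects every node, one reason intermediate parameter values are the interesting ones.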