Leifeld Thomas, Zhang Zhihua, Zhang Ping
Institute of Automatic Control, Technische Universität Kaiserslautern, Kaiserslautern, Germany.
Front Physiol. 2018 Jun 8;9:695. doi: 10.3389/fphys.2018.00695. eCollection 2018.
Mathematical models play an important role in science and engineering. A model can help scientists explain the dynamic behavior of a system and understand the function of its components. Because the length of a time series and the number of replicates are limited by the cost of experiments, Boolean networks, a structurally simple and parameter-free logical model for gene regulatory networks, have attracted the interest of many scientists. To fit the biological context and to lower the data requirements, biological prior knowledge is taken into account during the inference procedure. The identification approaches available in the literature can deal with only a subset of the possible types of prior knowledge. We propose a new approach to identify Boolean networks from time series data that incorporates prior knowledge such as partial network structure, the canalizing property, and positive and negative unateness. Using the vector form of Boolean variables and applying a generalized matrix multiplication called the semi-tensor product (STP), each Boolean function can be equivalently converted into a matrix expression. Based on this, the identification problem is reformulated as an integer linear programming problem that reveals, in a computationally efficient way, the system matrix of a Boolean model whose dynamics are consistent with the important dynamics captured in the data. Using prior knowledge reduces the number of candidate functions during inference. Hence, identification incorporating prior knowledge is especially suitable for small time series data sets and for data without sufficient stimuli. The proposed approach is illustrated with a biological model of the oxidative stress response network.
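The conversion of a Boolean function into a matrix expression can be illustrated with a minimal sketch. It is not the authors' implementation: the names `to_vec`, `structure_matrix`, and `evaluate` are hypothetical, and it assumes the common STP convention in which True is the vector (1, 0) and False is (0, 1). For column vectors the semi-tensor product reduces to the Kronecker product, so a function of n variables corresponds to a 2 × 2^n logical matrix applied to the Kronecker product of its arguments.

```python
from functools import reduce
from itertools import product

import numpy as np

# Assumed convention: True -> (1, 0), False -> (0, 1).
TRUE = np.array([1, 0])
FALSE = np.array([0, 1])


def to_vec(b):
    """Vector form of a Boolean value."""
    return TRUE if b else FALSE


def structure_matrix(f, n):
    """Build the 2 x 2^n logical matrix of an n-input Boolean function f.

    Columns follow the Kronecker ordering of the inputs:
    (T,...,T), (T,...,F), ..., (F,...,F).
    """
    cols = [to_vec(f(*bits)) for bits in product([True, False], repeat=n)]
    return np.column_stack(cols)


def evaluate(M, bits):
    """Evaluate the function through its matrix: M @ (x1 kron x2 kron ... kron xn)."""
    x = reduce(np.kron, [to_vec(b) for b in bits])
    return bool((M @ x)[0])


def and_gate(a, b):
    return a and b


M_and = structure_matrix(and_gate, 2)
print(M_and)
```

Each column of the resulting matrix holds the vector form of one truth-table entry, which is why identifying a Boolean function amounts to selecting one of two possible values per column; this is the kind of structure that an integer linear program can search over.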
Combining an efficient reformulation of the identification problem with the ability to incorporate various types of prior knowledge enables computational model inference for systems with only a limited amount of time series data. The general applicability of this methodological approach makes it suitable for a variety of biological systems and of general interest for biological and medical research.