Hasegawa Takanori, Nagasaki Masao, Yamaguchi Rui, Imoto Seiya, Miyano Satoru
Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, Japan.
Department of Integrative Genomics, Tohoku Medical Megabank Organization, Tohoku University, 6-3-09 Aoba, Aramaki, Aoba-ku, Sendai, Japan.
Biosystems. 2014 Jul;121:54-66. doi: 10.1016/j.biosystems.2014.06.001. Epub 2014 Jun 5.
Several biological simulation models, for example of gene regulatory networks and metabolic pathways, have recently been constructed from existing knowledge of biomolecular reactions such as DNA-protein and protein-protein interactions. Because these models do not always contain all necessary molecules and reactions, their simulation results can be inconsistent with observational data, and ways of improving them are urgently required. A previously reported method created multiple candidate simulation models by partially modifying existing models. However, that approach was computationally costly and could not handle the large number of candidates required to find models whose simulation results are highly consistent with the data. To overcome this problem, we focused on the fact that the qualitative dynamics of simulation models are highly similar if the models share a certain amount of regulatory structure. This indicates that better-fitting candidates tend to share the basic regulatory structure of the best-fitting candidate, that is, the candidate that best predicts the data. Thus, instead of evaluating all candidates, we propose an efficient explorative method that selectively and sequentially evaluates candidates based on the similarity of their regulatory structures. Furthermore, when the parameter values of a candidate (e.g., the synthesis and degradation rates of mRNA) are estimated from the data, the values estimated for previously evaluated candidates can be reused. The method is applied here to the pharmacogenomic pathways for corticosteroids in rats, using time-series microarray expression data. In the performance test, we obtained more than 80% of the consistent solutions within 15% of the computational time required for the comprehensive evaluation. We then applied this approach to 142 literature-recorded simulation models of corticosteroid-induced genes and consequently selected 134 newly constructed, better-fitting models.
The method described here was found to be capable of efficiently exploring candidate simulation models and obtaining better models in a short time. Furthermore, the results suggest that there may be room for improvement in literature-recorded pathways and that these pathways can be systematically updated using biological observational data.
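The idea of evaluating candidates selectively and sequentially by structural similarity, while reusing parameter estimates, can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the Jaccard similarity over regulatory edges, the greedy choice of the next candidate, the evaluation `budget`, and the `toy_fit` scoring function are all assumptions made for illustration, and a real application would replace `toy_fit` with parameter estimation against time-series expression data.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

Edge = Tuple[str, str]          # (regulator, target)
Model = FrozenSet[Edge]         # a candidate is its set of regulatory edges
Params = Dict[Edge, float]      # one illustrative rate parameter per edge


def jaccard(a: Model, b: Model) -> float:
    """Structural similarity: shared edges over all edges in either model."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def explore(
    start: Model,
    candidates: List[Model],
    fit: Callable[[Model, Optional[Params]], Tuple[float, Params]],
    budget: int,
) -> Tuple[Model, float, int]:
    """Evaluate at most `budget` models, always choosing the unevaluated
    candidate most similar to the current best (lower score = better fit)."""
    score, params = fit(start, None)
    evaluated: Dict[Model, Tuple[float, Params]] = {start: (score, params)}
    best, best_score = start, score
    pool = [c for c in dict.fromkeys(candidates) if c != start]
    while pool and len(evaluated) < budget:
        nxt = max(pool, key=lambda c: jaccard(c, best))
        pool.remove(nxt)
        # Warm start: reuse parameters of the most similar evaluated model.
        donor = max(evaluated, key=lambda m: jaccard(m, nxt))
        score, params = fit(nxt, evaluated[donor][1])
        evaluated[nxt] = (score, params)
        if score < best_score:
            best, best_score = nxt, score
    return best, best_score, len(evaluated)


# Hypothetical toy problem: the "data" are explained by TRUE_MODEL, and the
# score is the number of edges by which a candidate differs from it.
TRUE_MODEL: Model = frozenset({("A", "B"), ("B", "C"), ("A", "C")})
POSSIBLE_EDGES: List[Edge] = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "A")]


def toy_fit(model: Model, init: Optional[Params]) -> Tuple[float, Params]:
    """Stand-in for parameter estimation; copies warm-start values if given."""
    params = {e: (init or {}).get(e, 1.0) for e in model}
    return float(len(model ^ TRUE_MODEL)), params


def all_subsets(edges: List[Edge]) -> List[Model]:
    """Enumerate every candidate structure over the possible edges."""
    return [frozenset(c) for r in range(len(edges) + 1)
            for c in combinations(edges, r)]


start_model: Model = frozenset({("A", "B")})
best, best_score, n_evaluated = explore(
    start_model, all_subsets(POSSIBLE_EDGES), toy_fit, budget=8
)
print(sorted(best), best_score, n_evaluated)
```

With a budget smaller than the number of candidates, the search stops early and returns the best model seen so far; whether that model is the overall best depends on how well structural similarity tracks fit quality, which is the assumption the abstract relies on.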