Saito Shigeru, Aburatani Sachiyo, Horimoto Katsuhisa
Biological Network Team, Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo 135-0064, Japan.
BMC Syst Biol. 2008 Oct 1;2:84. doi: 10.1186/1752-0509-2-84.
A knowledge-based network, constructed by extracting as many experimentally identified relationships as possible and superimposing them, is a promising approach for investigating the associations between biological molecules. However, molecular relationships change dynamically with the conditions in a living cell, which implies that not all of the relationships in a knowledge-based network exist at any given time. Here, we propose a novel method to estimate the consistency of a given network with measured data: i) the network is quantified as a log-likelihood computed from the measured data, based on a Gaussian network, and ii) the probability of that likelihood, named the graph consistency probability (GCP), is estimated based on the generalized extreme value distribution.
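The two ingredients named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a linear Gaussian network in which each node is regressed on its parents with an intercept, and it evaluates the standard closed-form generalized extreme value (GEV) distribution function for parameters that are supplied by the caller. How the reference distribution of likelihoods is generated, how the GEV parameters are fitted, and which tail defines the GCP are not stated in this abstract and are left open here. All names (`solve_linear`, `node_loglik`, `graph_loglik`, `gev_cdf`) and the simulated chain `a -> b -> c` are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def node_loglik(y, parent_cols):
    """Maximized Gaussian log-likelihood of y regressed on its parents."""
    n = len(y)
    # Design matrix rows: intercept followed by the parent values.
    rows = [[1.0] + [col[i] for col in parent_cols] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * y[i] for i, r in enumerate(rows)) for p in range(k)]
    beta = solve_linear(xtx, xty)
    rss = sum(
        (y[i] - sum(bj * rj for bj, rj in zip(beta, r))) ** 2
        for i, r in enumerate(rows)
    )
    sigma2 = rss / n  # maximum-likelihood estimate of the residual variance
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)


def graph_loglik(data, parents):
    """Sum the node-wise log-likelihoods over all variables in the graph."""
    return sum(
        node_loglik(data[v], [data[p] for p in parents.get(v, [])])
        for v in data
    )


def gev_cdf(x, mu, sigma, xi):
    """Closed-form GEV distribution function (location mu, scale sigma, shape xi)."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:  # Gumbel limit
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:  # x lies outside the support
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Simulated data from a hypothetical chain a -> b -> c (assumption, not paper data).
random.seed(0)
a = [random.gauss(0.0, 1.0) for _ in range(200)]
b = [0.9 * v + random.gauss(0.0, 0.3) for v in a]
c = [0.9 * v + random.gauss(0.0, 0.3) for v in b]
data = {"a": a, "b": b, "c": c}

ll_chain = graph_loglik(data, {"b": ["a"], "c": ["b"]})  # graph used to simulate
ll_empty = graph_loglik(data, {})                        # graph with no edges
```

Here `ll_chain` exceeds `ll_empty` because adding parents can only lower the residual variance of a node. A consistency probability in the spirit of the GCP would then come from calling `gev_cdf` (or its complement) on such a log-likelihood, with `mu`, `sigma`, and `xi` fitted to a reference distribution of log-likelihoods; that fitting step is deliberately omitted from this sketch.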
The plausibility and performance of the procedure are illustrated with various graphs and simulated data, and with two types of actual gene regulatory networks in Escherichia coli: the SOS DNA repair system, with corresponding data measured by fluorescence, and a set of 29 networks, with data measured by microarray under anaerobic conditions. In the simulation study, the procedure for estimating the GCP is illustrated with a simple network, and the robustness of the method is examined with respect to the dimensions of the sampling data, the parameters of the simulation, the magnitude of the data noise, and variations in network structure. For the actual networks, the first example showed that our method performs well for an actual network similar in size to the simulated networks, and the second showed that our method can select, from among many network candidates, the activated networks that are consistent with data measured under specific conditions.
The present method shows that it is possible to bridge the static network derived from the literature and the corresponding measurements, and may thus shed light on how network structure varies as molecular interaction mechanisms change in response to the environment in a living cell.