Vahedi G, Ivanov I V, Dougherty E R
Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA.
IET Syst Biol. 2009 May;3(3):191-202. doi: 10.1049/iet-syb.2007.0070.
The coefficient of determination (CoD) has been used to infer Boolean networks (BNs) from steady-state data, in particular to estimate the constituent BNs of a probabilistic BN. The advantage of the CoD method over design methods that emphasise graph topology or attractor structure is that the CoD produces a network based on strong predictive relationships between target genes and their predictor (parent) genes. The disadvantage is that spurious attractor cycles appear in the inferred network, so that inference is poor relative to the attractor structure, that is, relative to the steady-state behaviour of the network. Given steady-state data, the inferred network should not carry a significant amount of steady-state probability mass outside the mass of the data distribution; however, spurious attractor cycles create a significant amount of steady-state probability mass that is not accounted for by the data. Steady-state data hamper design because, without temporal data, CoD design lacks directionality with regard to prediction. This results in spurious bidirectional relationships in which two genes are among the predictors of each other, when actually only one of them should be a predictor of the other, thereby creating a spurious attractor cycle. This paper characterises the manner in which bidirectional relationships affect the attractor structure of a BN. Given this characterisation, the authors propose a constrained CoD inference algorithm that outperforms unconstrained CoD inference in avoiding the creation of spurious non-singleton attractors. The performance of the two algorithms is compared using a melanoma-based network.
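The directionality problem described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-gene networks, the sample data, the function names `cod` and `attractors`, and the use of a simple resubstitution estimate of the CoD, (ε0 − ε_opt)/ε0, where ε0 is the error of the best constant predictor and ε_opt is the error of the best predictor given the parent genes. The constrained inference algorithm itself is not reproduced here.

```python
from collections import Counter
from itertools import product


def cod(samples, target, predictors):
    """Resubstitution CoD estimate for a binary target (illustrative only)."""
    n = len(samples)
    ones = sum(s[target] for s in samples)
    eps0 = min(ones, n - ones) / n  # error of the best constant predictor
    if eps0 == 0:
        raise ValueError("target is constant in the data; CoD is undefined")
    # Count (predictor pattern, target value) pairs in the data.
    counts = Counter((tuple(s[i] for i in predictors), s[target]) for s in samples)
    patterns = {pattern for pattern, _ in counts}
    # The best predictor outputs the majority target value for each pattern.
    eps_opt = sum(min(counts[(p, 0)], counts[(p, 1)]) for p in patterns) / n
    return (eps0 - eps_opt) / eps0


def attractors(update, n_genes):
    """Enumerate attractors of a synchronous BN by following every state."""
    found = set()
    for state in product((0, 1), repeat=n_genes):
        seen = []
        while state not in seen:
            seen.append(state)
            state = update(state)
        # The trajectory re-entered `seen`; the tail from that point is the cycle.
        found.add(frozenset(seen[seen.index(state):]))
    return found


def unidirectional(s):
    # Hypothetical "true" network: gene 0 holds its value and drives gene 1.
    return (s[0], s[0])


def bidirectional(s):
    # Hypothetical inferred network: each gene is predicted by the other.
    return (s[1], s[0])


# Steady-state samples drawn from the singleton attractors (0, 0) and (1, 1).
samples = [(0, 0)] * 5 + [(1, 1)] * 5

print(cod(samples, target=1, predictors=[0]))  # 1.0
print(cod(samples, target=0, predictors=[1]))  # 1.0, the data give no direction

print(sorted(len(a) for a in attractors(unidirectional, 2)))  # [1, 1]
print(sorted(len(a) for a in attractors(bidirectional, 2)))   # [1, 1, 2]
```

In this sketch both directions of prediction score a CoD of 1 on the steady-state samples, so the data alone cannot rule out the bidirectional network. That network has an extra attractor cycle, {(0, 1), (1, 0)}, made up of states that never occur in the data.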