Wang Yulin, Lu Na, Miao Hongyu
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China.
State Key Laboratory for Manufacturing Systems Engineering, Systems Engineering Institute, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
BMC Syst Biol. 2016 Jun 13;10(1):41. doi: 10.1186/s12918-016-0287-y.
Graphical models have long been used to describe biological networks for a variety of important tasks, such as the determination of key biological parameters, and the structure of a graphical model ultimately determines whether such unknown parameters can be unambiguously obtained from experimental observations (i.e., the identifiability problem). Limited by resources or technical capacities, complex biological networks are usually only partially observed in experiments, which introduces latent variables into the corresponding graphical models. A number of previous studies have tackled the parameter identifiability problem for graphical models such as linear structural equation models (SEMs) with or without latent variables. However, the limited resolution and efficiency of existing approaches call for further development of novel structural identifiability analysis algorithms.
An efficient structural identifiability analysis algorithm is developed in this study for a broad range of network structures. The proposed method adopts Wright's path coefficient method to generate identifiability equations in the form of symbolic polynomials, and then converts these symbolic equations to binary matrices (called identifiability matrices). Several matrix operations are introduced to reduce the identifiability matrices while maintaining system equivalency. Based on the reduced identifiability matrices, the structural identifiability of each parameter is determined. A number of benchmark models are used to verify the validity of the proposed approach. Finally, the network module for influenza A virus replication is employed as a real example to illustrate the application of the proposed approach in practice.
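The first two steps described above (path-tracing equations, then a binary encoding) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy model `Z -> X -> Y`, the parameter names `a` and `b`, the helper functions, and the row-per-monomial encoding are all assumptions made for illustration. It also covers only an acyclic model with no latent variables, whereas the proposed method is stated to handle cyclic networks with latent variables.

```python
from itertools import combinations, product

# Hypothetical toy model (an assumption, not from the paper):
# directed edges (parent, child) mapped to a path-coefficient name.
NODES = ["Z", "X", "Y"]
EDGES = {("Z", "X"): "a", ("X", "Y"): "b"}


def directed_paths(src, dst, edges):
    """Return all directed paths src -> dst as (visited nodes, parameter names).

    Assumes the graph is acyclic, so the recursion terminates.
    """
    if src == dst:
        return [([src], [])]
    paths = []
    for (u, v), name in edges.items():
        if u == src:
            for nodes, params in directed_paths(v, dst, edges):
                paths.append(([src] + nodes, [name] + params))
    return paths


def trek_monomials(i, j, nodes, edges):
    """Monomials contributing to the correlation of i and j under Wright's rules.

    Each monomial is a sorted tuple of parameter names. A valid route goes
    backward from i to a common source s, then forward to j, and the two
    legs may share no node other than s.
    """
    monomials = []
    for s in nodes:
        legs = product(directed_paths(s, i, edges), directed_paths(s, j, edges))
        for (nodes_i, params_i), (nodes_j, params_j) in legs:
            if set(nodes_i) & set(nodes_j) == {s}:
                monomials.append(tuple(sorted(params_i + params_j)))
    return monomials


def to_binary_matrix(monomials, param_names):
    """One row per monomial, one column per parameter; 1 marks presence."""
    return [[1 if p in m else 0 for p in param_names] for m in monomials]


# Collect the monomials for every pair of observed variables.
rows = []
for i, j in combinations(NODES, 2):
    rows.extend(trek_monomials(i, j, NODES, EDGES))

params = sorted(set(EDGES.values()))
matrix = to_binary_matrix(rows, params)
```

In this toy case the row for the pair (Z, X) contains a single 1 in the column for `a`, reflecting that the correlation of Z and X equals `a` directly. The matrix reduction operations mentioned in the abstract are not reproduced here, since the abstract does not specify them.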
The proposed approach can deal with cyclic networks with latent variables. Its key advantage is that it intentionally avoids symbolic computation and is thus highly efficient. This method is also capable of determining the identifiability of each individual parameter and therefore offers higher resolution than many existing approaches. Overall, this study provides a basis for systematic examination and refinement of graphical models of biological networks from the identifiability point of view, and it has significant potential to be extended to more complex network structures or high-dimensional systems.