Institut für Pathologie, Charité-Universitätsmedizin Berlin, Berlin, Germany.
IRI Life Sciences, Humboldt University, Berlin, Germany.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i482-i489. doi: 10.1093/bioinformatics/btaa404.
A common strategy to infer and quantify interactions between components of a biological system is to deduce them from the network's response to targeted perturbations. Such perturbation experiments are often challenging and costly, so optimizing the experimental design is essential to achieve a meaningful characterization of biological networks. However, it remains difficult to predict which combinations of perturbations allow specific interaction strengths to be inferred in a given network topology. Yet such a description of identifiability is necessary to select perturbations that maximize the number of inferable parameters.
We show analytically that the identifiability of network parameters can be determined by solving an intuitive maximum-flow problem. Furthermore, we used the theory of matroids to describe identifiability relationships between sets of parameters in order to build identifiable effective network models. Collectively, these results allowed us to devise strategies for an optimal design of the perturbation experiments. We benchmarked these strategies on a database of human pathways. Remarkably, full network identifiability was achieved, on average, with fewer than a third of the perturbations that are needed in a random experimental design. Moreover, we determined perturbation combinations that further decreased experimental effort compared to single-target perturbations. In summary, we provide a framework for inferring a maximal number of interaction strengths with a minimal number of perturbation experiments.
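To give a feel for the kind of computation a maximum-flow formulation involves, the following is a minimal Python sketch, not the IdentiFlow implementation. It counts the maximum number of vertex-disjoint directed paths from a set of perturbed nodes to a chosen set of target nodes, using unit node capacities and breadth-first augmenting paths. The function name `max_vertex_disjoint_paths`, the toy network, and the choice of source and target sets are all assumptions made for illustration; the exact identifiability criterion is defined in the paper and is not reproduced here.

```python
from collections import deque


def max_vertex_disjoint_paths(edges, sources, targets):
    """Maximum number of vertex-disjoint directed paths from sources to targets.

    Hypothetical helper for illustration only. Each node v is split into
    (v, "in") -> (v, "out") with capacity 1, so no node is used twice.
    """
    cap = {}  # residual capacities: cap[u][v]

    def add_edge(u, v):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + 1
        cap[v].setdefault(u, 0)  # reverse residual edge

    nodes = set(sources) | set(targets)
    for u, v in edges:
        nodes.update((u, v))

    super_source, super_sink = ("_", "source"), ("_", "sink")
    cap[super_source], cap[super_sink] = {}, {}

    for v in nodes:
        add_edge((v, "in"), (v, "out"))          # unit node capacity
    for u, v in edges:
        add_edge((u, "out"), (v, "in"))          # unit edge capacity
    for s in set(sources):
        add_edge(super_source, (s, "in"))
    for t in set(targets):
        add_edge((t, "out"), super_sink)

    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {super_source: None}
        queue = deque([super_source])
        while queue and super_sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if super_sink not in parent:
            return flow
        # All capacities are 1, so each augmenting path carries one unit.
        v = super_sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1


# Toy network (an assumption for this example): A -> B, A -> C, B -> D, C -> D.
toy_edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]

# Perturbing only A gives a single independent route into {B, C}.
flow_single = max_vertex_disjoint_paths(toy_edges, {"A"}, {"B", "C"})
# Perturbing A and B gives two independent routes into {B, C}.
flow_double = max_vertex_disjoint_paths(toy_edges, {"A", "B"}, {"B", "C"})
```

In this toy case, adding a second perturbation target raises the flow from one to two. This hints at how a flow value can serve as a score for comparing candidate perturbation sets, under the assumptions stated above.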
IdentiFlow is available at github.com/GrossTor/IdentiFlow.
Supplementary data are available at Bioinformatics online.