Institute of Science and Technology Austria, A-3400 Klosterneuburg, Austria.
Inria Saclay - Ile-de-France, F-91120 Palaiseau, France.
PLoS Comput Biol. 2019 Sep 3;15(9):e1007290. doi: 10.1371/journal.pcbi.1007290. eCollection 2019 Sep.
Across diverse biological systems, ranging from neural networks to intracellular signaling and genetic regulatory networks, information about changes in the environment is frequently encoded in the full temporal dynamics of the network nodes. A pressing data-analysis challenge has thus been to efficiently estimate, from experimental data, the amount of information that these dynamics convey. Here we develop and evaluate decoding-based estimation methods that lower-bound the mutual information about a finite set of inputs encoded in single-cell, high-dimensional time series data. For biological reaction networks governed by the chemical master equation, we derive model-based information approximations and analytical upper bounds, against which we benchmark our proposed model-free decoding estimators. In contrast to the frequently used k-nearest-neighbor estimator, decoding-based estimators robustly extract a large fraction of the available information from high-dimensional trajectories with a realistic number of data samples. We apply these estimators to previously published data on Erk and Ca2+ signaling in mammalian cells and on the yeast stress response, and find that a substantial amount of information about the environmental state can be encoded by non-trivial response statistics, even in stationary signals. We argue that these single-cell, decoding-based information estimates, rather than the commonly used tests for significant differences between selected population response statistics, provide a proper and unbiased measure of the performance of biological signaling networks.
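As a rough illustration of the decoding-based idea described above, the following sketch trains a decoder on part of the data, decodes held-out trajectories, and computes the mutual information between true and decoded inputs from the resulting confusion matrix. By the data-processing inequality this quantity cannot exceed the information carried by the trajectories themselves, although the plug-in value computed from finite samples is only an estimate of it. The nearest-centroid decoder, the function names, and the train/test split are assumptions made for illustration and are not taken from the paper, which may use different decoders.

```python
import numpy as np

def mutual_information_bits(joint):
    """Mutual information (in bits) of a joint count or probability table."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()                # normalize to a probability table
    p_true = joint.sum(axis=1, keepdims=True)  # marginal over true inputs
    p_dec = joint.sum(axis=0, keepdims=True)   # marginal over decoded inputs
    mask = joint > 0                           # skip empty cells (0 * log 0 = 0)
    ratio = joint[mask] / (p_true @ p_dec)[mask]
    return float(np.sum(joint[mask] * np.log2(ratio)))

def decoding_information_estimate(trajectories, labels, n_inputs,
                                  train_fraction=0.5, seed=0):
    """Hypothetical helper: decode inputs from trajectories and return the
    information in the confusion matrix, plus the matrix itself.

    trajectories: array of shape (n_cells, n_timepoints)
    labels:       integer input index per cell, in 0 .. n_inputs - 1
    """
    trajectories = np.asarray(trajectories, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(labels))
    n_train = int(train_fraction * len(labels))
    train, test = order[:n_train], order[n_train:]

    # Illustrative decoder (an assumption): assign each held-out trajectory
    # to the input whose mean training trajectory is closest.
    centroids = np.stack([
        trajectories[train][labels[train] == u].mean(axis=0)
        for u in range(n_inputs)
    ])
    diffs = trajectories[test][:, None, :] - centroids[None, :, :]
    decoded = (diffs ** 2).sum(axis=2).argmin(axis=1)

    # Confusion matrix: rows are true inputs, columns are decoded inputs.
    confusion = np.zeros((n_inputs, n_inputs))
    np.add.at(confusion, (labels[test], decoded), 1)
    return mutual_information_bits(confusion), confusion

if __name__ == "__main__":
    # Synthetic example: two inputs producing well-separated noisy trajectories.
    rng = np.random.default_rng(1)
    x = np.concatenate([rng.normal(0.0, 1.0, (200, 10)),
                        rng.normal(5.0, 1.0, (200, 10))])
    y = np.repeat([0, 1], 200)
    info, confusion = decoding_information_estimate(x, y, n_inputs=2)
    print(f"decoding-based estimate: {info:.3f} bits")
```

Evaluating the decoder on held-out cells matters here: scoring it on its own training data would inflate the confusion-matrix information. A more flexible decoder can only tighten the bound, so the simple decoder used in this sketch should be read as a conservative placeholder.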