Complex Systems Research Group and Centre for Complex Systems, Faculty of Engineering, The University of Sydney, Sydney, Australia.
School of Physics and EMBL Australia Node Single Molecule Science, School of Medical Sciences, The University of New South Wales, Sydney, Australia.
PLoS Comput Biol. 2021 Apr 19;17(4):e1008054. doi: 10.1371/journal.pcbi.1008054. eCollection 2021 Apr.
Transfer entropy (TE) is a widely used measure of directed information flows in many domains, including neuroscience. Many real-world time series for which we are interested in information flows come in the form of (near) instantaneous events occurring over time. Examples include the spiking of biological neurons, trades on stock markets and posts to social media, amongst myriad other systems involving events in continuous time throughout the natural and social sciences. However, the current approach to TE estimation on such event-based data, which discretises the time series into time bins, has severe limitations: it is not consistent, has high bias, converges slowly and cannot simultaneously capture relationships that occur with very fine time precision as well as those that occur over long time intervals. Building on recent work which derived a theoretical framework for TE in continuous time, we present an estimation framework for TE on event-based data and develop a k-nearest-neighbours estimator within this framework. This estimator is provably consistent, has favourable bias properties and, on synthetic examples, converges orders of magnitude more quickly than the current state-of-the-art in discrete-time estimation. We demonstrate failures of the traditionally used source-time-shift method for null surrogate generation. To overcome these failures, we develop a local permutation scheme for generating surrogate time series that conform to the appropriate null hypothesis. These surrogates allow us to test for the statistical significance of the TE and, as such, for conditional independence between the history of one point process and the updates of another. Our approach is shown to be capable of correctly rejecting or accepting the null hypothesis of conditional independence even in the presence of strong pairwise time-directed correlations.
This capacity to accurately test for conditional independence is further demonstrated on models of a spiking neural circuit inspired by the pyloric circuit of the crustacean stomatogastric ganglion, succeeding where previous related estimators have failed.
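For orientation, the discrete-time approach that the abstract contrasts itself with can be sketched as follows. This is a minimal illustration only, not the estimator developed in the paper: the function names `bin_events` and `plugin_te`, the binary "at least one event per bin" encoding and the equal history length `k` for source and target are all assumptions made for the example. It shows where the bin width enters. Fine bins need a long history `k` to cover long intervals, and the number of history states to be sampled then grows as 2^(2k).

```python
from collections import Counter

import numpy as np


def bin_events(event_times, bin_width, t_end):
    """Binary sequence: 1 where a time bin holds at least one event (assumed encoding)."""
    n_bins = int(np.ceil(t_end / bin_width))
    times = np.asarray(event_times, dtype=float)
    # Clip so that an event falling exactly at t_end lands in the last bin.
    idx = np.minimum((times / bin_width).astype(int), n_bins - 1)
    binned = np.zeros(n_bins, dtype=int)
    binned[idx] = 1
    return binned


def plugin_te(source, target, k=1):
    """Plug-in (histogram) TE in bits from source to target, history length k for both."""
    source = np.asarray(source).tolist()
    target = np.asarray(target).tolist()
    joint = Counter()
    for t in range(k, len(target)):
        x_past = tuple(target[t - k:t])
        y_past = tuple(source[t - k:t])
        joint[(target[t], x_past, y_past)] += 1

    # Marginal counts needed for the two conditional probabilities.
    n_xp_yp, n_xn_xp, n_xp = Counter(), Counter(), Counter()
    for (x_next, x_past, y_past), c in joint.items():
        n_xp_yp[(x_past, y_past)] += c
        n_xn_xp[(x_next, x_past)] += c
        n_xp[x_past] += c

    total = sum(joint.values())
    te = 0.0
    for (x_next, x_past, y_past), c in joint.items():
        p_given_both = c / n_xp_yp[(x_past, y_past)]
        p_given_self = n_xn_xp[(x_next, x_past)] / n_xp[x_past]
        te += (c / total) * np.log2(p_given_both / p_given_self)
    return te
```

Applied to two spike trains, one would call `bin_events` on each with a common `bin_width` and pass the results to `plugin_te`. Repeating this over a range of bin widths and history lengths gives a rough feel for the trade-off between time precision and history coverage described above.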