Piasini Eugenio, Filipowicz Alexandre L S, Levine Jonathan, Gold Joshua I
Computational Neuroscience Initiative and Department of Physics and Astronomy, University of Pennsylvania.
Toyota Research Institute.
J Open Res Softw. 2021;9(1). doi: 10.5334/jors.322. Epub 2021 May 31.
We present embo, a Python package to analyze empirical data using the Information Bottleneck (IB) method and its variants, such as the Deterministic Information Bottleneck (DIB). Given two random variables X and Y, the IB finds the stochastic mapping M of X that encodes the most information about Y, subject to a constraint on the information that M is allowed to retain about X. Despite the popularity of the IB, an accessible implementation of the reference algorithm, oriented towards ease of use on empirical data, was missing. Embo is optimized for the common case of discrete, low-dimensional data. It is fast, provides a standard data-processing pipeline, offers a parallel implementation of key computational steps, and includes reasonable defaults for the method parameters. Embo is broadly applicable to different problem domains, as it can be employed with any dataset consisting of joint observations of two discrete variables. It is available from the Python Package Index (PyPI), Zenodo and GitLab.
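To make the trade-off described above concrete, the following minimal sketch estimates the two quantities the IB balances, the information a compressed variable M retains about X and the information it carries about Y, from joint observations of discrete variables. It does not use embo and does not reproduce its API or its algorithm: the function name mutual_information, the toy data, and the hand-picked deterministic mapping M = X mod 2 are all assumptions made for illustration, and the estimator is the simple plug-in (empirical frequency) one.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete observations.

    Hypothetical helper for illustration only; not part of embo.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each observed (x, y) pair
    px = Counter(xs)               # marginal counts of x
    py = Counter(ys)               # marginal counts of y
    # Sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Toy dataset (assumed): X is uniform over four symbols, Y is the parity of X.
x = [0, 1, 2, 3] * 25
y = [v % 2 for v in x]

# A hand-picked deterministic compression M = f(X) that keeps only the parity.
m = [v % 2 for v in x]

i_xy = mutual_information(x, y)  # information about Y available in X
i_mx = mutual_information(m, x)  # information M retains about X (the "cost")
i_my = mutual_information(m, y)  # information M carries about Y (the "relevance")

print(f"I(X;Y) = {i_xy:.3f} bits")
print(f"I(M;X) = {i_mx:.3f} bits")
print(f"I(M;Y) = {i_my:.3f} bits")
```

In this toy case X carries 2 bits of entropy, while M retains only 1 bit about X and still preserves all of the information that X holds about Y. The IB method searches for mappings with this kind of favorable trade-off rather than requiring one to be chosen by hand.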