Galas David J, Sakhanenko Nikita A, Skupin Alexander, Ignac Tomasz
Pacific Northwest Diabetes Research Institute, Seattle, Washington.
J Comput Biol. 2014 Feb;21(2):118-40. doi: 10.1089/cmb.2013.0039. Epub 2013 Dec 30.
Context dependence is central to the description of complexity. Keying on the pairwise definition of "set complexity," we use an information theory approach to formulate general measures of systems complexity. We examine the properties of multivariable dependency starting with the concept of interaction information. We then present a new measure for unbiased detection of multivariable dependency, "differential interaction information." This quantity for two variables reduces to the pairwise "set complexity" previously proposed as a context-dependent measure of information in biological systems. We generalize it here to an arbitrary number of variables. Critical limiting properties of the "differential interaction information" are key to the generalization. This measure extends previous ideas about biological information and provides a more sophisticated basis for the study of complexity. The properties of "differential interaction information" also suggest new approaches to data analysis. Given a data set of system measurements, differential interaction information can provide a measure of collective dependence, which can be represented in hypergraphs describing complex system interaction patterns. We investigate this kind of analysis using simulated data sets. The conjoining of a generalized set complexity measure, multivariable dependency analysis, and hypergraphs is our central result. While our focus is on complex biological systems, our results are applicable to any complex system.
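As a rough illustration of the starting point named in the abstract, the sketch below estimates interaction information among discrete variables from samples by an alternating sum of joint entropies. It is not the paper's "differential interaction information," whose definition the abstract does not give. The function names `joint_entropy` and `co_information`, the toy XOR data set, and the sign convention (chosen so that the two-variable case equals the mutual information) are all assumptions made for this example; the paper may use the opposite sign.

```python
from collections import Counter
from itertools import combinations
from math import log2


def joint_entropy(samples, idx):
    """Plug-in Shannon entropy (bits) of the columns `idx` of `samples`."""
    counts = Counter(tuple(row[i] for i in idx) for row in samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def co_information(samples, idx):
    """Alternating sum of joint entropies over all non-empty subsets of `idx`.

    Assumed sign convention: subsets of odd size enter with +, even size
    with -, so for two variables this is H(X) + H(Y) - H(X, Y), the
    mutual information.
    """
    total = 0.0
    for k in range(1, len(idx) + 1):
        sign = 1 if k % 2 == 1 else -1
        for subset in combinations(idx, k):
            total += sign * joint_entropy(samples, subset)
    return total


# Hypothetical toy data: Z = X XOR Y, with X and Y uniform and independent.
xor_samples = [(x, y, x ^ y) for x in (0, 1) for y in (0, 1)]

pairwise = co_information(xor_samples, (0, 1))      # dependence between X and Y
three_way = co_information(xor_samples, (0, 1, 2))  # dependence among X, Y, Z
print(pairwise, three_way)
```

On this toy data every pair of variables looks independent, while the three-variable quantity is nonzero. That is the kind of collective dependence that pairwise measures miss and that a multivariable measure is meant to detect.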