Zenil Hector, Kiani Narsis A, Tegnér Jesper
Unit of Computational Medicine, Department of Medicine, Karolinska Institute & Center for Molecular Medicine, Karolinska University Hospital, Stockholm, Sweden.
Semin Cell Dev Biol. 2016 Mar;51:32-43. doi: 10.1016/j.semcdb.2016.01.011. Epub 2016 Jan 21.
We survey and introduce concepts and tools located at the intersection of information theory and network biology. We show that Shannon's information entropy, compressibility and algorithmic complexity quantify different local and global aspects of synthetic and biological data. We show examples such as the emergence of giant components in Erdős-Rényi random graphs, and the recovery of topological properties from numerical kinetic properties simulating gene expression data. We provide exact theoretical calculations, numerical approximations and error estimations of entropy, algorithmic probability and Kolmogorov complexity for different types of graphs, characterizing their variant and invariant properties. We introduce formal definitions of complexity for both labeled and unlabeled graphs and prove that the Kolmogorov complexity of a labeled graph is a good approximation of its unlabeled Kolmogorov complexity and thus a robust definition of graph complexity.
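As a rough illustration of the quantities named in the abstract, the following sketch samples Erdős-Rényi graphs, measures the fraction of nodes in the largest connected component, and computes two simple descriptors of the adjacency matrix: the Shannon entropy of its 0/1 entries and its compressed size. This is not the authors' code. All function names are hypothetical, the entropy is taken over edge/non-edge frequencies only (one of several possible choices), and zlib is an assumed stand-in compressor that gives only a crude upper-bound proxy for Kolmogorov complexity.

```python
import math
import random
import zlib
from collections import deque


def erdos_renyi(n, p, rng):
    """Sample a G(n, p) graph; returns adjacency lists (hypothetical helper)."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def largest_component_fraction(adj):
    """Fraction of nodes in the largest connected component, found by BFS."""
    n = len(adj)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        best = max(best, size)
    return best / n


def upper_triangle_bits(adj):
    """Flatten the upper triangle of the adjacency matrix into a 0/1 list."""
    n = len(adj)
    neighbors = [set(a) for a in adj]
    return [1 if j in neighbors[i] else 0
            for i in range(n) for j in range(i + 1, n)]


def shannon_entropy(bits):
    """Entropy (bits per entry) of the edge/non-edge frequency distribution."""
    if not bits:
        return 0.0
    p1 = sum(bits) / len(bits)
    return -sum(q * math.log2(q) for q in (p1, 1.0 - p1) if q > 0)


def compressed_size(bits):
    """Byte length after zlib compression: an assumed, crude complexity proxy."""
    packed = bytearray()
    for k in range(0, len(bits), 8):
        chunk = bits[k:k + 8]
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        byte <<= 8 - len(chunk)  # zero-pad the final partial byte
        packed.append(byte)
    return len(zlib.compress(bytes(packed), 9))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 500
    # Sweep the mean degree c = n * p across the threshold c = 1.
    for c in (0.3, 0.8, 1.2, 2.0, 3.0):
        g = erdos_renyi(n, c / n, rng)
        bits = upper_triangle_bits(g)
        print(c, largest_component_fraction(g),
              shannon_entropy(bits), compressed_size(bits))
```

Running the sweep should show the largest component growing from a small fraction of the nodes to most of the graph as the mean degree passes 1, which is the giant-component transition mentioned above. Entropy here depends only on edge density, whereas the compressed size can also respond to regularities in how edges are arranged, which hints at why the two measures capture different aspects of a graph.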