Biological Information as Set-Based Complexity.

Author information

Galas David J, Nykter Matti, Carter Gregory W, Price Nathan D, Shmulevich Ilya

Affiliations

Institute for Systems Biology, Seattle, WA, USA; Battelle Memorial Institute, Columbus, OH, USA.

Institute for Systems Biology, Seattle, WA, USA; Institute of Signal Processing, Tampere University of Technology, Tampere, Finland.

Publication information

IEEE Trans Inf Theory. 2010 Feb;56(2):667-677. doi: 10.1109/TIT.2009.2037046. Epub 2010 Feb 25.

Abstract

It is not obvious what fraction of all the potential information residing in the molecules and structures of living systems is significant or meaningful to the system. Sets of random sequences or identically repeated sequences, for example, would be expected to contribute little or no useful information to a cell. This issue of quantitation of information is important since the ebb and flow of biologically significant information is essential to our quantitative understanding of biological function and evolution. Motivated specifically by these problems of biological information, we propose here a class of measures to quantify the contextual nature of the information in sets of objects, based on Kolmogorov's intrinsic complexity. Such measures discount both random and redundant information and are inherent in that they do not require a defined state space to quantify the information. The maximization of this new measure, which can be formulated in terms of the universal information distance, appears to have several useful and interesting properties, some of which we illustrate with examples.
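As a rough illustration of a measure that discounts both random and redundant information, the sketch below scores a set of byte strings. The abstract does not state the formula, so everything here is an assumption made for illustration: the pairwise form Psi(S) = 1/(N(N-1)) * sum_i K(s_i) * sum_{j != i} d_ij (1 - d_ij), the use of zlib-compressed length as a computable stand-in for Kolmogorov complexity K, and the use of the normalized compression distance as a stand-in for the universal information distance d. The names `compressed_len`, `ncd`, `set_complexity`, and `random_bytes` are hypothetical helpers.

```python
import random
import zlib
from typing import Sequence


def compressed_len(data: bytes) -> int:
    """Compressed size in bytes: a crude, computable proxy for K(x) (assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance, a proxy for the information distance."""
    cx, cy = compressed_len(x), compressed_len(y)
    return (compressed_len(x + y) - min(cx, cy)) / max(cx, cy)


def set_complexity(strings: Sequence[bytes]) -> float:
    """Assumed pairwise form: mean over i of K(s_i) * sum_{j != i} d_ij * (1 - d_ij)."""
    n = len(strings)
    if n < 2:
        raise ValueError("need at least two strings")
    total = 0.0
    for i, s in enumerate(strings):
        pair_sum = 0.0
        for j, t in enumerate(strings):
            if i == j:
                continue
            # Real compressors can push NCD slightly outside [0, 1]; clamp it.
            d = min(max(ncd(s, t), 0.0), 1.0)
            # d * (1 - d) vanishes for d near 0 (redundant) and d near 1 (unrelated).
            pair_sum += d * (1.0 - d)
        total += compressed_len(s) * pair_sum
    return total / (n * (n - 1))


def random_bytes(rng: random.Random, n: int) -> bytes:
    """n pseudo-random bytes from a seeded generator, for reproducibility."""
    return bytes(rng.getrandbits(8) for _ in range(n))


rng = random.Random(0)
base = random_bytes(rng, 2000)

sets = {
    "identical": [base] * 4,                                   # fully redundant
    "unrelated": [random_bytes(rng, 2000) for _ in range(4)],  # mutually random
    "related": [base[:1000] + random_bytes(rng, 1000) for _ in range(4)],  # half shared
}
scores = {name: set_complexity(members) for name, members in sets.items()}

for name, score in scores.items():
    print(f"{name}: {score:.1f}")
```

Under these assumptions, the set whose members share half of their content should score highest, since its pairwise distances sit near the middle of the range, while the identical and unrelated sets are penalized by the d(1 - d) factor. Exact values depend on the compressor and on string length, so the numbers are only indicative.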

Similar articles

5
A probabilistic framework for identifying biosignatures using Pathway Complexity.
Philos Trans A Math Phys Eng Sci. 2017 Dec 28;375(2109). doi: 10.1098/rsta.2016.0342.
7
A modular hierarchy-based theory of the chemical origins of life based on molecular complementarity.
Acc Chem Res. 2012 Dec 18;45(12):2169-77. doi: 10.1021/ar200209k. Epub 2012 Feb 27.
8
Comparison study on k-word statistical measures for protein: from sequence to 'sequence space'.
BMC Bioinformatics. 2008 Sep 23;9:394. doi: 10.1186/1471-2105-9-394.

Cited by

1
LARGE-SCALE MULTIPLE INFERENCE OF COLLECTIVE DEPENDENCE WITH APPLICATIONS TO PROTEIN FUNCTION.
Ann Appl Stat. 2021 Jun;15(2):902-924. doi: 10.1214/20-aoas1431. Epub 2021 Jul 12.
2
Toward an Information Theory of Quantitative Genetics.
J Comput Biol. 2021 Jun;28(6):527-559. doi: 10.1089/cmb.2020.0032. Epub 2020 Dec 31.
3
Symmetries among Multivariate Information Measures Explored Using Möbius Operators.
Entropy (Basel). 2019 Jan 18;21(1):88. doi: 10.3390/e21010088.
5
Multivariate Analysis of Data Sets with Missing Values: An Information Theory-Based Reliability Function.
J Comput Biol. 2019 Feb;26(2):152-171. doi: 10.1089/cmb.2018.0179. Epub 2018 Nov 29.
6
The Information Content of Discrete Functions and Their Application in Genetic Data Analysis.
J Comput Biol. 2017 Dec;24(12):1153-1178. doi: 10.1089/cmb.2017.0143. Epub 2017 Oct 13.
7
Biological data analysis as an information theory problem: multivariable dependence measures and the shadows algorithm.
J Comput Biol. 2015 Nov;22(11):1005-24. doi: 10.1089/cmb.2015.0051. Epub 2015 Sep 3.
8
Maximizing Kolmogorov Complexity for accurate and robust bright field cell segmentation.
BMC Bioinformatics. 2014 Jan 30;15:32. doi: 10.1186/1471-2105-15-32.
9
Describing the complexity of systems: multivariable "set complexity" and the information basis of systems biology.
J Comput Biol. 2014 Feb;21(2):118-40. doi: 10.1089/cmb.2013.0039. Epub 2013 Dec 30.
10
Information theory applications for biological sequence analysis.
Brief Bioinform. 2014 May;15(3):376-89. doi: 10.1093/bib/bbt068. Epub 2013 Sep 20.

References

2
Dynamical analysis of a generic Boolean model for the control of the mammalian cell cycle.
Bioinformatics. 2006 Jul 15;22(14):e124-31. doi: 10.1093/bioinformatics/btl210.
3
Perturbation avalanches and criticality in gene regulatory networks.
J Theor Biol. 2006 Sep 7;242(1):164-70. doi: 10.1016/j.jtbi.2006.02.011. Epub 2006 Mar 30.
4
Eukaryotic cells are dynamically ordered or critical but not chaotic.
Proc Natl Acad Sci U S A. 2005 Sep 20;102(38):13439-44. doi: 10.1073/pnas.0506771102. Epub 2005 Sep 9.
6
Activities and sensitivities in boolean network models.
Phys Rev Lett. 2004 Jul 23;93(4):048701. doi: 10.1103/PhysRevLett.93.048701. Epub 2004 Jul 22.
7
The yeast cell-cycle network is robustly designed.
Proc Natl Acad Sci U S A. 2004 Apr 6;101(14):4781-6. doi: 10.1073/pnas.0305937101. Epub 2004 Mar 22.
8
Genetic network models and statistical properties of gene expression data in knock-out experiments.
J Theor Biol. 2004 Mar 7;227(1):149-57. doi: 10.1016/j.jtbi.2003.10.018.
10
The digital code of DNA.
Nature. 2003 Jan 23;421(6921):444-8. doi: 10.1038/nature01410.
