Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI 48824, USA.
BEACON Center for the Study of Evolution in Action, Michigan State University, East Lansing, MI 48824, USA.
Philos Trans A Math Phys Eng Sci. 2022 Jul 11;380(2227):20210250. doi: 10.1098/rsta.2021.0250. Epub 2022 May 23.
The information content of symbolic sequences (such as nucleic or amino acid sequences, but also neuronal firings or strings of letters) can be calculated from an ensemble of such sequences. However, because information cannot be assigned to a single sequence, we cannot correlate information with other observables attached to that sequence. Here we show that information obtained from multivariate (multiple-variable) correlations within sequences of a 'training' ensemble can be used to predict observables of out-of-sample sequences, with an accuracy that scales with the complexity of the correlations. This shows that functional information emerges from a hierarchy of multi-variable correlations. This article is part of the theme issue 'Emergent phenomena in complex physical and socio-technical systems: from cells to societies'.
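As a concrete illustration of calculating information from an ensemble, the sketch below uses a common first-order estimate: sequence length minus the sum of per-site Shannon entropies, with entropies measured in units of log base alphabet size. It also computes the mutual information between two sites, the simplest of the multi-variable correlations mentioned above. This is a minimal sketch under stated assumptions, not necessarily the estimator used in the article. The function names and the toy ensemble are hypothetical, the sequences are assumed to be aligned and of equal length, and no finite-sample bias correction is applied.

```python
import math
from collections import Counter


def site_entropy(symbols, alphabet_size):
    """Shannon entropy of a list of symbols, in units of log base alphabet_size."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log(c / n, alphabet_size) for c in counts.values())


def information_content(seqs, alphabet_size):
    """First-order estimate: each site contributes 1 minus its entropy.

    Assumption: correlations between sites are ignored here, so this is
    only the lowest level of a correlation hierarchy.
    """
    length = len(seqs[0])
    columns = [[s[i] for s in seqs] for i in range(length)]
    return sum(1.0 - site_entropy(col, alphabet_size) for col in columns)


def pairwise_mutual_information(seqs, i, j, alphabet_size):
    """Mutual information between sites i and j: H(i) + H(j) - H(i, j)."""
    col_i = [s[i] for s in seqs]
    col_j = [s[j] for s in seqs]
    joint = list(zip(col_i, col_j))  # each pair is treated as one joint symbol
    return (
        site_entropy(col_i, alphabet_size)
        + site_entropy(col_j, alphabet_size)
        - site_entropy(joint, alphabet_size)
    )


# Hypothetical toy ensembles over the nucleotide alphabet (size 4).
independent = ["ACGT", "ACGA", "ACCT", "ACCA"]  # sites 2 and 3 vary independently
correlated = ["ACGT", "ACGT", "ACCA", "ACCA"]   # sites 2 and 3 vary together

info_independent = information_content(independent, 4)
mi_independent = pairwise_mutual_information(independent, 2, 3, 4)
mi_correlated = pairwise_mutual_information(correlated, 2, 3, 4)
```

In the two toy ensembles every site has the same marginal distribution, so the first-order estimate is identical for both. Only the pairwise term separates them, which is the kind of signal a purely per-site calculation cannot see.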