De Martino Andrea, De Martino Daniele
Soft & Living Matter Lab, Institute of Nanotechnology (NANOTEC), Consiglio Nazionale delle Ricerche, Rome, Italy.
Italian Institute for Genomic Medicine (IIGM), Turin, Italy.
Heliyon. 2018 Apr 13;4(4):e00596. doi: 10.1016/j.heliyon.2018.e00596. eCollection 2018 Apr.
A cornerstone of statistical inference, the maximum entropy framework is increasingly applied to construct descriptive and predictive models of biological systems, especially complex biological networks, from large experimental data sets. Both its broad applicability and the success it has achieved in different contexts hinge upon its conceptual simplicity and mathematical soundness. Here we concisely review the basic elements of the maximum entropy principle, starting from the notion of 'entropy', and describe its usefulness for the analysis of biological systems. As examples, we focus specifically on the problem of reconstructing gene interaction networks from expression data and on recent work attempting to expand our system-level understanding of bacterial metabolism. Finally, we highlight some extensions and potential limitations of the maximum entropy approach, and point to more recent developments that are likely to play a key role in the upcoming challenges of extracting structure and information from increasingly rich, high-throughput biological data.