Institut Pasteur, Université Paris Cité, CNRS UMR 3751, Decision and Bayesian Computation, Paris, France.
Epiméthée, Inria, Paris, France.
PLoS Comput Biol. 2024 Oct 10;20(10):e1012460. doi: 10.1371/journal.pcbi.1012460. eCollection 2024 Oct.
Physical and functional constraints on biological networks lead to complex topological patterns across multiple scales in their organization. One type of higher-order network feature that has received considerable interest is the network motif, defined as a statistically regular subgraph. Motifs may implement fundamental logical and computational circuits and are referred to as "building blocks of complex networks". Their well-defined structures and small sizes also make it possible to test their functions in synthetic and natural biological experiments. Here, we develop a framework for motif mining based on lossless network compression using subgraph contractions. This provides an alternative definition of motif significance, which allows us to compare different motifs and to select the collectively most significant set of motifs, together with other prominent network features, in terms of their combined compression of the network. Our approach inherently accounts for multiple testing and for correlations between subgraphs, and it does not rely on a priori specification of an appropriate null model. It thus overcomes common problems in hypothesis testing-based motif analysis and guarantees robust statistical inference. We validate our methodology on numerical data and then apply it to synaptic-resolution biological neural networks as a medium for comparative connectomics, evaluating their respective compressibility and characterizing their inferred circuit motifs.
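To make the notion of compressibility concrete, the following minimal sketch compares two ways of losslessly describing a simple directed graph: a raw adjacency matrix, and a two-part code that first states the number of edges and then identifies which edge set is realized. This is not the motif-based encoding described in the abstract; it only illustrates a baseline codelength of the kind against which such an encoding could be compared. The function names and the choice of a uniform code over edge sets are assumptions made for illustration.

```python
from math import comb, log2


def naive_bits(n_nodes: int) -> float:
    """Bits needed to store a simple directed graph (no self-loops)
    as a raw adjacency matrix: one bit per ordered node pair."""
    return float(n_nodes * (n_nodes - 1))


def uniform_code_bits(n_nodes: int, n_edges: int) -> float:
    """Two-part codelength (illustrative assumption, not the paper's code):
    first the edge count E, then which of the C(N(N-1), E) equally
    likely edge sets is the observed one."""
    n_pairs = n_nodes * (n_nodes - 1)
    bits_for_edge_count = log2(n_pairs + 1)          # E ranges over 0..n_pairs
    bits_for_edge_set = log2(comb(n_pairs, n_edges))  # uniform over edge sets
    return bits_for_edge_count + bits_for_edge_set


def compression_gain(n_nodes: int, n_edges: int) -> float:
    """Positive when the two-part code is shorter than the raw matrix."""
    return naive_bits(n_nodes) - uniform_code_bits(n_nodes, n_edges)


if __name__ == "__main__":
    # A sparse graph compresses well; a half-dense graph does not.
    print(compression_gain(100, 300))
    print(compression_gain(100, 4950))
```

Under this sketch, a sparse graph yields a positive gain, while a graph with half of all possible edges yields a slightly negative one, since the cost of stating the edge count is not recovered. In a compression-based view of significance, a candidate structure is only worth reporting if including it shortens the total description.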