Barcelona Supercomputing Center, Barcelona 08034, Spain.
Universitat de Barcelona, Barcelona 08007, Spain.
Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae650.
Spatial Analysis of Functional Enrichment (SAFE) is a popular tool for biologists to investigate the functional organization of biological networks via highly intuitive 2D functional maps. To create these maps, SAFE uses Spring embedding to project a given network into a 2D space in which nodes that are connected in the network lie near each other. However, many biological networks are scale-free, containing highly connected hub nodes. Because Spring embedding fails to separate hub nodes, it produces uninformative embeddings that resemble a 'hairball'. In addition, Spring embedding only captures direct node connectivity in the network and does not consider higher-order node wiring patterns, which are best captured by graphlets: small, connected, nonisomorphic, induced subgraphs. The scale-free structure of biological networks is hypothesized to stem from an underlying low-dimensional hyperbolic geometry, which novel hyperbolic embedding methods try to uncover. These include coalescent embedding, which projects a network onto a 2D disk.
To better capture the functional organization of scale-free biological networks, while also going beyond simple direct connectivity patterns, we introduce Graphlet Coalescent (GraCoal) embedding, which embeds nodes close together on a disk if they frequently co-occur on a given graphlet. We use GraCoal to extend SAFE-based network analysis. Through SAFE-enabled enrichment analysis, we show that GraCoal outperforms graphlet-based Spring embedding in capturing the functional organization of the genetic interaction networks of fruit fly, budding yeast, fission yeast and Escherichia coli. We also show that, depending on the underlying graphlet, GraCoal embeddings capture different topology-function relationships; in particular, triangle-based GraCoal embedding captures functional redundancies between paralogs.
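To make the idea of nodes "co-occurring on a graphlet" concrete, the following is a minimal sketch for the triangle graphlet only. It counts, for each connected node pair, how many triangles contain both nodes. The function name `triangle_cooccurrence` and the toy edge list are illustrative assumptions, not part of the published method; the GraCoal definition of graphlet-based adjacency and the coalescent embedding step itself are not reproduced here.

```python
from itertools import combinations


def triangle_cooccurrence(edges):
    """Count, for every adjacent node pair, the triangles containing both nodes.

    Hypothetical helper for illustration only; not the authors' implementation.
    """
    # Build an undirected adjacency map, ignoring self-loops.
    neighbours = {}
    for u, v in edges:
        if u == v:
            continue
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    counts = {}
    for u, v in combinations(sorted(neighbours), 2):
        # Two nodes can only share a triangle if they are directly connected.
        if v in neighbours[u]:
            # Each common neighbour closes exactly one triangle on the edge (u, v).
            shared = len(neighbours[u] & neighbours[v])
            if shared:
                counts[(u, v)] = shared
    return counts


# Toy network (assumed data): two triangles, a-b-c and b-c-d, sharing edge b-c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("b", "d")]
weights = triangle_cooccurrence(toy_edges)
```

In this toy network the pair `("b", "c")` lies on both triangles and therefore receives the largest weight, while `("a", "d")` never co-occurs on a triangle and is absent from the result. A weighted matrix of this kind is the sort of input one could hand to an embedding method in place of plain adjacency.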