Department of Biology, Georgetown University, 20057 Washington DC, USA.
BMC Bioinformatics. 2014 Jun 25;15:220. doi: 10.1186/1471-2105-15-220.
Community structure is ubiquitous in biological networks. There has been an increased interest in unraveling the community structure of biological systems as it may provide important insights into a system's functional components and the impact of local structures on dynamics at a global scale. Choosing an appropriate community detection algorithm to identify the community structure in an empirical network can be difficult, however, as the many algorithms available are based on a variety of cost functions and are difficult to validate. Even when community structure is identified in an empirical system, disentangling the effect of community structure from other network properties such as clustering coefficient and assortativity can be a challenge.
Here, we develop a generative model to produce undirected, simple, connected graphs with specified degrees and a specified pattern of communities, while maintaining a graph structure that is as random as possible. Additionally, we demonstrate two important applications of our model: (a) to generate networks that can be used to benchmark existing and new algorithms for detecting communities in biological networks; and (b) to generate null models that serve as random controls when investigating whether complex network features in empirical biological networks have an impact beyond what arises as a byproduct of degree and modularity.
Our model allows for the systematic study of the presence of community structure and its impact on network function and dynamics. This process is a crucial step in unraveling the functional consequences of the structural properties of biological systems and uncovering the mechanisms that drive these systems.