Statistical and Biological Physics Research Group of the Hungarian Academy of Sciences, Eötvös University, Budapest, Hungary.
Proc Natl Acad Sci U S A. 2010 Apr 27;107(17):7640-5. doi: 10.1073/pnas.0912983107. Epub 2010 Apr 12.
We introduce a new approach to constructing networks with realistic features. Our method, in spite of its conceptual simplicity (it has only two parameters), is capable of generating a wide variety of network types with prescribed statistical properties, e.g., with degree or clustering coefficient distributions of very different forms. In turn, these graphs can be used to test hypotheses or as models of actual data. The method is based on a mapping between suitably chosen singular measures defined on the unit square and sparse infinite networks. Such a mapping has great potential for yielding graph-theoretical results for a variety of network topologies. The main idea of our approach is to take the infinite limit of the singular measure and of the size of the corresponding graph simultaneously. A unique feature of this construction is that, with increasing system size, the generated graphs become topologically more structured. We present analytic expressions, derived from the parameters of the initial generating measure (which is then iterated), for such major characteristics of graphs as their degree, clustering coefficient, and assortativity coefficient distributions. The optimal parameters of the generating measure are determined by a simple simulated annealing process. Thus, the present work provides a tool for researchers from a variety of fields (such as biology, computer science, or complex systems), enabling them to create a versatile model of their network data.
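The mapping from a measure on the unit square to a graph can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give these details, so the following are all assumptions: the generating measure is a symmetric m x m matrix `p` whose entries sum to 1, the unit interval is split into boxes of equal width, iteration is a k-fold tensor (Kronecker) product, and two nodes are linked with probability equal to the iterated measure at their coordinates. The names `iterate_measure` and `generate_graph` are hypothetical.

```python
import random


def iterate_measure(p, k):
    """Return the k-fold Kronecker power of the square matrix p.

    Assumption: p is a symmetric m x m list of lists with entries summing to 1.
    The result is an m**k x m**k matrix whose entries also sum to 1.
    """
    result = [[1.0]]
    for _ in range(k):
        # Kronecker product: every entry of `result` is multiplied by all of p.
        result = [
            [a * b for a in row_r for b in row_p]
            for row_r in result
            for row_p in p
        ]
    return result


def generate_graph(p, k, n_nodes, seed=None):
    """Sample an undirected graph from the iterated measure.

    Assumptions: nodes get uniform random coordinates in [0, 1), boxes have
    equal width 1 / m**k, and a pair is linked with probability equal to the
    iterated measure of the box pair that contains it.
    """
    rng = random.Random(seed)
    link_prob = iterate_measure(p, k)
    size = len(link_prob)
    # Map each node coordinate to the index of the box containing it.
    box = [min(int(rng.random() * size), size - 1) for _ in range(n_nodes)]
    edges = []
    for u in range(n_nodes):
        for v in range(u + 1, n_nodes):
            if rng.random() < link_prob[box[u]][box[v]]:
                edges.append((u, v))
    return edges


if __name__ == "__main__":
    # Hypothetical 2 x 2 generating measure (symmetric, entries sum to 1).
    p = [[0.5, 0.2], [0.2, 0.1]]
    edges = generate_graph(p, k=3, n_nodes=200, seed=0)
    print(len(edges), "edges sampled")
```

Under these assumptions the expected number of links is controlled jointly by the number of nodes and the number of iterations `k`, which is why the two would be increased together to keep the graph sparse. Fitting `p` to target statistics, for example by simulated annealing as the abstract mentions, is not shown here.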