Zhao Ying, Zou Xianzhe, Wang Xiao, Yang Zhanpeng, Wang Xuan, Zhao Xin, Zhang Ning, Huang Xin, Zhou Fangfang
School of Computer Science and Engineering, Central South University, Changsha, China.
Qi An Xin Technology Group Inc., Layer Platform, Beijing, China.
Sci Data. 2025 May 28;12(1):898. doi: 10.1038/s41597-025-05077-7.
Graph-related technologies are continually evolving across application domains such as social networks, transportation systems, and bioinformatics. Advancing these technologies often relies on high-quality graph datasets for validating performance characteristics such as scalability and time/space complexity. However, existing datasets are typically categorized by domain or type; they lack an explicit organization by scale and do not cover a wide range of scale levels, which may hinder comprehensive performance validation. This paper introduces OGDOS, an open graph dataset organized by scales. The dataset encompasses 470 preset scale levels, covering node counts from 100 to 200,000 and edge-to-node ratios from 1 to 10. It combines scale-aligned real-world graphs and synthetic graphs, offering a versatile resource for evaluating various graph-related technologies. This paper also presents the construction process of OGDOS, provides a technical validation, and discusses the dataset's limitations.
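To make the two scale dimensions named in the abstract concrete, the following minimal sketch computes a graph's node count and edge-to-node ratio from an edge list and checks whether they fall inside the stated ranges (100 to 200,000 nodes, ratios from 1 to 10). It is an illustration only and not the OGDOS tooling: the function names `graph_scale` and `within_ogdos_ranges` are hypothetical, the treatment of edges as undirected with self-loops and duplicates ignored is an assumption, and the spacing of the 470 preset levels within these ranges is not given in the abstract, so it is not modeled.

```python
from typing import Hashable, Iterable, Tuple

# Ranges quoted in the abstract. How the 470 preset levels are spaced
# inside these ranges is not stated there, so it is not modeled here.
NODE_RANGE = (100, 200_000)
RATIO_RANGE = (1.0, 10.0)


def graph_scale(edges: Iterable[Tuple[Hashable, Hashable]]) -> Tuple[int, int, float]:
    """Return (node count, edge count, edge-to-node ratio) for an edge list."""
    nodes = set()
    unique_edges = set()
    for u, v in edges:
        nodes.add(u)
        nodes.add(v)
        if u != v:  # assumption: self-loops are not counted as edges
            # assumption: edges are undirected and duplicates count once
            unique_edges.add(frozenset((u, v)))
    n = len(nodes)
    m = len(unique_edges)
    ratio = m / n if n else 0.0
    return n, m, ratio


def within_ogdos_ranges(n: int, ratio: float) -> bool:
    """Check a graph's scale against the ranges stated in the abstract."""
    return (NODE_RANGE[0] <= n <= NODE_RANGE[1]
            and RATIO_RANGE[0] <= ratio <= RATIO_RANGE[1])


# Usage: a ring of 100 nodes has 100 edges, so its ratio is exactly 1.
ring = [(i, (i + 1) % 100) for i in range(100)]
n, m, ratio = graph_scale(ring)
print(n, m, ratio, within_ogdos_ranges(n, ratio))
```

A check like this could serve as a first filter when deciding whether an external graph is comparable in scale to the dataset; matching it to a specific preset level would require the level definitions from the paper itself.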