School of Software Engineering, Beijing Jiaotong University, Beijing, 100044, China.
National Research Center of Railway Safety Assessment, Beijing Jiaotong University, Beijing, 100044, China.
Sci Data. 2022 May 27;9(1):244. doi: 10.1038/s41597-022-01349-8.
High-speed train operation data are a reliable and rich resource for data-driven research. However, the data released by railway companies are poorly organized and not comprehensive enough to be used directly and effectively, and a public high-speed railway network dataset suitable for research is still lacking. To support research on large-scale complex networks, complex dynamic systems and intelligent transportation, we develop a high-speed railway network dataset containing train operation data in different directions from October 8, 2019 to January 27, 2020, train delay data for the railway stations, junction station data, and mileage data for adjacent stations. The dataset also records weather, temperature, wind power and major holidays as factors affecting train operation. Its potential research uses include, but are not limited to, pattern mining in complex dynamic systems, community detection and discovery, and train delay analysis. The dataset can also be used to address railway operation and management problems such as passenger service network improvement, real-time train dispatching and intelligent driving assistance.
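As a rough illustration of how the adjacent-station mileage data mentioned in the abstract might be treated as a network, the sketch below builds an undirected weighted graph and finds the shortest mileage between two stations with Dijkstra's algorithm. The record layout `(station_a, station_b, km)`, the station names and the distances are all hypothetical placeholders; they are not taken from the dataset, whose actual file format is not described here.

```python
import heapq
from collections import defaultdict

# Hypothetical adjacent-station mileage records: (station_a, station_b, km).
# Station names and distances are placeholders, NOT values from the dataset.
ADJACENT_MILEAGE = [
    ("A", "B", 120.0),
    ("B", "C", 80.0),
    ("A", "C", 250.0),
    ("C", "D", 60.0),
]


def build_network(records):
    """Build an undirected weighted adjacency map from mileage records."""
    graph = defaultdict(dict)
    for a, b, km in records:
        graph[a][b] = km
        graph[b][a] = km
    return graph


def shortest_mileage(graph, source, target):
    """Dijkstra's algorithm; returns total km, or None if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, km in graph[node].items():
            candidate = dist + km
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return None


network = build_network(ADJACENT_MILEAGE)
print(shortest_mileage(network, "A", "D"))  # 260.0 via A -> B -> C -> D
```

The same adjacency map could presumably serve as a starting point for the community detection or delay propagation studies the abstract lists, with delay or weather attributes attached to nodes and edges once the real schema is known.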