Gutiérrez Mlot Esteban Damián, Saldana Jose, Rodríguez Ricardo J, Kotsiuba Igor, Gañán Carlos
CIRCE Technology Center, Zaragoza, Spain.
Aragón Institute for Engineering Research, University of Zaragoza, Zaragoza, Spain.
Data Brief. 2024 Nov 20;57:111153. doi: 10.1016/j.dib.2024.111153. eCollection 2024 Dec.
The growing integration of Information and Communication Technology into Operational Technology environments in electrical substations exposes them to new cybersecurity threats. This paper presents a comprehensive dataset of substation traffic, aimed at improving the training and benchmarking of machine-learning-based Intrusion Detection Systems (IDS) installed in these facilities. The dataset includes raw network captures and flows from real substations, filtered and anonymized to ensure privacy. It covers the main protocols and standards used in substation environments: IEC61850, IEC104, NTP, and PTP. Additionally, the dataset includes traces captured during several cyberattacks simulated in a controlled laboratory environment, providing a rich resource for developing and testing machine learning models for cybersecurity applications in substations. A set of complementary tools for dataset creation and preprocessing is also included to standardize the methodology, ensuring consistency and reproducibility. In summary, the dataset addresses the critical need for high-quality, targeted data for tuning IDS at electrical substations and contributes to the advancement of secure and reliable power distribution networks.