Zheng Xiangtian, Xu Nan, Trinh Loc, Wu Dongqi, Huang Tong, Sivaranjani S, Liu Yan, Xie Le
Texas A&M University, Department of Electrical and Computer Engineering, College Station, 77840, USA.
University of Southern California, Computer Science Department, Los Angeles, 90007, USA.
Sci Data. 2022 Jun 22;9(1):359. doi: 10.1038/s41597-022-01455-7.
The electric grid is a key enabling infrastructure for the ambitious transition towards carbon neutrality as we grapple with climate change. With deepening penetration of renewable resources, reliable operation of the electric grid becomes increasingly challenging. In this paper, we present PSML, a first-of-its-kind open-access multi-scale time-series dataset, to aid the development of data-driven machine learning (ML) approaches for reliable operation of future electric grids. The dataset is synthesized from a joint transmission and distribution electric grid to capture the increasingly important interactions and uncertainties of grid dynamics, and it contains power, voltage and current measurements over multiple spatio-temporal scales. Using PSML, we provide state-of-the-art ML benchmarks on three challenging use cases of critical importance: (i) early detection, accurate classification and localization of dynamic disturbances; (ii) robust hierarchical forecasting of load and renewable energy; and (iii) realistic synthetic generation of physical-law-constrained measurements. We envision that this dataset will support use-inspired ML research in safety-critical systems, while simultaneously enabling ML researchers to contribute towards decarbonization of energy sectors.