Cecílio José, Rodrigues Tiago, Barros Márcia, Oliveira de Sá Alan
LASIGE, University of Lisbon, Lisbon, Portugal.
Sci Data. 2025 Mar 22;12(1):479. doi: 10.1038/s41597-025-04750-1.
This paper presents a novel and extensive dataset (the SHEERM dataset) of cross-sectional data from 13 households, covering nearly three years of electrical load, energy cost, and on-premises solar energy production directly linked to solar irradiation and weather parameters. The dataset is essential for understanding and optimizing energy utilization to achieve Sustainable Development Goals (SDGs) 7, 9, 11, and 13. It provides data on solar energy production, weather conditions, residential energy needs, and market prices. Combining these variables facilitates multifaceted analysis, fostering advances in renewable energy forecasting, climate-sensitive environments, grid management, and energy policy formulation. The paper details the data collection process, including the sources and methodologies employed. Following established literature, we developed and implemented machine learning models that comprehensively validate the data. Furthermore, as usage notes, we offer additional results obtained by applying machine learning approaches to the provided data. The dataset aims to help design new energy systems that enhance sustainable energy strategies and to demonstrate their potential to accelerate the transition toward renewable energy and carbon neutrality.