Costa Tatiane, Falcão Bruno, Mohamed Mohamed A, Annuk Andres, Marinho Manoel
Polytechnic School of Engineering (POLI-UPE), Postgraduate Program in Systems Engineering, University of Pernambuco (UPE), Recife, Brazil.
Department of Electrical Engineering, Faculty of Engineering, Minia University, Minia, 61519, Egypt.
Sci Rep. 2024 Oct 11;14(1):23801. doi: 10.1038/s41598-024-74342-3.
This research evaluates two machine learning algorithms, Random Forest and Gradient Boosting, for imputing missing data in solar energy generation databases, and examines how that imputation affects the sizing of green hydrogen production systems. The Random Forest model achieved the best prediction accuracy, with a mean absolute error (MAE) of 0.0364, a mean squared error (MSE) of 0.0097, a root mean squared error (RMSE) of 0.0985, and a coefficient of determination (R²) of 0.9779. These metrics surpass those obtained from baseline models, including linear regression and recurrent neural networks, and highlight the potential of accurate imputation to enhance the efficiency and output of renewable energy systems. The findings support integrating robust data imputation methods into the design and operation of photovoltaic systems, contributing to the reliability and sustainability of energy resource management. The study also compares the performance of traditional machine learning models in handling data gaps and emphasizes the practical implications of data imputation for optimizing hydrogen production systems. By providing a detailed analysis and validation of the imputation models, this work offers insights for future advancements in renewable energy technology.
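The four accuracy metrics reported in the abstract can be computed from held-out true values and their imputed counterparts. The following is a minimal sketch in plain Python, not the authors' code: the function name `imputation_metrics` and the sample values are hypothetical and serve only to illustrate the standard definitions of MAE, MSE, RMSE, and R².

```python
import math


def imputation_metrics(actual, imputed):
    """Return MAE, MSE, RMSE and R^2 for imputed values vs. held-out true values.

    Hypothetical helper for illustration; it follows the standard definitions.
    """
    if len(actual) != len(imputed) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, imputed)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the true values are constant")
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Made-up normalized PV output values (assumption, not data from the study).
actual = [0.0, 0.2, 0.5, 0.9, 0.4]
imputed = [0.1, 0.2, 0.4, 0.8, 0.5]
metrics = imputation_metrics(actual, imputed)
print(metrics)
```

As a consistency check on the reported figures, the square root of the reported MSE (0.0097) is approximately 0.0985, which matches the reported RMSE.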