Nguyen Huu Nam, Tran Quoc Thanh, Ngo Canh Tung, Nguyen Duc Dam, Tran Van Quan
Institute for Hydropower and Renewable Energy, Vietnam Academy for Water Resources, Hanoi, Vietnam.
Hydraulic Construction Institute, Vietnam Academy for Water Resources, Hanoi, Vietnam.
PLoS One. 2025 Jan 2;20(1):e0315955. doi: 10.1371/journal.pone.0315955. eCollection 2025.
Solar energy generated from photovoltaic panels is an important energy source that benefits both people and the environment. Its use is growing globally, and it plays an increasingly important role in the future of the energy industry. However, its intermittent nature and its potential for use in distributed systems require accurate forecasting to balance supply and demand, optimize energy storage, and manage grid stability. In this study, five machine learning models were used: Gradient Boosting Regressor (GB), XGB Regressor (XGBoost), K-neighbors Regressor (KNN), LGBM Regressor (LightGBM), and CatBoost Regressor (CatBoost). Using a dataset of 21045 samples, six factors (humidity, ambient temperature, wind speed, visibility, cloud ceiling, and pressure) serve as inputs for constructing these models to forecast solar energy. Model accuracy is assessed and compared using the coefficient of determination (R2), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). The results show that the CatBoost model performs best in predicting solar energy, with training values of R2 = 0.608, RMSE = 4.478 W, and MAE = 3.367 W, and testing values of R2 = 0.46, RMSE = 4.748 W, and MAE = 3.583 W. SHAP analysis reveals that ambient temperature and humidity have the greatest influence on the solar energy generated by photovoltaic panels.
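For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how R2, RMSE, and MAE are conventionally computed from observed and predicted values. It is not the authors' code: the function names and the four toy readings are assumptions made for illustration, and the numbers it prints are unrelated to the study's results.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical readings in watts (illustrative only, not data from the study).
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 11.0, 15.0, 15.0]

print(f"MAE  = {mae(y_true, y_pred):.3f} W")
print(f"RMSE = {rmse(y_true, y_pred):.3f} W")
print(f"R2   = {r2(y_true, y_pred):.3f}")
```

RMSE and MAE carry the unit of the target (W in the abstract), while R2 is dimensionless. Comparing each metric between the training and testing sets, as the abstract does, indicates how well a model generalizes to unseen data.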