Learning model combined with data clustering and dimensionality reduction for short-term electricity load forecasting.

Authors

Bae Hyun-Jung, Park Jong-Seong, Choi Ji-Hyeok, Kwon Hyuk-Yoon

Affiliations

Graduate School of Data Science, Seoul National University of Science and Technology, Seoul, South Korea.

Department of Industrial Engineering, Seoul National University of Science and Technology, Seoul, South Korea.

Publication information

Sci Rep. 2025 Jan 28;15(1):3575. doi: 10.1038/s41598-025-86982-0.

Abstract

Electric load forecasting is crucial to the planning and operation of electric power companies. It has evolved from statistical methods to artificial intelligence-based techniques that use machine learning models. In this study, we investigate short-term load forecasting (STLF) for large-scale electricity usage datasets. We propose a new prediction model for STLF that combines data clustering and dimensionality reduction to handle large-scale electricity usage data effectively. We adopt k-means for data clustering and, for dimensionality reduction, kernel principal component analysis (kernel PCA), uniform manifold approximation and projection (UMAP), and t-distributed stochastic neighbor embedding (t-SNE). To verify the effectiveness of the proposed model, we apply it to a range of neural network-based models and compare its performance against baseline methods using actual electricity usage data from 4710 households. Experimental results demonstrate that data clustering combined with dimensionality reduction improves the performance of the baseline models. Measured by mean absolute percentage error (MAPE), the proposed method outperforms existing methods by a factor of 1.01 to 1.76 on summer data and 1.03 to 1.36 on winter data.
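The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea, under several assumptions: daily load profiles are reduced to a few dimensions, households are grouped with k-means in the reduced space, and a separate forecast is produced per cluster. Linear PCA (via power iteration) stands in for kernel PCA, UMAP, and t-SNE, and a cluster-mean profile stands in for the neural network forecasters. The synthetic data, the helper names (`pca_project`, `kmeans`, `make_profile`), and the reading of "outperforms by a factor of x" as the ratio of baseline MAPE to proposed MAPE are all illustrative assumptions, not details taken from the paper.

```python
import math
import random

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def pca_project(rows, n_components=2, iters=100, seed=0):
    """Linear PCA by power iteration with deflation (a stand-in for kernel PCA/UMAP/t-SNE)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(r, mean)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)] for i in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate; subtract it to find the next component.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
        components.append(v)
    return [[sum(x * y for x, y in zip(c, comp)) for comp in components] for c in centered]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def make_profile(peak_hour, rng):
    """Synthetic 24-hour load profile (hypothetical data): base load plus one daily peak."""
    return [1.0 + 2.0 * math.exp(-((h - peak_hour) ** 2) / 8.0) + rng.gauss(0, 0.05)
            for h in range(24)]

def mean_profile(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

rng = random.Random(42)
peaks = [8] * 20 + [19] * 20                      # morning-peak and evening-peak households
day1 = [make_profile(p, rng) for p in peaks]      # history used for clustering and fitting
day2 = [make_profile(p, rng) for p in peaks]      # next day, used only for evaluation

reduced = pca_project(day1, n_components=2)       # 24 dimensions -> 2
labels = kmeans(reduced, k=2)                     # cluster in the reduced space

global_forecast = mean_profile(day1)              # baseline: one forecast for everyone
cluster_forecast = {c: mean_profile([r for r, l in zip(day1, labels) if l == c])
                    for c in set(labels)}         # one forecast per cluster

actual = [x for row in day2 for x in row]
pred_global = [x for _ in day2 for x in global_forecast]
pred_cluster = [x for l in labels for x in cluster_forecast[l]]

mape_global = mape(actual, pred_global)
mape_cluster = mape(actual, pred_cluster)
print(f"baseline MAPE:  {mape_global:.2f}%")
print(f"clustered MAPE: {mape_cluster:.2f}%")
print(f"improvement factor (baseline / clustered): {mape_global / mape_cluster:.2f}")
```

On this toy data the per-cluster forecast has a lower MAPE than the single global forecast because the two household groups peak at different hours. The size of the gap here says nothing about the factors reported in the abstract, which come from real data and neural network models.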

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc05/11775257/1c2a96e75d89/41598_2025_86982_Fig1_HTML.jpg
