IEEE Trans Cybern. 2023 Jul;53(7):4292-4305. doi: 10.1109/TCYB.2022.3165074. Epub 2023 Jun 15.
An efficient energy scheduling strategy for a charging station is crucial for stabilizing the electricity market and accommodating the charging demand of electric vehicles (EVs). Most existing studies on energy scheduling strategies fail to coordinate energy purchasing with energy distribution and thus cannot balance energy supply and demand. In addition, the presence of multiple charging stations in a complex scenario makes it difficult to develop a unified scheduling strategy for different charging stations. To solve these problems, in this article we propose a multiagent reinforcement learning (MARL) method to learn the optimal energy purchasing strategy and an online heuristic dispatching scheme to develop an energy distribution strategy. Unlike traditional scheduling methods, the two proposed strategies are coordinated with each other in both the temporal and spatial dimensions to form a unified energy scheduling strategy for charging stations. Specifically, the proposed MARL method combines multiagent deep deterministic policy gradient (MADDPG) principles for learning the purchasing strategy with a long short-term memory (LSTM) neural network for predicting the charging demand of EVs. Moreover, a multistep reward function is developed to accelerate the learning process. The proposed method is verified by comprehensive simulation experiments based on real data from the electricity market in Chicago. The experimental results show that the proposed method outperforms other state-of-the-art energy scheduling methods in the charging market in terms of economic profit and users' satisfaction ratio.
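The abstract does not specify the form of the multistep reward function. One common formulation in reinforcement learning is the n-step discounted return, which sums several future rewards before bootstrapping from a value estimate. The following is a minimal sketch under that assumption; the function name `n_step_return`, its parameters, and the default discount factor are hypothetical and are not taken from the article.

```python
from typing import Sequence


def n_step_return(
    rewards: Sequence[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> float:
    """Generic n-step discounted return (an assumed formulation, not the article's).

    Computes r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1) + gamma^n * V,
    where V is a critic's value estimate for the state reached after n steps.
    """
    ret = bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for reward in reversed(rewards):
        ret = reward + gamma * ret
    return ret
```

For example, with rewards `[1.0, 2.0, 3.0]`, a bootstrap value of `10.0`, and `gamma=0.5`, the function returns `4.0`. Propagating reward over several steps in this way is a standard technique for speeding up credit assignment, which is consistent with the stated goal of accelerating learning.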