Li Yuanlong, Wen Yonggang, Tao Dacheng, Guan Kyle
IEEE Trans Cybern. 2020 May;50(5):2002-2013. doi: 10.1109/TCYB.2019.2927410. Epub 2019 Jul 25.
Data centers (DCs) play an important role in supporting services such as e-commerce and cloud computing. The energy consumption of this growing market has drawn significant attention; notably, almost half of the energy cost goes to cooling the DC to a target temperature. It is thus a critical operational challenge to curb the cooling energy cost without sacrificing the thermal safety of a DC. Existing solutions typically follow a two-step approach, in which the system is first modeled based on expert knowledge and the operational actions are then determined with heuristics and/or best practices. These approaches are often hard to generalize and may yield suboptimal performance due to intrinsic model errors in large-scale systems. In this paper, we propose optimizing DC cooling control via the emerging deep reinforcement learning (DRL) framework. Compared with existing approaches, our solution offers an end-to-end cooling control algorithm (CCA) built on an off-policy, offline version of the deep deterministic policy gradient (DDPG) algorithm, in which an evaluation (critic) network is trained to predict the DC energy cost along with the resulting cooling effects, and a policy network is trained to produce optimized control settings. Moreover, we introduce a de-underestimation (DUE) validation mechanism for the critic network to reduce the potential underestimation of risk caused by neural approximation. Our proposed algorithm is evaluated on an EnergyPlus simulation platform and on a real data trace collected from the National Super Computing Centre (NSCC) of Singapore. Numerical results show that the proposed CCA achieves up to 11% cooling cost reduction on the simulation platform compared with a manually configured baseline control algorithm. In the trace-based study, which is conservative in nature, the proposed algorithm achieves about 15% cooling energy savings on the NSCC data trace. Our pioneering approach can shed new light on the application of DRL to optimize and automate DC operations and management, potentially bringing intelligence to digital infrastructure management.
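To make the described pipeline concrete, below is a minimal PyTorch-style sketch of an offline actor-critic setup in the spirit of the CCA: a critic fitted on logged DC data to predict cooling cost, and a policy trained to minimize the critic's prediction. All dimensions, network sizes, names, and the asymmetric loss weighting (used here as a rough stand-in for the paper's DUE validation idea) are illustrative assumptions, not the authors' actual implementation.

```python
# Illustrative offline DDPG-style actor-critic for cooling control.
# Dimensions, architectures, and the DUE-style weighting are assumptions.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 10, 4  # e.g., sensor readings / cooling setpoints (assumed)

class Critic(nn.Module):
    """Predicts energy cost (plus thermal-risk effects) for a (state, action) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

class Actor(nn.Module):
    """Maps a state to bounded cooling control settings."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, ACTION_DIM), nn.Tanh())  # settings scaled to [-1, 1]

    def forward(self, s):
        return self.net(s)

critic, actor = Critic(), Actor()
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
a_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)

def critic_step(s, a, cost, under_weight=2.0):
    """Fit the critic to observed cost, weighting underestimates more heavily
    (a rough stand-in for the DUE mechanism -- an assumption here)."""
    pred = critic(s, a)
    err = cost - pred  # err > 0 means the critic underestimated the cost/risk
    w = torch.where(err > 0, torch.full_like(err, under_weight), torch.ones_like(err))
    loss = (w * err.pow(2)).mean()
    c_opt.zero_grad(); loss.backward(); c_opt.step()

def actor_step(s):
    """Train the policy to minimize the critic's predicted cost (fully offline)."""
    loss = critic(s, actor(s)).mean()
    a_opt.zero_grad(); loss.backward(); a_opt.step()

# Offline training over a logged data trace of (state, action, cost) batches:
# for s, a, cost in dataloader:
#     critic_step(s, a, cost)
#     actor_step(s)
```

Because training uses only a logged trace rather than live interaction with the DC, the critic's accuracy bounds the policy's quality; penalizing underestimation, as the DUE mechanism aims to do, keeps the learned policy from exploiting optimistically mispredicted low-cost regions.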