Li Yuanlong, Wen Yonggang, Tao Dacheng, Guan Kyle
IEEE Trans Cybern. 2020 May;50(5):2002-2013. doi: 10.1109/TCYB.2019.2927410. Epub 2019 Jul 25.
Data centers (DCs) play an important role in supporting services such as e-commerce and cloud computing. The energy consumption of this growing market has drawn significant attention; notably, almost half of the energy cost goes to cooling the DC to a target temperature. It is thus a critical operational challenge to curb the cooling energy cost without sacrificing the thermal safety of a DC. Existing solutions typically follow a two-step approach, in which the system is first modeled based on expert knowledge and the operational actions are then determined with heuristics and/or best practices. These approaches are often hard to generalize and may yield suboptimal performance due to intrinsic model errors in large-scale systems. In this paper, we propose optimizing DC cooling control via the emerging deep reinforcement learning (DRL) framework. Compared with existing approaches, our solution offers an end-to-end cooling control algorithm (CCA) built on an off-policy, offline version of the deep deterministic policy gradient (DDPG) algorithm, in which an evaluation (critic) network is trained to predict the DC energy cost along with the resulting cooling effects, and a policy network is trained to produce optimized control settings. Moreover, we introduce a de-underestimation (DUE) validation mechanism for the critic network to reduce the potential underestimation of risk caused by neural approximation. Our proposed algorithm is evaluated on an EnergyPlus simulation platform and on a real data trace collected from the National Super Computing Centre (NSCC) of Singapore. Numerical results show that the proposed CCA achieves up to 11% cooling cost reduction on the simulation platform compared with a manually configured baseline control algorithm. In the trace-based study, which is conservative in nature, the proposed algorithm achieves about 15% cooling energy savings on the NSCC data trace. Our pioneering approach can shed new light on the application of DRL to optimize and automate DC operations and management, potentially bringing intelligence to digital infrastructure management.
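To make the described pipeline concrete, below is a minimal PyTorch-style sketch of an offline actor-critic setup in the spirit of the CCA: a critic fitted on logged DC data to predict cooling cost, and a policy trained to minimize the critic's prediction. All dimensions, network sizes, names, and the asymmetric loss weighting (used here as a rough stand-in for the paper's DUE validation idea) are illustrative assumptions, not the authors' actual implementation.

```python
# Illustrative offline DDPG-style actor-critic for cooling control.
# Dimensions, architectures, and the DUE-style weighting are assumptions.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 10, 4  # e.g., sensor readings / cooling setpoints (assumed)

class Critic(nn.Module):
    """Predicts energy cost (plus thermal-risk effects) for a (state, action) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

class Actor(nn.Module):
    """Maps a state to bounded cooling control settings."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, ACTION_DIM), nn.Tanh())  # settings scaled to [-1, 1]

    def forward(self, s):
        return self.net(s)

critic, actor = Critic(), Actor()
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
a_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)

def critic_step(s, a, cost, under_weight=2.0):
    """Fit the critic to observed cost, weighting underestimates more heavily
    (a rough stand-in for the DUE mechanism -- an assumption here)."""
    pred = critic(s, a)
    err = cost - pred  # err > 0 means the critic underestimated the cost/risk
    w = torch.where(err > 0, torch.full_like(err, under_weight), torch.ones_like(err))
    loss = (w * err.pow(2)).mean()
    c_opt.zero_grad(); loss.backward(); c_opt.step()

def actor_step(s):
    """Train the policy to minimize the critic's predicted cost (fully offline)."""
    loss = critic(s, actor(s)).mean()
    a_opt.zero_grad(); loss.backward(); a_opt.step()

# Offline training over a logged data trace of (state, action, cost) batches:
# for s, a, cost in dataloader:
#     critic_step(s, a, cost)
#     actor_step(s)
```

Because training uses only a logged trace rather than live interaction with the DC, the critic's accuracy bounds the policy's quality; penalizing underestimation, as the DUE mechanism aims to do, keeps the learned policy from exploiting optimistically mispredicted low-cost regions.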