Imitation Learning-Based Performance-Power Trade-Off Uncore Frequency Scaling Policy for Multicore System

Affiliation

School of Electronic Information, Wuhan University, Wuhan 430072, China.

Publication information

Sensors (Basel). 2023 Jan 28;23(3):1449. doi: 10.3390/s23031449.

DOI:10.3390/s23031449

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9920788/

Abstract

As uncore components, such as shared cache slices and memory controllers, become more important in processor architecture, their share of the overall power consumption of multicore processors rises significantly. To maximize the power efficiency of a multicore processor system, we investigate uncore frequency scaling (UFS) and propose a novel imitation learning-based uncore frequency control policy. The policy learns online using the DAgger algorithm and converts the annotation cost of the online aggregated data into fine-tuning of the expert model. This design improves online learning efficiency and the generality of the UFS policy on unseen loads. In addition, we shift the policy optimization target to Performance Per Watt (PPW), i.e., the power efficiency of the processor, to avoid saving a percentage of power while losing a larger percentage of performance. Experimental results show that the proposed policy outperforms the current advanced UFS policy on the SPEC CPU2017 benchmark test sequence, with a maximum improvement of about 10% relative to performance-first policies. Under unseen processor loads, the tuning decisions made by our policy after collecting 50 aggregated data samples keep the processor stably near the optimal power efficiency state.
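The two ideas named in the abstract, a PPW optimization target and DAgger-style online data aggregation, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the frequency levels in `FREQS`, the toy `perf` and `power` models, the `observe` features, the oracle `expert`, and the 1-nearest-neighbour learner in `policy`. It is plain DAgger with a fixed oracle and does not reproduce the authors' expert-model fine-tuning.

```python
import math

# Hypothetical uncore frequency levels in GHz (not from the paper).
FREQS = [1.2, 1.6, 2.0, 2.4]
F_MAX = max(FREQS)

def perf(m, f):
    """Toy throughput: the memory-bound share m of the work slows as f drops."""
    return 1.0 / ((1.0 - m) + m * F_MAX / f)

def power(f):
    """Toy power in watts: a static part plus a frequency-dependent part."""
    return 40.0 + 5.0 * f * f

def ppw(m, f):
    """Performance Per Watt under the toy models above."""
    return perf(m, f) / power(f)

def expert(m):
    """Oracle label: the frequency level with the highest PPW for memory share m."""
    return max(FREQS, key=lambda f: ppw(m, f))

def observe(m, f_cur):
    """Learner's features: stall ratio measured at the current frequency, and that frequency."""
    stall = (m * F_MAX / f_cur) * perf(m, f_cur)
    return (stall, f_cur)

def policy(dataset, x):
    """1-nearest-neighbour learner over the aggregated (features, label) pairs."""
    nearest = min(dataset, key=lambda pair: math.dist(pair[0], x))
    return nearest[1]

def rollout(workload, act, f0=F_MAX):
    """Run one episode; return (features, expert_label, chosen_action) per step."""
    f_cur, trace = f0, []
    for m in workload:
        x = observe(m, f_cur)
        a = act(x, m)
        trace.append((x, expert(m), a))
        f_cur = a  # the chosen frequency shapes the next observation
    return trace

def dagger(train_load, new_load, n_iters):
    """Behaviour cloning on expert demos, then DAgger aggregation on a new load."""
    dataset = [(x, y) for x, y, _ in rollout(train_load, lambda x, m: expert(m))]
    for _ in range(n_iters):
        # Roll out the current learner, then label the states it visited.
        trace = rollout(new_load, lambda x, m: policy(dataset, x))
        dataset += [(x, y) for x, y, _ in trace]
    return dataset

if __name__ == "__main__":
    train_load = [0.1, 0.9, 0.5, 0.3]          # hypothetical "seen" workload phases
    new_load = [0.8, 0.2, 0.6, 0.0, 1.0, 0.4]  # hypothetical "unseen" workload phases
    data = dagger(train_load, new_load, n_iters=len(new_load))
    final = rollout(new_load, lambda x, m: policy(data, x))
    mismatches = sum(1 for _, y, a in final if a != y)
    print("aggregated samples:", len(data), "mismatches vs expert:", mismatches)
```

Because each chosen frequency changes the next observation, the learner visits states that the expert demonstrations never covered. Labeling those visited states and retraining on the aggregate is the step that distinguishes DAgger from plain behaviour cloning.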

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d033/9920788/7a933c934091/sensors-23-01449-g001.jpg
