Ma Wenheng, Cheng Qiao, Gao Yudi, Xu Lan, Yu Ningmei
Faculty of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China.
Micromachines (Basel). 2021 Mar 10;12(3):292. doi: 10.3390/mi12030292.
Embedded processors are widely used in systems that run different tasks under different workloads. A more complex micro-architecture delivers better peak performance at the cost of higher power consumption. Shutting down the units designed for performance enhancement can improve energy efficiency in low-workload scenarios. In this paper, we evaluate the energy distribution in various embedded processors. According to the analysis, pipeline registers and the dynamic branch predictor, both employed for better peak performance, have a large impact on energy efficiency. We therefore propose an ultra-low-power processor with a variable micro-architecture. The processor is based on a 4-stage pipeline core with a Gshare branch predictor. In high-performance mode (H-mode), all units are active. In normal mode (N-mode), the Gshare predictor is shut down and Always-Not-Taken prediction is used. In low-power mode (L-mode), some of the pipeline registers are bypassed to avoid unnecessary energy dissipation and improve execution efficiency. A mode register (MR) indicates the current working mode, and switching between modes is controlled by software. The proposed core is implemented in 40 nm technology and simulated with the traces of 17 benchmarks from Embench. The average power consumed in L-mode, N-mode and H-mode is 41.7 μW, 59.7 μW and 71.1 μW, respectively. The results show that N-mode and L-mode consume 16.08% and 41.37% less power than H-mode on average. In the best case, they consume 25.36% and 49.30% less power than H-mode. Taking into account the execution efficiency, measured in instructions per cycle (IPC), the proposed processor consumes 7.78% or 51.57% less energy per instruction than the baseline core. The area of the proposed processor is only 7.19% larger than that of the baseline core, and in H-mode it consumes only 3.08% more power.
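The relationship between the reported average powers and the relative savings can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and the IPC and clock frequency passed to `energy_per_instruction_pj` are placeholder values, since the abstract does not state them. Savings recomputed from the rounded averages come out at about 16.03% and 41.35%, slightly below the reported 16.08% and 41.37%, which would be consistent with the reported figures being averaged per benchmark (an assumption).

```python
# Average power per mode as reported in the abstract, in microwatts.
AVG_POWER_UW = {"L": 41.7, "N": 59.7, "H": 71.1}


def power_saving_pct(mode: str, reference: str = "H") -> float:
    """Percentage reduction in average power of `mode` relative to `reference`."""
    ref = AVG_POWER_UW[reference]
    return 100.0 * (ref - AVG_POWER_UW[mode]) / ref


def energy_per_instruction_pj(power_uw: float, ipc: float, freq_mhz: float) -> float:
    """Energy per instruction in picojoules, computed as P / (IPC * f).

    1 uW / 1 MHz = 1e-6 W / 1e6 Hz = 1e-12 J, i.e. 1 pJ per cycle;
    dividing by IPC converts energy per cycle into energy per instruction.
    """
    return power_uw / (ipc * freq_mhz)


if __name__ == "__main__":
    for mode in ("N", "L"):
        print(f"{mode}-mode saving vs H-mode: {power_saving_pct(mode):.2f}%")

    # Hypothetical IPC and clock frequency, chosen only to show the formula.
    epi = energy_per_instruction_pj(AVG_POWER_UW["H"], ipc=0.8, freq_mhz=1.0)
    print(f"H-mode energy per instruction (assumed IPC and f): {epi:.1f} pJ")
```

Because energy per instruction depends on IPC as well as power, a mode that draws less power only saves energy if its IPC does not fall proportionally, which is why the abstract reports power savings and per-instruction energy savings separately.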