Cheng Qi, Hu Xiaofang, Xiao He, Zhou Yue, Duan Shukai
IEEE Trans Biomed Circuits Syst. 2025 Apr;19(2):404-415. doi: 10.1109/TBCAS.2024.3436837. Epub 2025 Apr 2.
In recent years, the combination of the Attention mechanism and deep learning has found wide application in the field of medical imaging. However, because the Attention computation is complex, existing hardware architectures suffer from high resource consumption or low accuracy, and deploying Attention efficiently on DNN accelerators remains a challenge. This paper proposes an online-programmable Attention hardware architecture based on a compute-in-memory (CIM) macro, which reduces the hardware complexity of Attention and improves integration density, energy efficiency, and calculation accuracy. First, the Attention computation is decomposed into multiple cascaded combined matrix operations to reduce the complexity of its hardware implementation; second, to mitigate the influence of non-ideal hardware characteristics, an online-programmable CIM architecture is designed that improves calculation accuracy by dynamically adjusting the weights; and finally, SPICE simulation verifies that the proposed Attention hardware architecture can be applied to deep neural network inference. Based on a 100 nm CMOS process, compared with traditional Attention hardware architectures, integration density and energy efficiency increase by at least 91.38 times, and latency and computing efficiency improve by about 12.5 times.
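For reference, the computation being mapped to hardware is standard scaled dot-product Attention, which is naturally expressed as a cascade of matrix operations. The minimal sketch below shows this textbook formulation only; the paper's specific hardware decomposition into combined matrix operations is not detailed in the abstract, so the function and variable names here are illustrative assumptions.

```python
# Textbook scaled dot-product Attention as cascaded matrix operations
# (illustrative sketch, not the paper's exact hardware decomposition).
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays; returns the (seq_len, d) Attention output."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # matrix op 1: Q K^T, scaled
    scores -= scores.max(axis=-1, keepdims=True)    # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # matrix op 2: softmax(...) V

# Example: 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```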