
Hardware-Assisted Low-Latency NPU Virtualization Method for Multi-Sensor AI Systems

Author Information

Jean Jong-Hwan, Kim Dong-Sun

Affiliation

Department of Semiconductor Systems Engineering, Sejong University, Seoul 05006, Republic of Korea.

Publication Information

Sensors (Basel). 2024 Dec 15;24(24):8012. doi: 10.3390/s24248012.

Abstract

Recently, AI systems such as autonomous driving and smart homes have become integral to daily life. Intelligent multi-sensors, once limited to single data types, now process complex text and image data, demanding faster and more accurate processing. While integrating NPUs and sensors has improved processing speed and accuracy, challenges like low resource utilization and long memory latency remain. This study proposes a method to reduce processing time and improve resource utilization by virtualizing NPUs to simultaneously handle multiple deep-learning models, leveraging a hardware scheduler and data prefetching techniques. Experiments with 30,000 SA resources showed that the hardware scheduler reduced memory cycles by over 10% across all models, with reductions of 30% for NCF and 70% for DLRM. The hardware scheduler effectively minimized memory latency and idle NPU resources in resource-constrained environments with frequent context switching. This approach is particularly valuable for real-time applications like autonomous driving, enabling smooth transitions between tasks such as object detection and route planning. It also enhances multitasking in smart homes by reducing latency when managing diverse data streams. The proposed system is well suited for resource-constrained environments that demand efficient multitasking and low-latency processing.
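
The central idea behind the hardware scheduler and data prefetching described in the abstract is to overlap off-chip weight/activation fetches with on-chip computation, so a virtualized NPU shared by several models does not stall on every memory access. The Python sketch below is purely illustrative and is not the authors' implementation: the Workload class, the tile counts, and all cycle figures are assumed values, chosen only to show why prefetch-driven overlap reduces total memory-stall cycles when multiple models time-share one NPU.

```python
# Illustrative sketch (not the paper's implementation): a toy cycle-count model of
# two deep-learning workloads time-sharing one virtualized NPU. All tile counts,
# compute-cycle, and memory-latency values are assumed, for demonstration only.

from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    name: str
    tiles: int            # number of weight/activation tiles to process
    compute_cycles: int   # cycles to process one tile once it is on-chip
    fetch_cycles: int     # cycles to fetch one tile from off-chip memory


def run_without_prefetch(workloads: List[Workload]) -> int:
    """Naive schedule: the NPU stalls for every tile fetch before computing it."""
    total = 0
    for wl in workloads:
        for _ in range(wl.tiles):
            total += wl.fetch_cycles      # NPU idle while the tile is fetched
            total += wl.compute_cycles    # then compute on the fetched tile
    return total


def run_with_prefetch(workloads: List[Workload]) -> int:
    """Scheduler-style overlap: while tile i is computed, tile i+1 is prefetched,
    so only the excess of fetch latency over compute time shows up as a stall."""
    total = 0
    for wl in workloads:
        total += wl.fetch_cycles          # the very first fetch cannot be hidden
        for i in range(wl.tiles):
            last_tile = (i == wl.tiles - 1)
            overlap_stall = 0 if last_tile else max(0, wl.fetch_cycles - wl.compute_cycles)
            total += wl.compute_cycles + overlap_stall
    return total


if __name__ == "__main__":
    # Hypothetical workloads standing in, very loosely, for a compute-bound model
    # and a memory-bound model sharing the virtualized NPU.
    workloads = [
        Workload("model_a", tiles=64, compute_cycles=100, fetch_cycles=40),
        Workload("model_b", tiles=32, compute_cycles=80,  fetch_cycles=120),
    ]
    baseline = run_without_prefetch(workloads)
    prefetched = run_with_prefetch(workloads)
    print(f"baseline cycles : {baseline}")
    print(f"prefetch cycles : {prefetched}")
    print(f"reduction       : {100 * (baseline - prefetched) / baseline:.1f}%")
```

Under these assumed numbers, overlap removes essentially all stall cycles for the compute-bound workload and caps the stall of the memory-bound one at the fetch/compute gap. This mirrors, qualitatively only, the direction of the cycle reductions the abstract reports; the actual percentages depend on the paper's hardware scheduler and workloads.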

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c63e/11679731/3b598183d887/sensors-24-08012-g001.jpg
