Efficient Hardware Design and Implementation of the Voting Scheme-Based Convolution.

Affiliations

Algoritmi Centre, University of Minho, 4800-058 Guimarães, Portugal.

Associação Laboratório Colaborativo em Transformação Digital-DTx Colab, 4800-058 Guimarães, Portugal.

Publication information

Sensors (Basel). 2022 Apr 12;22(8):2943. doi: 10.3390/s22082943.

Abstract

Because point clouds are sparse, processing them efficiently calls for a convolution block designed for sparse data. Recent computer vision work has explored moving data processing to more energy-efficient hardware, such as FPGAs, in response to the need to run these algorithms on resource-constrained edge devices. However, hardware implementations of sparse convolutions remain largely unexplored, and few studies analyze their potential and efficiency on resource-constrained platforms. This article presents the design of a customizable hardware block for the voting convolution. We carried out an in-depth analysis to determine under which conditions the voting scheme is justified over dense convolutions. By skipping unnecessary arithmetic operations with null weights and leveraging data dependency, the proposed hardware design achieves an energy consumption about 8.7 times lower than similar works in the literature. Accesses to data memory were also reduced to the minimum necessary, improving processing time by around 55%. To evaluate both the performance and applicability of the proposed solution, the voting convolution was integrated into the well-known PointPillars model, where it achieves improvements between 23.05% and 80.44% without a significant effect on detection performance.
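As a rough software illustration of the idea the abstract describes, the sketch below contrasts a dense "same" convolution with a voting-style one in which each non-zero input casts weighted votes to its neighbouring output cells, skipping both zero inputs and null weights. It is not the authors' hardware design: the single-channel 2D layout, the zero padding, and the function names `dense_conv2d` and `voting_conv2d` are assumptions made only for this example.

```python
import numpy as np


def dense_conv2d(x, w):
    """Reference dense 'same' cross-correlation with zero padding (odd kernel sizes)."""
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * w)
    return out


def voting_conv2d(x, w):
    """Voting-style convolution (illustrative sketch, not the paper's hardware block).

    Only non-zero inputs are visited, and only non-null weights are applied.
    Returns the output map and the number of multiply-accumulates performed.
    """
    kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    height, width = x.shape
    out = np.zeros((height, width), dtype=float)
    # Pre-select the non-null weights once, so null weights cost nothing later.
    votes = [(a, b, w[a, b]) for a in range(kh) for b in range(kw) if w[a, b] != 0]
    macs = 0
    for i, j in zip(*np.nonzero(x)):
        for a, b, weight in votes:
            # Input (i, j) is read by the output cell whose window places
            # kernel tap (a, b) on top of it.
            oi, oj = i - a + ph, j - b + pw
            if 0 <= oi < height and 0 <= oj < width:
                out[oi, oj] += weight * x[i, j]
                macs += 1
    return out, macs
```

Under these assumptions the two functions produce the same output map, while the voting version performs at most (number of non-zero inputs) × (number of non-null weights) multiply-accumulates instead of one per kernel tap per output cell. How that operation count translates into energy or latency on an FPGA depends on the hardware design and is not captured by this sketch.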

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4aa0/9027320/00b2fa33b2e8/sensors-22-02943-g001.jpg
