On-sensor binarized CNN inference with dynamic model swapping in pixel processor arrays.

Authors

Liu Yanan, Bose Laurie, Fan Rui, Dudek Piotr, Mayol-Cuevas Walterio

Affiliations

Bristol Robotics Laboratory, Faculty of Engineering, University of Bristol, Bristol, United Kingdom.

School of Microelectronics, Shanghai University, Shanghai, China.

Publication information

Front Neurosci. 2022 Aug 15;16:909448. doi: 10.3389/fnins.2022.909448. eCollection 2022.

Abstract

Many types of Convolutional Neural Network (CNN) models and training methods have been proposed in recent years to improve efficiency on embedded and edge devices with limited computation and memory resources. The wide variety of architectures makes this a complex task that has to balance generality with efficiency. Among the most interesting camera-sensor architectures are Pixel Processor Arrays (PPAs). This study presents two methods that are useful for embedded CNNs in general but particularly suitable for PPAs. The first is for training purely binarized CNNs; the second is for deploying larger models with a model swapping paradigm that loads model components dynamically. Specifically, this study trains and implements networks with batch normalization and adaptive thresholds for binary activations. We then convert batch normalization and binary activations into a bias matrix that can be applied in parallel by an add/sub operation. For dynamic model swapping, we propose to decompose applications that are beyond the capacity of a PPA into sub-tasks that can be solved by tree networks, which are loaded dynamically as needed. We demonstrate our approaches on various tasks, including classification, localization, and coarse segmentation, on a highly resource-constrained PPA sensor-processor.
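The conversion of batch normalization plus a binary activation into a bias can be illustrated with a short sketch. This is not the authors' implementation; it shows one common way such a folding can be done, assuming a sign activation with outputs in {+1, -1}, a single scalar pre-activation per channel, and a nonzero scale `gamma`. The function names `bn_sign`, `fold_bn_to_bias`, and `folded_sign` are hypothetical and introduced only for illustration.

```python
import math


def bn_sign(x, gamma, beta, mean, var, eps=1e-5):
    """Reference path: batch normalization followed by a sign activation."""
    y = gamma * (x - mean) / math.sqrt(var + eps) + beta
    return 1 if y >= 0 else -1


def fold_bn_to_bias(gamma, beta, mean, var, eps=1e-5):
    """Fold BN + sign into a single additive bias and a polarity flag.

    Assumption (not from the source): gamma is nonzero. Dividing the BN
    output by gamma keeps the sign when gamma > 0 and reverses it when
    gamma < 0, so the comparison reduces to the sign of (x + bias).
    """
    if gamma == 0:
        raise ValueError("gamma must be nonzero to fold BN into a bias")
    bias = beta * math.sqrt(var + eps) / gamma - mean
    flip = 1 if gamma > 0 else -1
    return bias, flip


def folded_sign(x, bias, flip):
    """Folded path: one addition, then a comparison against zero."""
    shifted = x + bias
    if flip > 0:
        return 1 if shifted >= 0 else -1
    # Negative gamma reverses the inequality.
    return 1 if shifted <= 0 else -1


if __name__ == "__main__":
    bias, flip = fold_bn_to_bias(gamma=0.5, beta=0.3, mean=2.0, var=4.0, eps=0.0)
    for x in range(-3, 4):
        print(x, bn_sign(x, 0.5, 0.3, 2.0, 4.0, eps=0.0), folded_sign(x, bias, flip))
```

On an array processor, the per-channel bias would presumably be laid out as a matrix matching the feature-map layout so that every processing element adds or subtracts its own value at once; that layout step is hardware-specific and is not modeled here.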

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a35/9421154/5a81c98b882b/fnins-16-909448-g0002.jpg