Fang Zheng, Yin Bo, Du Zehua, Huang Xianqing
College of Information Science and Engineering, Ocean University of China, Qingdao, China.
Pilot National Laboratory for Marine Science and Technology, Qingdao, China.
Sci Rep. 2022 Apr 22;12(1):6599. doi: 10.1038/s41598-022-10382-x.
With the construction of smart cities, environmental sound classification (ESC) has recently attracted attention from both academia and industry. Convolutional neural networks (CNNs) have raised ESC accuracy, but these gains usually come from deeper networks, which leads to rapid growth in parameters and floating-point operations (FLOPs). As a result, CNN models are difficult to deploy on embedded devices, and their classification speed is often unacceptable. To reduce the hardware requirements of running a CNN and to speed up ESC, this paper proposes a resource adaptive convolutional neural network (RACNN). RACNN uses a novel resource adaptive convolutional (RAC) module, which generates the same number of feature maps as a conventional convolution at lower cost and efficiently extracts the time and frequency features of audio. A RAC block built on the RAC module is used to construct the lightweight RACNN model, and the RAC module can also be used to upgrade existing CNN models. Experiments on public datasets show that RACNN achieves higher performance than state-of-the-art methods with lower computational complexity.
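To make the cost argument concrete, the following minimal sketch counts the parameters and multiply-accumulate operations of a bias-free 2D convolution and compares them with a cheaper layer that produces the same number of feature maps. The abstract does not describe the internals of the RAC module, so a depthwise-separable convolution is used here purely as a stand-in; the function names and the layer dimensions are hypothetical and chosen only for illustration.

```python
def conv2d_cost(c_in, c_out, k_h, k_w, h_out, w_out, groups=1):
    """Return (parameters, multiply-accumulates) for a bias-free 2D convolution."""
    # Each output channel sees c_in / groups input channels through a k_h x k_w kernel.
    params = (c_in // groups) * c_out * k_h * k_w
    # Every weight is applied once per output position.
    macs = params * h_out * w_out
    return params, macs


def depthwise_separable_cost(c_in, c_out, k_h, k_w, h_out, w_out):
    """Cost of a depthwise conv followed by a 1x1 pointwise conv.

    Stand-in for a "cheaper" convolution; this is NOT the RAC module.
    """
    dw_params, dw_macs = conv2d_cost(c_in, c_in, k_h, k_w, h_out, w_out, groups=c_in)
    pw_params, pw_macs = conv2d_cost(c_in, c_out, 1, 1, h_out, w_out)
    return dw_params + pw_params, dw_macs + pw_macs


# Hypothetical layer: 64 -> 128 channels, 3x3 kernel, 32x32 output map.
standard = conv2d_cost(64, 128, 3, 3, 32, 32)
cheap = depthwise_separable_cost(64, 128, 3, 3, 32, 32)

print("standard conv  (params, MACs):", standard)
print("cheaper layer  (params, MACs):", cheap)
print("parameter reduction factor: %.1f" % (standard[0] / cheap[0]))
```

Both layers emit 128 feature maps of the same size, but the stand-in needs roughly an eighth of the parameters and operations for these dimensions. This is the kind of saving a lightweight convolution module aims for, although the figures say nothing about RACNN's actual costs.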