Paranayapa Thivindu, Ranasinghe Piumini, Ranmal Dakshina, Meedeniya Dulani, Perera Charith
Department of Computer Science & Engineering, University of Moratuwa, Moratuwa 10400, Sri Lanka.
School of Computer Science and Informatics, Cardiff University, Cardiff CF24 3AA, UK.
Sensors (Basel). 2024 Feb 9;24(4):1149. doi: 10.3390/s24041149.
Deep-learning models play a significant role in modern software solutions: they handle complex tasks, improve accuracy, automate processes, and adapt to diverse domains, contributing to advances across industries. This study compares deep-learning techniques that can also be deployed on resource-constrained edge devices. As a novel contribution, we analyze the performance of seven Convolutional Neural Network (CNN) models in the context of data augmentation, feature extraction, and model compression using acoustic data. The results show that the best performers can achieve an optimal trade-off between model accuracy and size when compressed with weight and filter pruning followed by 8-bit quantization. Following the study workflow on the forest sound dataset, MobileNet-v3-small and ACDNet achieved accuracies of 87.95% and 85.64%, respectively, at compact sizes of 243 KB and 484 KB. Hence, this study concludes that CNNs can be optimized and compressed for deployment on resource-constrained edge devices to classify forest environment sounds.
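To make the compression step concrete, the following is a minimal sketch of pruning followed by 8-bit quantization on a toy weight vector. It is not the authors' pipeline: it assumes unstructured magnitude pruning (filter pruning is not shown) and symmetric per-tensor quantization, and the function names `magnitude_prune`, `quantize_int8`, and `dequantize` as well as the example weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Assumption: unstructured magnitude pruning on a flat list of floats.
    """
    k = int(len(weights) * sparsity)  # number of weights to zero
    if k == 0:
        return list(weights)
    # Indices of the k weights with the smallest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric linear quantization to integers in [-127, 127].

    Assumption: a single scale for the whole tensor (per-tensor scaling).
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map 8-bit integers back to approximate float weights."""
    return [v * scale for v in q]


# Hypothetical toy weights, for illustration only.
weights = [0.02, -0.75, 0.31, -0.04, 1.27, 0.08, -0.56, 0.01]
pruned = magnitude_prune(weights, sparsity=0.5)  # prune first
q, scale = quantize_int8(pruned)                 # then quantize
restored = dequantize(q, scale)
print(pruned, q, scale, restored, sep="\n")
```

In this sketch each surviving weight is stored as one signed byte plus a shared float scale, and the pruned zeros can additionally be stored sparsely. Together these two effects are what let pruning followed by quantization shrink a model.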