Deep transfer learning-based bird species classification using mel spectrogram images.

Affiliations

Department of Computer Science and Engineering, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, Bangladesh.

Department of Computer Science and Engineering, Khulna University of Engineering and Technology, Khulna, Bangladesh.

Publication information

PLoS One. 2024 Aug 12;19(8):e0305708. doi: 10.1371/journal.pone.0305708. eCollection 2024.

Abstract

Bird species classification is important in ornithology because it supports the assessment and monitoring of environmental dynamics, including habitat modification, migratory behavior, pollution levels, and disease occurrence. Traditional methods of bird classification, such as visual identification, are time-intensive and require a high level of expertise. Audio-based classification is a promising alternative that can automate bird species identification. This study aims to establish an audio-based classification system for 264 Eastern African bird species using modified deep transfer learning, specifically pre-trained EfficientNet models. The models are fine-tuned to learn the relevant patterns from mel spectrogram images for this classification task. The fine-tuned EfficientNet model is combined with a recurrent neural network (RNN), either a gated recurrent unit (GRU) or a long short-term memory (LSTM) network. The RNN captures temporal dependencies in the audio signal, thereby improving classification accuracy. The dataset used in this work contains nearly 17,000 bird sound recordings across a diverse range of species. Experiments were conducted with several combinations of EfficientNet and RNNs; EfficientNet-B7 with GRU outperformed the other models tested, with an accuracy of 84.03% and a macro-average precision of 0.8342.
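The abstract reports both an accuracy and a macro-average precision. As a minimal sketch of how these two metrics differ, the following pure-Python snippet computes each from a list of true and predicted labels. The function names, the toy labels, and the convention that a never-predicted class contributes a precision of 0 are assumptions made for illustration; the abstract does not state how the authors computed their scores.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision over every class that occurs."""
    true_pos = defaultdict(int)   # correct predictions, keyed by predicted class
    predicted = defaultdict(int)  # all predictions, keyed by predicted class
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            true_pos[guess] += 1
    classes = set(y_true) | set(y_pred)
    # Assumption: a class that is never predicted counts as precision 0.
    per_class = [
        true_pos[c] / predicted[c] if predicted[c] else 0.0 for c in classes
    ]
    return sum(per_class) / len(per_class)


# Hypothetical toy labels (not from the paper): one common and two rare species.
y_true = ["a", "a", "a", "a", "b", "c"]
y_pred = ["a", "a", "a", "a", "c", "c"]
print(accuracy(y_true, y_pred))         # 5 of 6 correct
print(macro_precision(y_true, y_pred))  # mean of 1.0, 0.0 and 0.5
```

Because macro averaging weights every class equally, a rare species that is misclassified lowers the score as much as a common one would. That makes the metric informative for a dataset with many classes and uneven recording counts per class.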


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ff6/11318847/b86a5458835a/pone.0305708.g001.jpg
