Reshaping Bioacoustics Event Detection: Leveraging Few-Shot Learning (FSL) with Transductive Inference and Data Augmentation

Author information

Ijaz Nouman, Banoori Farhad, Koo Insoo

Affiliations

Department of Electrical, Electronics and Computer Engineering, University of Ulsan, Ulsan 44610, Republic of Korea.

School of Electronics and Information Engineering, South China University of Technology, Guangzhou 510641, China.

Publication information

Bioengineering (Basel). 2024 Jul 5;11(7):685. doi: 10.3390/bioengineering11070685.

DOI:10.3390/bioengineering11070685

PMID:39061767

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11274013/

Abstract

Bioacoustic event detection is the demanding task of recognizing and classifying the sounds animals make in their natural habitats. Traditional supervised learning requires large amounts of labeled data, which are hard to come by in bioacoustics. This paper presents a few-shot learning (FSL) method that incorporates transductive inference and data augmentation to address the scarcity of labeled events and the small volume of recordings. Transductive inference iteratively updates the class prototypes and feature extractors to capture essential patterns, whereas data augmentation applies SpecAugment to Mel spectrogram features to enlarge the training data. The proposed approach is evaluated on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2022 and 2021 datasets. Extensive experimental results show that the proposed method, with all of its components, achieves significant F-score improvements of 27% and 10% on the DCASE-2022 and DCASE-2021 datasets, respectively, compared to recent advanced approaches. Moreover, our method is well suited to FSL tasks because it adapts effectively to sounds from various animal species, recordings, and durations.
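
The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `spec_augment` and `refine_prototypes`, the mask widths, the temperature, and the iteration count are all hypothetical choices made for illustration. The sketch also keeps the embeddings fixed and refines only the class prototypes, whereas the abstract states that the feature extractors are updated as well.

```python
import numpy as np

def spec_augment(mel, max_freq_width=8, max_time_width=20, rng=None):
    """Return a copy of `mel` (n_mels x n_frames) with one random frequency
    band and one random time span zeroed out (SpecAugment-style masking)."""
    rng = np.random.default_rng() if rng is None else rng
    out = mel.copy()
    n_mels, n_frames = out.shape
    # Frequency mask: zero a band of f consecutive mel bins.
    f = int(rng.integers(0, max_freq_width + 1))
    f0 = int(rng.integers(0, max(1, n_mels - f + 1)))
    out[f0:f0 + f, :] = 0.0
    # Time mask: zero a span of t consecutive frames.
    t = int(rng.integers(0, max_time_width + 1))
    t0 = int(rng.integers(0, max(1, n_frames - t + 1)))
    out[:, t0:t0 + t] = 0.0
    return out

def refine_prototypes(support, support_labels, query, n_classes,
                      n_iters=5, temperature=1.0):
    """Transductive-style refinement with fixed embeddings (an assumption).

    Prototypes start as per-class means of the support embeddings and are
    re-estimated using soft assignments of the unlabeled query embeddings.
    Every class needs at least one support example.
    """
    onehot = np.eye(n_classes)[support_labels]          # (n_support, n_classes)
    protos = (onehot.T @ support) / onehot.sum(axis=0)[:, None]
    feats = np.vstack([support, query])
    for _ in range(n_iters):
        # Squared Euclidean distance from each query to each prototype.
        d = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
        logits = -d / temperature
        logits -= logits.max(axis=1, keepdims=True)     # numerical stability
        soft = np.exp(logits)
        soft /= soft.sum(axis=1, keepdims=True)         # softmax over classes
        # Support keeps hard labels; queries contribute with soft weights.
        weights = np.vstack([onehot, soft])
        protos = (weights.T @ feats) / weights.sum(axis=0)[:, None]
    return protos
```

In a few-shot episode, `spec_augment` would be applied to training spectrograms before feature extraction, and `refine_prototypes` would be called on the embedded support and query segments of one recording.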

