Deep metric learning for bioacoustic classification: Overcoming training data scarcity using dynamic triplet loss.

Author information

Thakur Anshul, Thapar Daksh, Rajan Padmanabhan, Nigam Aditya

Affiliation

School of Computing and Electrical Engineering, IIT Mandi, Mandi, Himachal Pradesh-175005, India.

Publication information

J Acoust Soc Am. 2019 Jul;146(1):534. doi: 10.1121/1.5118245.

Abstract

Bioacoustic classification often suffers from the lack of labeled data. This hinders the effective utilization of state-of-the-art deep learning models in bioacoustics. To overcome this problem, the authors propose a deep metric learning-based framework that provides effective classification, even when only a small number of per-class training examples are available. The proposed framework utilizes a multiscale convolutional neural network and the proposed dynamic variant of the triplet loss to learn a transformation space where intra-class separation is minimized and inter-class separation is maximized by a dynamically increasing margin. The process of learning this transformation is known as deep metric learning. The triplet loss analyzes three examples (referred to as a triplet) at a time to perform deep metric learning. The number of possible triplets increases cubically with the dataset size, making triplet loss more suitable than the cross-entropy loss in data-scarce conditions. Experiments on three different publicly available datasets show that the proposed framework performs better than existing bioacoustic classification methods. Experimental results also demonstrate the superiority of dynamic triplet loss over cross-entropy loss in data-scarce conditions. Furthermore, unlike existing bioacoustic classification methods, the proposed framework has been extended to provide open-set classification.
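The triplet mechanics described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance function or how the margin grows, so the squared Euclidean distance, the `linear_margin` schedule, and its `start`/`end` values are assumptions made purely for illustration. `count_triplets` shows why the number of usable triplets grows roughly cubically with dataset size.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin):
    """Standard hinge-style triplet loss on embedding batches of shape (n, d).

    Assumption: squared Euclidean distance; the abstract does not name one.
    """
    d_ap = np.sum((anchor - positive) ** 2, axis=1)  # anchor to same-class example
    d_an = np.sum((anchor - negative) ** 2, axis=1)  # anchor to other-class example
    return float(np.mean(np.maximum(d_ap - d_an + margin, 0.0)))


def linear_margin(epoch, num_epochs, start=0.2, end=1.0):
    """Hypothetical schedule: the margin grows linearly over training.

    The start and end values are illustrative, not taken from the paper.
    """
    if num_epochs <= 1:
        return end
    frac = epoch / (num_epochs - 1)
    return start + frac * (end - start)


def count_triplets(labels):
    """Count ordered (anchor, positive, negative) triplets, anchor != positive."""
    _, counts = np.unique(labels, return_counts=True)
    n = counts.sum()
    # Each anchor in a class of size c has (c - 1) positives and (n - c) negatives.
    return int(np.sum(counts * (counts - 1) * (n - counts)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, p, q = (rng.normal(size=(4, 8)) for _ in range(3))
    for epoch in range(3):
        m = linear_margin(epoch, 3)
        print(epoch, m, triplet_loss(a, p, q, m))
    print(count_triplets([0, 0, 1, 1]))
```

Under this sketch, even four labeled examples split across two classes yield eight ordered triplets, which is the sense in which a small dataset still supplies many training signals for a triplet-based objective.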
