Deep audio embeddings for vocalisation clustering.

Author information

Université de Toulon, Aix Marseille Univ, CNRS, LIS, Toulon, France.

Publication information

PLoS One. 2023 Jul 10;18(7):e0283396. doi: 10.1371/journal.pone.0283396. eCollection 2023.

Abstract

The study of non-human animals' communication systems generally relies on the transcription of vocal sequences using a finite set of discrete units. This set is referred to as a vocal repertoire, which is specific to a species or a sub-group of a species. When conducted by human experts, the formal description of vocal repertoires can be laborious and/or biased. This motivates computerised assistance for this procedure, for which machine learning algorithms represent a good opportunity. Unsupervised clustering algorithms are suited for grouping close points together, provided a relevant representation is available. This paper therefore studies a new method for encoding vocalisations, allowing for automatic clustering to ease vocal repertoire characterisation. Borrowing from deep representation learning, we use a convolutional auto-encoder network to learn an abstract representation of vocalisations. We report on the quality of the learnt representation, as well as that of state-of-the-art methods, by quantifying their agreement with expert-labelled vocalisation types from 8 datasets of other studies across 6 species (birds and marine mammals). With this benchmark, we demonstrate that using auto-encoders improves the relevance of vocalisation representations, which serves repertoire characterisation using a very limited number of settings. We also publish a Python package for the bioacoustic community to train their own vocalisation auto-encoders or use a pretrained encoder to browse vocal repertoires and ease unit-wise annotation.
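
The abstract evaluates representations by how well the resulting clusters agree with expert labels, but it does not name the agreement measure. As an illustration only, the sketch below assumes normalised mutual information (NMI), a common choice for comparing two partitions of the same items. The function name, the example labels, and the cluster assignments are all hypothetical and are not taken from the paper or its Python package.

```python
import math
from collections import Counter


def normalised_mutual_information(labels_a, labels_b):
    """Return the NMI between two labelings of the same items.

    Assumption: mutual information is normalised by the arithmetic
    mean of the two entropies; other normalisations exist.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n == 0:
        raise ValueError("labelings must not be empty")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        # Shannon entropy (in nats) of the empirical label distribution.
        return -sum((c / n) * math.log(c / n) for c in counts.values())

    # Mutual information from the joint and marginal label counts.
    mutual_info = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )

    mean_entropy = (entropy(count_a) + entropy(count_b)) / 2
    # Two trivial one-group labelings are treated as perfect agreement.
    return mutual_info / mean_entropy if mean_entropy > 0 else 1.0


if __name__ == "__main__":
    # Hypothetical expert annotations and cluster ids for six vocalisations.
    expert_labels = ["whistle", "whistle", "click", "click", "buzz", "buzz"]
    cluster_ids = [0, 0, 1, 1, 2, 1]
    score = normalised_mutual_information(expert_labels, cluster_ids)
    print(f"NMI between clusters and expert labels: {score:.3f}")
```

A score near 1 means the clusters recover the expert-defined vocalisation types up to a renaming of cluster ids, while a score near 0 means the two partitions are statistically independent.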

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e2f8/10332598/536dcda6d65b/pone.0283396.g001.jpg
