
Interactive extraction of diverse vocal units from a planar embedding without the need for prior sound segmentation.

Author Information

Lorenz Corinna, Hao Xinyu, Tomka Tomas, Rüttimann Linus, Hahnloser Richard H R

Affiliations

Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland.

Université Paris-Saclay, CNRS, Institut des Neurosciences Paris-Saclay, Saclay, France.

Publication Information

Front Bioinform. 2023 Jan 13;2:966066. doi: 10.3389/fbinf.2022.966066. eCollection 2022.

Abstract

Annotating and proofreading data sets of complex natural behaviors such as vocalizations are tedious tasks, because instances of a given behavior need to be correctly segmented from background noise and must be classified with a minimal false positive error rate. Low-dimensional embeddings have proven very useful for this task because they provide a visual overview of a data set in which distinct behaviors appear in different clusters. However, low-dimensional embeddings introduce errors because they fail to preserve distances, and they represent only objects of fixed dimensionality, which conflicts with vocalizations whose dimensions vary with their durations. To mitigate these issues, we introduce a semi-supervised, analytical method for simultaneous segmentation and clustering of vocalizations. We define a given vocalization type by specifying pairs of high-density regions in the embedding plane of sound spectrograms, one region associated with vocalization onsets and the other with offsets. We demonstrate our two-neighborhood (2N) extraction method on the task of clustering adult zebra finch vocalizations embedded with UMAP. We show that 2N extraction allows the identification of short and long vocal renditions from continuous data streams without initially committing to a particular segmentation of the data. Also, 2N extraction achieves a much lower false positive error rate than comparable approaches based on a single defining region. Along with our method, we present a graphical user interface (GUI) for visualizing and annotating data.
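
The abstract describes the 2N idea only at a high level: slide fixed-duration spectrogram windows over a continuous recording, embed them in the plane with UMAP, and extract vocal units from a pair of regions, one marking onsets and one marking offsets. The snippet below is a minimal sketch of that pipeline, assuming circular neighborhoods of fixed radius and a simple first-offset-after-onset pairing rule; the function names, parameters, and pairing logic are illustrative assumptions, not the authors' implementation, which lets users specify high-density regions interactively.

```python
import numpy as np
import umap  # pip install umap-learn


def embed_windows(windows, n_neighbors=15, min_dist=0.1, seed=0):
    """Embed fixed-duration spectrogram windows into the plane with UMAP.

    windows: array of shape (n_steps, n_features), one flattened
    spectrogram patch per sliding-window position.
    """
    reducer = umap.UMAP(n_components=2, n_neighbors=n_neighbors,
                        min_dist=min_dist, random_state=seed)
    return reducer.fit_transform(windows)  # shape (n_steps, 2)


def two_neighborhood_extract(xy, onset_center, offset_center,
                             radius, step_s, max_dur_s):
    """Pair onset-neighborhood hits with the next offset-neighborhood hit.

    xy:            planar embedding, shape (n_steps, 2)
    onset_center:  user-picked 2-D point marking vocalization onsets
    offset_center: user-picked 2-D point marking vocalization offsets
    radius:        radius of both circular neighborhoods (illustrative)
    step_s:        hop between consecutive windows, in seconds
    max_dur_s:     reject onset/offset pairs longer than this
    Returns a list of (onset_s, offset_s) tuples, in seconds.
    """
    near_on = np.flatnonzero(np.linalg.norm(xy - onset_center, axis=1) < radius)
    near_off = np.flatnonzero(np.linalg.norm(xy - offset_center, axis=1) < radius)
    units = []
    for i in near_on:
        later = near_off[near_off > i]  # offset hits after this onset hit
        if later.size and (later[0] - i) * step_s <= max_dur_s:
            units.append((i * step_s, later[0] * step_s))
    return units
```

In practice the region pair would be chosen over a density view of the embedding, as the paper's GUI supports; the fixed-radius circles above simply stand in for user-specified high-density regions. Note how no prior segmentation is needed: every sliding-window position is embedded, and segmentation falls out of the onset/offset pairing itself.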


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a1c5/9880044/6ab7239510f6/fbinf-02-966066-g001.jpg
