BioCPPNet: automatic bioacoustic source separation with deep neural networks.

Affiliation

Earth Species Project, Berkeley, CA, 94709, USA.

Publication information

Sci Rep. 2021 Dec 6;11(1):23502. doi: 10.1038/s41598-021-02790-2.

DOI:10.1038/s41598-021-02790-2
PMID:34873197
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8648737/
Abstract

We introduce the Bioacoustic Cocktail Party Problem Network (BioCPPNet), a lightweight, modular, and robust U-Net-based machine learning architecture optimized for bioacoustic source separation across diverse biological taxa. Employing learnable or handcrafted encoders, BioCPPNet operates directly on the raw acoustic mixture waveform containing overlapping vocalizations and separates the input waveform into estimates corresponding to the sources in the mixture. Predictions are compared to the reference ground truth waveforms by searching over the space of (output, target) source order permutations, and we train using an objective function motivated by perceptual audio quality. We apply BioCPPNet to several species with unique vocal behavior, including macaques, bottlenose dolphins, and Egyptian fruit bats, and we evaluate reconstruction quality of separated waveforms using the scale-invariant signal-to-distortion ratio (SI-SDR) and downstream identity classification accuracy. We consider mixtures with two or three concurrent conspecific vocalizers, and we examine separation performance in open and closed speaker scenarios. To our knowledge, this paper redefines the state-of-the-art in end-to-end single-channel bioacoustic source separation in a permutation-invariant regime across a heterogeneous set of non-human species. This study serves as a major step toward the deployment of bioacoustic source separation systems for processing substantial volumes of previously unusable data containing overlapping bioacoustic signals.
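The abstract names two ideas that are easy to illustrate in a few lines: the scale-invariant signal-to-distortion ratio (SI-SDR) and the search over (output, target) source order permutations. The following is a minimal sketch in plain Python, not the authors' implementation. The function names `si_sdr` and `best_permutation`, the mean-removal step, the `eps` constant, and the synthetic sinusoid data are all assumptions made for illustration. SI-SDR is used here as the matching criterion only; the abstract does not specify the training objective beyond saying it is motivated by perceptual audio quality.

```python
import itertools
import math


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    # Remove the mean from both signals (a common convention, assumed here).
    mean_e = sum(estimate) / len(estimate)
    mean_t = sum(target) / len(target)
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]
    # Project the estimate onto the target to get the optimally scaled target.
    alpha = sum(a * b for a, b in zip(e, t)) / (sum(b * b for b in t) + eps)
    scaled = [alpha * b for b in t]
    signal_energy = sum(s * s for s in scaled)
    noise_energy = sum((a - s) ** 2 for a, s in zip(e, scaled))
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def best_permutation(estimates, targets):
    """Exhaustively search source orderings; perm[i] is the target index
    matched to estimates[i]. Returns (mean SI-SDR in dB, perm)."""
    best_score, best_perm = -math.inf, None
    for perm in itertools.permutations(range(len(targets))):
        score = sum(si_sdr(estimates[i], targets[j])
                    for i, j in enumerate(perm)) / len(perm)
        if score > best_score:
            best_score, best_perm = score, perm
    return best_score, best_perm


# Hypothetical two-source example: the "network outputs" arrive in swapped
# order and at different gains relative to the reference sources.
n = 200
s0 = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
s1 = [math.sin(2 * math.pi * 13 * k / n) for k in range(n)]
targets = [s0, s1]
estimates = [
    [0.5 * b + 0.01 * a for a, b in zip(s0, s1)],  # mostly s1, slight leakage of s0
    [2.0 * a for a in s0],                         # rescaled copy of s0
]
score, perm = best_permutation(estimates, targets)
```

The exhaustive search costs one SI-SDR evaluation per (permutation, source) pair, which is cheap for the two- and three-source mixtures considered in the abstract but grows factorially with the number of sources.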

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/655a/8648737/51d27b1e4e53/41598_2021_2790_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/655a/8648737/8f6b82242454/41598_2021_2790_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/655a/8648737/be289919df9b/41598_2021_2790_Fig3_HTML.jpg
