Centre for Marine Science and Technology, Curtin University, Perth, Western Australia 6845, Australia.
K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14853-0001, USA.
Philos Trans R Soc Lond B Biol Sci. 2024 Jun 24;379(1904):20230444. doi: 10.1098/rstb.2023.0444. Epub 2024 May 6.
Passive acoustic monitoring (PAM) is a powerful tool for studying ecosystems, but applying it effectively in tropical environments, particularly to insects, poses distinct challenges. Neotropical katydids produce complex species-specific calls that last from mere milliseconds to seconds and span broad audible and ultrasonic frequency ranges, yet subtle differences in inter-pulse intervals or central frequencies are often the only discriminatory traits. These extremes, coupled with low source levels and susceptibility to masking by ambient noise, make species identification in PAM recordings difficult. This study aimed to develop a deep learning-based solution to automate the recognition of 31 katydid species of interest in a biodiverse Panamanian forest with over 80 katydid species. Beyond these inherent challenges, our efforts were hampered by a limited and imbalanced initial training dataset comprising domain-mismatched recordings. To overcome these obstacles, we applied rigorous data engineering: we improved input variance through controlled playback re-recordings and physics-based data augmentation, and tuned signal-processing, model and training parameters to produce a custom, well-fit solution. The methods developed here are incorporated into Koogu, an open-source Python-based toolbox for developing deep learning-based bioacoustic analysis solutions. The parametric implementations offer a valuable resource, enhancing the capabilities of PAM for studying insects in tropical ecosystems. This article is part of the theme issue 'Towards a toolkit for global insect biodiversity monitoring'.
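The abstract does not spell out which physics-based augmentations were used, so the following is only a minimal NumPy sketch of what such augmentation might look like. It is not the authors' method and not Koogu's API. It assumes two generic effects: distance-dependent attenuation (spherical spreading plus an absorption term that grows with frequency) and mixing with ambient noise at a chosen signal-to-noise ratio. The function names, the linear absorption model, and the coefficient `absorption_db_per_m_per_khz` are hypothetical placeholders chosen for illustration.

```python
import numpy as np


def simulate_propagation(signal, fs, distance_m, ref_distance_m=1.0,
                         absorption_db_per_m_per_khz=0.03):
    """Attenuate `signal` as if it were recorded at `distance_m` rather than
    at `ref_distance_m`.

    Assumptions (illustrative only): spherical spreading loss of
    20*log10(distance/ref) dB, plus absorption that is linear in frequency.
    Real atmospheric absorption is not linear in frequency.
    """
    if distance_m <= 0 or ref_distance_m <= 0:
        raise ValueError("distances must be positive")
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs_khz = np.fft.rfftfreq(n, d=1.0 / fs) / 1000.0

    # Frequency-independent loss from geometric spreading.
    spreading_db = 20.0 * np.log10(distance_m / ref_distance_m)
    # Frequency-dependent loss over the extra path length.
    absorption_db = (absorption_db_per_m_per_khz * freqs_khz
                     * (distance_m - ref_distance_m))

    gain = 10.0 ** (-(spreading_db + absorption_db) / 20.0)
    return np.fft.irfft(spectrum * gain, n=n)


def mix_at_snr(signal, noise, snr_db):
    """Add `noise` to `signal`, scaled so the mixture has the requested SNR (dB)."""
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(signal_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return signal + scale * noise


if __name__ == "__main__":
    fs = 96_000                                  # sampling rate in Hz
    t = np.arange(9_600) / fs                    # 0.1 s of audio
    call = np.sin(2 * np.pi * 20_000 * t)        # synthetic 20 kHz tone
    rng = np.random.default_rng(0)
    ambient = rng.normal(size=call.shape)        # stand-in for forest noise

    distant = simulate_propagation(call, fs, distance_m=10.0)
    augmented = mix_at_snr(distant, ambient, snr_db=6.0)
    print(augmented.shape)
```

Applying the attenuation in the frequency domain lets the ultrasonic components decay faster than the audible ones, which is the qualitative behaviour a distant recording would show. A realistic pipeline would replace the linear absorption term with a measured or standardised model and use recorded ambient noise instead of white noise.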