Department of Earth and Environmental Science, Faculty of Environment, Science and Economy, University of Exeter, Penryn, TR10 9FE, United Kingdom.
Centre for Ecology and Conservation, Faculty of Environment, Science and Economy, University of Exeter, Penryn, TR10 9FE, United Kingdom.
J Acoust Soc Am. 2024 Sep 1;156(3):1623-1632. doi: 10.1121/10.0028268.
Passive acoustic monitoring is a promising tool for monitoring at-risk populations of vocal species, yet extracting relevant information from large acoustic datasets can be time-consuming, creating a bottleneck at the point of analysis. To address this, an open-source framework for deep learning in bioacoustics is adapted to automatically detect Bornean white-bearded gibbon (Hylobates albibarbis) "great call" vocalizations in a long-term acoustic dataset from a rainforest location in Borneo. The steps involved in developing this solution are described, including collecting audio recordings, developing training and testing datasets, training neural network models, and evaluating model performance. The best model performed at a satisfactory level (F score = 0.87), identifying 98% of the highest-quality calls from 90 h of manually annotated audio recordings and greatly reducing analysis time compared with a human observer. No significant difference was found in the temporal distribution of great call detections between the manual annotations and the model's output. Future work should seek to apply this model to long-term acoustic datasets to understand spatiotemporal variations in H. albibarbis calling activity. Overall, a roadmap is presented for applying deep learning to identify the vocalizations of species of interest, which can be adapted for monitoring other endangered vocalizing species.
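For readers unfamiliar with the F score reported above, the following is a minimal sketch of how such a score is commonly derived from detection counts. The abstract does not state which F-score variant was used or how detections were matched to annotations, so the balanced F1 default, the function name `precision_recall_f`, and the counts in the usage line are all assumptions made for illustration and are not values from the study.

```python
def precision_recall_f(tp: int, fp: int, fn: int, beta: float = 1.0):
    """Return (precision, recall, F_beta) from detection counts.

    tp: detections that match an annotated call (true positives)
    fp: detections with no matching annotation (false positives)
    fn: annotated calls the detector missed (false negatives)
    beta: weight of recall relative to precision (1.0 gives F1; assumed here)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        # Avoid division by zero when nothing was detected correctly.
        return precision, recall, 0.0
    b2 = beta ** 2
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical counts, chosen only to illustrate the arithmetic.
precision, recall, f1 = precision_recall_f(tp=80, fp=20, fn=10)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because the F score combines precision and recall, a detector can recover nearly all of the clearest calls, as reported for the highest-quality calls, while its overall score is lower once fainter calls and false detections are counted.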