Scripps Institution of Oceanography, University of California San Diego, La Jolla, California, United States of America.
PLoS Comput Biol. 2021 Dec 3;17(12):e1009613. doi: 10.1371/journal.pcbi.1009613. eCollection 2021 Dec.
Machine learning algorithms, including recent advances in deep learning, are promising tools for detection and classification of broadband high frequency signals in passive acoustic recordings. However, these methods are generally data-hungry, and progress has been limited by the lack of labeled datasets adequate for training and testing. Large quantities of known and as yet unidentified broadband signal types mingle in marine recordings, with variability introduced by acoustic propagation, source depths and orientations, and interacting signals. Manual classification of these datasets is unmanageable without in-depth knowledge of the acoustic context of each recording location. A signal classification pipeline is presented which combines unsupervised and supervised learning phases with opportunities for expert oversight to label signals of interest. The method is illustrated with a case study using unsupervised clustering to identify five toothed whale echolocation click types and two anthropogenic signal categories. These categories are used to train a deep network to classify detected signals, either in averaged time bins or as individual detections, in two independent datasets. Bin-level classification achieved higher overall precision (>99%) than click-level classification. However, click-level classification had the advantage of providing a label for every signal, and achieved higher overall recall, with overall precision from 92 to 94%. The results suggest that unsupervised learning is a viable solution for efficiently generating the large, representative training sets needed for applications of deep learning in passive acoustics.
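The two-phase idea in the abstract (cluster unlabeled detections, then use the resulting categories to label new detections per click or per time bin) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: plain k-means stands in for the unsupervised clustering, a nearest-centroid rule stands in for the deep network, and the two-element feature vectors, the 300-second bin width, and all function names are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    return [fmean(col) for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))


def kmeans(vectors, init_centroids, n_iter=20):
    """Unsupervised phase: plain k-means (stand-in for the paper's clustering)."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(n_iter):
        groups = defaultdict(list)
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        # Keep the old centroid if a cluster received no members.
        centroids = [mean_vector(groups[i]) if groups[i] else centroids[i]
                     for i in range(len(centroids))]
    return centroids


def classify_clicks(clicks, centroids):
    """Click-level: one label per detection (stand-in for the deep network)."""
    return [nearest(c["features"], centroids) for c in clicks]


def classify_bins(clicks, centroids, bin_seconds=300):
    """Bin-level: average the features in each time bin, one label per bin."""
    bins = defaultdict(list)
    for c in clicks:
        bins[int(c["time"] // bin_seconds)].append(c["features"])
    return {b: nearest(mean_vector(vs), centroids) for b, vs in sorted(bins.items())}


def precision_recall(true, pred, cls):
    """Precision and recall for one class from paired label lists."""
    tp = sum(t == cls and p == cls for t, p in zip(true, pred))
    fp = sum(t != cls and p == cls for t, p in zip(true, pred))
    fn = sum(t == cls and p != cls for t, p in zip(true, pred))
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


# Toy detections (made-up values): time in seconds, two hypothetical features.
clicks = [
    {"time": 10, "features": [30.0, 0.1]},
    {"time": 20, "features": [32.0, 0.1]},
    {"time": 40, "features": [31.0, 0.2]},
    {"time": 310, "features": [60.0, 0.3]},
    {"time": 320, "features": [62.0, 0.3]},
    {"time": 330, "features": [29.0, 0.1]},  # stray click of the other type
]
features = [c["features"] for c in clicks]
centroids = kmeans(features, init_centroids=[features[0], features[3]])
click_labels = classify_clicks(clicks, centroids)
bin_labels = classify_bins(clicks, centroids)
```

In this toy data the stray click in the second bin keeps its own label at the click level but is absorbed into the bin's majority type at the bin level. This is consistent with the trade-off the abstract describes: averaging over a bin smooths out individual detections, whereas click-level classification labels every signal.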