A Parallel Classification Model for Marine Mammal Sounds Based on Multi-Dimensional Feature Extraction and Data Augmentation.

Affiliations

College of Electronics and Information, Hangzhou Dianzi University, Hangzhou 310018, China.

College of Electrical Engineering, Zhejiang University of Water Resources and Electric Power, Hangzhou 310018, China.

Publication information

Sensors (Basel). 2022 Sep 30;22(19):7443. doi: 10.3390/s22197443.

DOI:10.3390/s22197443

PMID:36236544

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9572586/

Abstract

Because visibility in the deep-sea environment is poor, acoustic signals are often collected and analyzed to study the behavior of marine species. With advances in underwater signal-acquisition technology, the volume of acoustic data obtained from the ocean has exceeded what humans can process manually, so designing efficient marine-mammal classification algorithms has become an active research topic. In this paper, we design a classification model based on a multi-channel parallel structure, which processes multi-dimensional acoustic features extracted from audio samples and fuses the prediction results of the different channels through a trainable fully connected layer. The model uses transfer learning to converge faster and introduces data augmentation to improve classification accuracy. The data set was split using k-fold cross-validation to comprehensively evaluate the prediction accuracy and robustness of the model. The evaluation results showed that the model achieves a mean accuracy of 95.21% with a standard deviation of 0.65%, indicating highly consistent performance across multiple tests.
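The evaluation protocol described in the abstract (k-fold cross-validation, reported as a mean accuracy and a standard deviation) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the value k = 5, the helper names `k_fold_splits` and `cross_validate`, the toy label list, and the majority-class stand-in for the actual classifier. The abstract also does not say whether the population or the sample standard deviation was reported; the sketch uses the sample form.

```python
import random
import statistics


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(evaluate_fold, n_samples, k=5, seed=0):
    """Score each fold with evaluate_fold(train, test).

    Returns (mean accuracy, sample standard deviation, per-fold accuracies).
    """
    accuracies = [
        evaluate_fold(train, test)
        for train, test in k_fold_splits(n_samples, k, seed)
    ]
    return statistics.mean(accuracies), statistics.stdev(accuracies), accuracies


# Hypothetical toy labels: 60 samples of class 0 and 40 of class 1.
LABELS = [0] * 60 + [1] * 40


def dummy_evaluate_fold(train, test):
    """Placeholder for 'train the model on train, score it on test'.

    Predicts the majority class of the training fold for every test sample.
    """
    ones = sum(LABELS[i] for i in train)
    majority = 1 if ones * 2 > len(train) else 0
    correct = sum(1 for i in test if LABELS[i] == majority)
    return correct / len(test)


if __name__ == "__main__":
    mean_acc, std_acc, per_fold = cross_validate(dummy_evaluate_fold, len(LABELS), k=5)
    print(f"per-fold accuracy: {per_fold}")
    print(f"mean = {mean_acc:.4f}, std = {std_acc:.4f}")
```

In a real experiment, `dummy_evaluate_fold` would be replaced by a function that trains the classifier on the training indices and returns its accuracy on the held-out fold. A small standard deviation across folds is what supports a claim of consistent performance.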
