Advancing robust underwater acoustic target recognition through multitask learning and multi-gate mixture of experts.

Author information

Xie Yuan, Ren Jiawei, Li Junfeng, Xu Ji

Affiliations

Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China.

University of Chinese Academy of Sciences, Beijing 100190, China.

Publication information

J Acoust Soc Am. 2024 Jul 1;156(1):244-255. doi: 10.1121/10.0026481.

DOI:10.1121/10.0026481

PMID:38980097

Abstract

Underwater acoustic target recognition has emerged as a prominent research area within underwater acoustics. However, authentic underwater acoustic recordings remain scarce, which hinders data-driven recognition models from learning robust target patterns from a limited set of intricate underwater signals and compromises their stability in practical applications. To overcome these limitations, this study proposes a recognition framework called M3 (multitask, multi-gate, multi-expert), which enhances the model's ability to capture robust patterns by making it aware of the inherent properties of targets. In this framework, an auxiliary task that focuses on target properties, such as estimating target size, is designed. The auxiliary task shares parameters with the recognition task to realize multitask learning. This paradigm allows the model to concentrate on information shared across tasks and to identify robust target patterns in a regularized manner, thus enhancing its generalization ability. Moreover, M3 incorporates multi-expert and multi-gate mechanisms, which allocate distinct parameter spaces to different underwater signals. This enables the model to process intricate signal patterns in a fine-grained and differentiated manner. To evaluate the effectiveness of M3, extensive experiments were conducted on the ShipsEar underwater ship-radiated noise dataset. The results show that M3 outperforms the most advanced single-task recognition models, achieving state-of-the-art performance.
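
The combination of shared experts, per-task gates, and an auxiliary task described in the abstract can be illustrated with a minimal forward pass. The sketch below is not the authors' implementation: it is a generic multi-gate mixture-of-experts layer in plain Python, and every name and number in it is an assumption made for illustration (the `MMoE` class, the `init_layer` helper, the layer sizes, five recognition classes, three size bins with the size task treated as classification, and the auxiliary loss weight of 0.5).

```python
import math
import random


def linear(x, weights, bias):
    """Compute weights @ x + bias for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_out, n_in):
    """Hypothetical helper: small random weights and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MMoE:
    """Shared experts, one softmax gate per task, one linear tower per task."""

    def __init__(self, n_in, n_hidden, n_experts, task_dims, seed=0):
        rng = random.Random(seed)
        self.experts = [init_layer(rng, n_hidden, n_in) for _ in range(n_experts)]
        self.gates = [init_layer(rng, n_experts, n_in) for _ in task_dims]
        self.towers = [init_layer(rng, d, n_hidden) for d in task_dims]

    def forward(self, x):
        # Every expert sees the same input features.
        expert_out = [relu(linear(x, w, b)) for w, b in self.experts]
        outputs, gate_weights = [], []
        for (gw, gb), (tw, tb) in zip(self.gates, self.towers):
            g = softmax(linear(x, gw, gb))  # task-specific mixing weights
            # Weighted sum of expert outputs, one value per hidden unit.
            mixed = [sum(g[k] * expert_out[k][j] for k in range(len(expert_out)))
                     for j in range(len(expert_out[0]))]
            outputs.append(linear(mixed, tw, tb))
            gate_weights.append(g)
        return outputs, gate_weights


def cross_entropy(logits, label):
    return -math.log(softmax(logits)[label])


def multitask_loss(outputs, class_label, size_label, aux_weight=0.5):
    """Recognition loss plus a weighted auxiliary loss (weight is an assumption)."""
    return (cross_entropy(outputs[0], class_label)
            + aux_weight * cross_entropy(outputs[1], size_label))


# Task 0: target recognition (5 classes assumed); task 1: target size (3 bins assumed).
model = MMoE(n_in=8, n_hidden=6, n_experts=4, task_dims=[5, 3])
features = [0.1 * i for i in range(8)]  # stand-in for acoustic features
outputs, gate_weights = model.forward(features)
loss = multitask_loss(outputs, class_label=2, size_label=1)
```

Because each task has its own gate, the two tasks can weight the same experts differently for the same input, while the experts themselves remain shared. That is one common way to realize both the parameter sharing and the differentiated parameter spaces that the abstract describes.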
