Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465 Porto, Portugal.
Bosch Security Systems S.A., EN109-Zona Industrial de Ovar, 3880-080 Ovar, Portugal.
Sensors (Basel). 2022 Feb 16;22(4):1535. doi: 10.3390/s22041535.
The analysis of ambient sounds can be very useful when developing sound-based intelligent systems. Acoustic scene classification (ASC) is the task of identifying, among a set of predefined scenes, the environment in which a sound clip was recorded. ASC has considerable potential for use in urban sound event classification systems. This research presents a hybrid method that aims to address the accuracy and adaptability limitations of current state-of-the-art ASC models. The proposed method uses a stereo signal, two ensemble classifiers (random subspace), and a novel mathematical fusion step. A stable, invariant representation of the stereo signal is built using the Wavelet Scattering Transform (WST). For each mono channel, i.e., left and right, a separate random subspace classifier is trained on the WST features. A novel mathematical formula was developed for the fusion step, with its parameters found using a genetic algorithm. Results on the DCASE 2017 dataset showed that the proposed method achieves a higher classification accuracy (about 95%) than existing methods.
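The fusion stage described in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion formula, so the weighted sum used here is an assumed stand-in, not the authors' formula. The per-channel class scores are hypothetical toy values standing in for the outputs of the two random subspace classifiers, and the WST feature extraction and classifier training are omitted entirely. The names `fuse`, `accuracy`, and `genetic_search` are illustrative only.

```python
import random

def fuse(p_left, p_right, w):
    """Assumed fusion rule: weighted sum of left/right class scores."""
    return [w * l + (1.0 - w) * r for l, r in zip(p_left, p_right)]

def predict(p_left, p_right, w):
    """Index of the class with the highest fused score."""
    fused = fuse(p_left, p_right, w)
    return max(range(len(fused)), key=fused.__getitem__)

def accuracy(w, samples):
    """Fraction of samples whose fused prediction matches the label."""
    hits = sum(predict(pl, pr, w) == y for pl, pr, y in samples)
    return hits / len(samples)

def genetic_search(samples, pop_size=20, generations=30, seed=0):
    """Tiny genetic algorithm over a single fusion weight w in [0, 1]."""
    rng = random.Random(seed)
    population = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (elitism).
        ranked = sorted(population, key=lambda w: accuracy(w, samples),
                        reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover (average) plus Gaussian mutation, clipped to [0, 1].
            child = 0.5 * (a + b) + rng.gauss(0.0, 0.05)
            children.append(min(1.0, max(0.0, child)))
        population = parents + children
    return max(population, key=lambda w: accuracy(w, samples))

# Hypothetical (left scores, right scores, true label) triples for 2 scenes.
samples = [
    ([0.9, 0.1], [0.4, 0.6], 0),  # only the left channel is correct
    ([0.6, 0.4], [0.1, 0.9], 1),  # only the right channel is correct
    ([0.2, 0.8], [0.3, 0.7], 1),  # both channels are correct
    ([0.7, 0.3], [0.8, 0.2], 0),  # both channels are correct
]

best_w = genetic_search(samples)
print("left only :", accuracy(1.0, samples))
print("right only:", accuracy(0.0, samples))
print("fused     :", accuracy(best_w, samples))
```

On this toy data each channel alone misclassifies one sample, while a suitably weighted combination classifies all four correctly, which is the kind of gain a fusion step is meant to capture. A real pipeline would optimize the fusion parameters on held-out validation data rather than on the samples used for reporting accuracy.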