Deep Recurrent Neural Network-Based Autoencoders for Acoustic Novelty Detection.

Author information

Machine Intelligence & Signal Processing Group, Technische Universität München, Munich, Germany; audEERING GmbH, Gilching, Germany; Chair of Complex & Intelligent Systems, University of Passau, Passau, Germany.

A3LAB, Department of Information Engineering, Università Politecnica delle Marche, Ancona, Italy.

Publication information

Comput Intell Neurosci. 2017;2017:4694860. doi: 10.1155/2017/4694860. Epub 2017 Jan 15.

Abstract

In the emerging field of acoustic novelty detection, most research efforts are devoted to probabilistic approaches such as mixture models or state-space models. Only recent studies have introduced (pseudo-)generative models for acoustic novelty detection with recurrent neural networks in the form of an autoencoder. In these approaches, auditory spectral features of the next short-term frame are predicted from the previous frames by means of Long Short-Term Memory recurrent denoising autoencoders. The reconstruction error between the input and the output of the autoencoder is used as an activation signal to detect novel events. There is no evidence of studies focused on comparing previous efforts to automatically recognize novel events from audio signals and giving a broad and in-depth evaluation of recurrent neural network-based autoencoders. The present contribution aims to consistently evaluate our recent novel approaches to fill this white spot in the literature and provide insight by extensive evaluations carried out on three databases: A3Novelty, PASCAL CHiME, and PROMETHEUS. Besides providing an extensive analysis of novel and state-of-the-art methods, the article shows how RNN-based autoencoders outperform statistical approaches by up to 16.4% absolute in average F-measure over the three databases.
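The detection principle described in the abstract (predict the next feature frame, then treat the prediction error as an activation signal) can be illustrated with a minimal sketch. This is not the authors' implementation. The LSTM denoising autoencoder is replaced here by a hypothetical stand-in that simply predicts each frame as a copy of the previous one. The synthetic data, the feature dimension of 26, the median-based threshold, and the scaling factor `beta` are all assumptions chosen for illustration.

```python
import numpy as np

def naive_next_frame_predictor(frames):
    """Hypothetical stand-in for the recurrent autoencoder:
    predict every frame as a copy of the previous frame."""
    predicted = np.empty_like(frames)
    predicted[0] = frames[0]      # no history for the first frame
    predicted[1:] = frames[:-1]
    return predicted

def reconstruction_error(frames, predicted):
    """Per-frame Euclidean distance between observed and predicted features."""
    return np.linalg.norm(frames - predicted, axis=1)

def detect_novel_frames(errors, beta=5.0):
    """Flag frames whose error exceeds beta times the median error.
    The median-based threshold is an assumption made for this sketch."""
    threshold = beta * np.median(errors)
    return errors > threshold, threshold

# Synthetic "spectral features": 200 frames of near-stationary background.
rng = np.random.default_rng(0)
frames = 1.0 + 0.01 * rng.standard_normal((200, 26))
# Inject a short burst that the predictor cannot anticipate.
frames[120:125] += 3.0 * rng.random((5, 26))

predicted = naive_next_frame_predictor(frames)
errors = reconstruction_error(frames, predicted)
is_novel, threshold = detect_novel_frames(errors)

print("threshold:", threshold)
print("novel frame indices:", np.flatnonzero(is_novel))
```

In this toy setup the error spikes at the onset of the burst, stays high while the burst changes from frame to frame, and spikes once more when the signal returns to the background. A trained recurrent autoencoder would take the place of `naive_next_frame_predictor`; the error computation and thresholding would stay the same in spirit.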
