Online Continual Learning in Acoustic Scene Classification: An Empirical Study.

Authors

Ha Donghee, Kim Mooseop, Jeong Chi Yoon

Affiliations

Artificial Intelligence Research Laboratory, Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Daejeon 34129, Republic of Korea.

Artificial Intelligence, University of Science and Technology, 217 Gajeong-ro, Daejeon 34113, Republic of Korea.

Publication

Sensors (Basel). 2023 Aug 3;23(15):6893. doi: 10.3390/s23156893.

Abstract

Numerous deep learning methods for acoustic scene classification (ASC) have been proposed to improve the classification accuracy of sound events. However, only a few studies have focused on continual learning (CL), in which a model keeps learning as its tasks change. In this study, we therefore systematically analyzed ten recent CL methods, two regularization-based and eight replay-based, to provide guidelines on their performance. First, we defined realistic and difficult scenarios, namely online class-incremental (OCI) and online domain-incremental (ODI) cases, on three public sound datasets. We then analyzed each CL method in terms of average accuracy, average forgetting, and training time. In the OCI scenarios, iCaRL and SCR performed best with small memory buffers, whereas GDumb performed best with large buffers. In the ODI scenarios, SCR, which adopts supervised contrastive learning, consistently outperformed the other methods regardless of the buffer size. Most replay-based methods have a nearly constant training time regardless of the buffer size, and their performance improves as the buffer grows. Based on these results, GDumb and SCR should be the first candidates when selecting a CL method for ASC.
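
The eight replay-based methods examined in the study share a common mechanism: a small, fixed-size memory buffer of past examples is maintained and mixed into each online training step, which counteracts forgetting when classes or domains shift. The sketch below is a minimal illustration of that idea, not the paper's implementation; it assumes a PyTorch classifier, uses reservoir sampling for the buffer update (as in experience-replay variants), and the names ReplayBuffer and train_step are hypothetical.

```python
import random

import torch
import torch.nn.functional as F


class ReplayBuffer:
    """Fixed-size memory of past (x, y) pairs, updated by reservoir sampling."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []   # stored (example, label) tuples
        self.n_seen = 0  # total stream examples observed so far

    def update(self, x, y):
        # Reservoir sampling: every example seen so far is retained
        # with equal probability capacity / n_seen.
        for xi, yi in zip(x, y):
            if len(self.data) < self.capacity:
                self.data.append((xi, yi))
            else:
                j = random.randint(0, self.n_seen)
                if j < self.capacity:
                    self.data[j] = (xi, yi)
            self.n_seen += 1

    def sample(self, batch_size):
        batch = random.sample(self.data, min(batch_size, len(self.data)))
        xs, ys = zip(*batch)
        return torch.stack(xs), torch.stack(ys)


def train_step(model, optimizer, buffer, x, y, replay_batch_size=32):
    """One online CL step: train on the incoming batch mixed with replayed memories."""
    batch_x, batch_y = x, y
    if buffer.data:
        mem_x, mem_y = buffer.sample(replay_batch_size)
        batch_x = torch.cat([x, mem_x])
        batch_y = torch.cat([y, mem_y])
    optimizer.zero_grad()
    loss = F.cross_entropy(model(batch_x), batch_y)
    loss.backward()
    optimizer.step()
    buffer.update(x, y)  # only new stream examples enter the reservoir
    return loss.item()
```

The two best-performing methods in the abstract depart from this plain-replay template: GDumb greedily fills a class-balanced buffer and trains only on its contents, while SCR replaces the cross-entropy loss with a supervised contrastive loss over the mixed batch.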

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/287b/10422258/41c0253401bf/sensors-23-06893-g001.jpg
