Li Dongyuan, Wang Zhen, Chen Yankai, Jiang Renhe, Ding Weiping, Okumura Manabu
IEEE Trans Neural Netw Learn Syst. 2025 Apr;36(4):5879-5899. doi: 10.1109/TNNLS.2024.3396463. Epub 2025 Apr 4.
Active learning seeks to achieve strong performance with fewer training samples by iteratively asking an oracle to label newly selected samples in a human-in-the-loop manner. The technique has gained popularity because of its broad applicability, yet surveys of it, especially of deep active learning (DAL), remain scarce. We therefore conduct a comprehensive survey of DAL. First, we describe how the reviewed papers were collected and filtered. Second, we formally define the DAL task and summarize the most influential baselines and widely used datasets. Third, we provide a taxonomy of DAL methods from five perspectives (annotation types, query strategies, deep model architectures, learning paradigms, and training processes) and analyze the strengths and weaknesses of each. Fourth, we summarize the main applications of DAL in natural language processing (NLP), computer vision (CV), data mining (DM), and other areas. Finally, we discuss challenges and perspectives based on a detailed analysis of current studies. This work aims to serve as a quick and useful guide that helps researchers overcome difficulties in DAL, and we hope it will spur further progress in this burgeoning field.
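The query-and-label loop described in the abstract can be illustrated with a minimal sketch. This is not a method from the survey: it assumes a pool-based setting, a toy 1-D logistic regression in place of a deep model, least-confidence (uncertainty) sampling as the query strategy, and a simulated oracle. All names (`oracle`, `train`, `active_learning`) and the labeling rule `x > 0.6` are hypothetical and chosen only for illustration.

```python
import math
import random


def oracle(x):
    """Simulated annotator (assumption): the true label is 1 when x > 0.6."""
    return 1 if x > 0.6 else 0


def predict_proba(model, x):
    """Probability of class 1 under the logistic model (w, b)."""
    w, b = model
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train(labeled, epochs=500, lr=0.5):
    """Fit a 1-D logistic regression by batch gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in labeled:
            err = predict_proba((w, b), x) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(labeled)
        b -= lr * grad_b / len(labeled)
    return w, b


def active_learning(pool, initial_idx, rounds):
    """Pool-based loop: train, query the most uncertain sample, label, retrain."""
    labeled_idx = list(initial_idx)
    seen = set(labeled_idx)
    unlabeled_idx = [i for i in range(len(pool)) if i not in seen]
    labeled = [(pool[i], oracle(pool[i])) for i in labeled_idx]
    model = train(labeled)
    for _ in range(rounds):
        if not unlabeled_idx:
            break  # nothing left to query
        # Query strategy: the sample whose predicted probability is closest to 0.5.
        query = min(
            unlabeled_idx,
            key=lambda i: abs(predict_proba(model, pool[i]) - 0.5),
        )
        unlabeled_idx.remove(query)
        labeled_idx.append(query)
        labeled.append((pool[query], oracle(pool[query])))  # ask the oracle
        model = train(labeled)  # retrain on the enlarged labeled set
    return model, labeled_idx


# Usage: 200 unlabeled points, 4 initial labels, 10 query rounds.
rng = random.Random(0)
pool = [rng.random() for _ in range(200)]
model, labeled_idx = active_learning(pool, initial_idx=[0, 1, 2, 3], rounds=10)
```

Each round spends one unit of labeling budget on the sample the current model is least sure about. Swapping the `min(...)` line for another scoring rule is how a different query strategy would plug into the same loop.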