Hierarchical Active Learning with Overlapping Regions.

Authors

Luo Zhipeng, Hauskrecht Milos

Affiliation

Department of Computer Science, University of Pittsburgh, Pittsburgh, Pennsylvania.

Publication

Proc ACM Int Conf Inf Knowl Manag. 2020 Oct;2020:1045-1054. doi: 10.1145/3340531.3412022.

Abstract

Learning classification models from real-world data often requires substantial human annotation effort. Because annotation can be very time-consuming and costly, finding effective ways to reduce its cost is critical for building such models. To address this problem we explore a new type of human feedback: region-based feedback. Briefly, a region is defined as a hypercubic subspace of the input data space and represents a subpopulation of data instances; the region's label is a human assessment of the class of that subpopulation. Using suitable learning algorithms, one can learn instance-based classifiers from such labeled regions. In general, the key challenge is that there can be infinitely many regions one can define and query in a given data space. To minimize the number and complexity of region-based queries, we propose and develop a solution that incrementally builds a hierarchy of regions. Furthermore, to avoid building a region hierarchy that may be irrelevant to the class, we propose to grow multiple different hierarchies in parallel and to expand the more informative ones. Through experiments on numerous data sets, we demonstrate that methods using region-based feedback can learn very good classifiers from very few and simple queries, and hence are highly effective in reducing the human annotation effort needed for building classification models.
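
To make the notion of a region and a region hierarchy concrete, here is a minimal sketch. It assumes that a "hypercubic subspace" can be modeled as an axis-aligned box with one interval per feature, and that a hierarchy grows by splitting a box along one dimension. The class `Region`, its methods, and the example data are hypothetical illustrations; they are not the authors' implementation, and the sketch does not cover how regions are chosen for querying or how a classifier is learned from region labels.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class Region:
    """Hypothetical axis-aligned box: one closed (low, high) interval per feature."""
    bounds: List[Tuple[float, float]]
    children: List["Region"] = field(default_factory=list)

    def contains(self, x: Sequence[float]) -> bool:
        # A point is inside the region if every coordinate lies in its interval.
        return all(lo <= v <= hi for v, (lo, hi) in zip(x, self.bounds))

    def members(self, data: Sequence[Sequence[float]]) -> list:
        # The subpopulation of data instances that the region represents.
        return [x for x in data if self.contains(x)]

    def split(self, dim: int, threshold: float) -> List["Region"]:
        # Grow the hierarchy by one level: cut this box in two along `dim`.
        # Intervals are closed, so a point exactly on the threshold
        # falls into both children.
        lo, hi = self.bounds[dim]
        if not lo < threshold < hi:
            raise ValueError("threshold must lie strictly inside the interval")
        left = list(self.bounds)
        left[dim] = (lo, threshold)
        right = list(self.bounds)
        right[dim] = (threshold, hi)
        self.children = [Region(left), Region(right)]
        return self.children


# Usage: a unit square split at x0 = 0.5 into two child regions.
root = Region([(0.0, 1.0), (0.0, 1.0)])
data = [(0.1, 0.2), (0.8, 0.9), (0.4, 0.7)]
left, right = root.split(dim=0, threshold=0.5)
print(left.members(data))
print(right.members(data))
```

In this picture, an annotator would be asked about a whole box (for example, the class of the instances in `left`) rather than about individual points, and only boxes judged worth refining would be split further.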

Similar articles

2
Hierarchical Active Learning with Proportion Feedback on Regions.
Mach Learn Knowl Discov Databases. 2019;11052:464-480. doi: 10.1007/978-3-030-10928-8_28. Epub 2019 Jan 23.
5
Hierarchical Active Learning With Qualitative Feedback on Regions.
IEEE Trans Hum Mach Syst. 2023 Jun;53(3):581-589. doi: 10.1109/thms.2023.3252815. Epub 2023 Mar 23.
9
Dynamic Programming for Instance Annotation in Multi-Instance Multi-Label Learning.
IEEE Trans Pattern Anal Mach Intell. 2017 Dec;39(12):2381-2394. doi: 10.1109/TPAMI.2017.2647944. Epub 2017 Jan 5.
10
Co-Labeling for Multi-View Weakly Labeled Learning.
IEEE Trans Pattern Anal Mach Intell. 2016 Jun;38(6):1113-25. doi: 10.1109/TPAMI.2015.2476813. Epub 2015 Sep 4.

Cited by

1
Hierarchical Active Learning with Label Proportions on Data Regions.
IEEE Trans Knowl Data Eng. 2024 Dec;36(12):8434-8446. doi: 10.1109/tkde.2024.3419588.
2
Hierarchical Active Learning With Qualitative Feedback on Regions.
IEEE Trans Hum Mach Syst. 2023 Jun;53(3):581-589. doi: 10.1109/thms.2023.3252815. Epub 2023 Mar 23.
