
Crowdsourced Label Aggregation Using Bilayer Collaborative Clustering.

Publication information

IEEE Trans Neural Netw Learn Syst. 2019 Oct;30(10):3172-3185. doi: 10.1109/TNNLS.2018.2890148. Epub 2019 Jan 25.

DOI:10.1109/TNNLS.2018.2890148
PMID:30703041
Abstract

Online crowdsourcing platforms make it possible to acquire labels at relatively low cost from large numbers of nonexpert workers. To improve the quality of labels obtained from these imperfect workers, different workers are usually asked to label the same instance, and the true labels for all instances are then estimated from the multiple noisy labels. This traditional general-purpose label aggregation process relies solely on the collected noisy labels and cannot significantly improve the accuracy of the integrated labels when labeling quality is low. This paper proposes a novel bilayer collaborative clustering (BLCC) method for label aggregation in crowdsourcing. BLCC first generates conceptual-level features for the instances from their multiple noisy labels and infers the initial integrated labels by clustering these conceptual-level features. It then performs a second clustering on the physical-level features to form estimates of the true labels on the physical layer. The clustering results on both layers help track changes in the uncertainties of the instances. Finally, initial integrated labels that are likely to have been wrongly inferred on the conceptual layer can be corrected using the estimated labels on the physical layer. Across multiple label remedy rounds, the clustering processes on the two layers keep providing guidance information to each other. Experimental results on 12 real-world crowdsourcing data sets show that the proposed method achieves higher accuracy than state-of-the-art methods.
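The two-layer idea in the abstract can be illustrated with a small sketch. This is not the BLCC algorithm itself: the abstract does not say how the conceptual-level features are built, which clustering algorithm is used, or how uncertainty is measured. The sketch assumes per-class vote fractions as conceptual-level features, plain k-means as the clustering step on both layers, and a hypothetical `ambiguity` threshold for deciding which instances to revisit. All function and parameter names (`vote_fractions`, `kmeans`, `two_layer_aggregate`, `ambiguity`, `n_rounds`) are illustrative.

```python
def vote_fractions(noisy_labels, n_classes):
    """Assumed conceptual-level features: fraction of votes per class (-1 = no label)."""
    feats = []
    for row in noisy_labels:
        counts = [sum(1 for v in row if v == k) for k in range(n_classes)]
        total = max(sum(counts), 1)
        feats.append([c / total for c in counts])
    return feats


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(X, centers, n_iter=20):
    """Plain k-means from given initial centers; returns assignments and squared distances."""
    centers = [list(c) for c in centers]
    for _ in range(n_iter):
        assign = [min(range(len(centers)), key=lambda k: sq_dist(x, centers[k])) for x in X]
        for k in range(len(centers)):
            members = [x for x, a in zip(X, assign) if a == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = mean(members)
    dists = [[sq_dist(x, c) for c in centers] for x in X]
    return [d.index(min(d)) for d in dists], dists


def two_layer_aggregate(noisy_labels, features, n_classes, ambiguity=0.05, n_rounds=3):
    # Conceptual layer: cluster the vote-fraction vectors, seeding cluster k at the
    # one-hot vector of class k so that cluster indices line up with class indices.
    concept = vote_fractions(noisy_labels, n_classes)
    corners = [[1.0 if j == k else 0.0 for j in range(n_classes)] for k in range(n_classes)]
    labels, dists = kmeans(concept, corners)

    # Assumed uncertainty measure: an instance is uncertain when its nearest
    # conceptual cluster is not much closer than the second nearest.
    uncertain = []
    for d in dists:
        nearest, second = sorted(d)[:2]
        uncertain.append(nearest / max(nearest + second, 1e-12) > ambiguity)

    # Physical layer: cluster the raw features, seeded by the current labels,
    # and let the result overwrite only the uncertain instances.
    overall = mean(features)
    for _ in range(n_rounds):
        seeds = []
        for k in range(n_classes):
            members = [f for f, y in zip(features, labels) if y == k]
            seeds.append(mean(members) if members else overall)
        physical, _ = kmeans(features, seeds)
        revised = [p if u else y for p, u, y in zip(physical, uncertain, labels)]
        if revised == labels:
            break
        labels = revised
    return labels
```

In this simplified version the set of uncertain instances is fixed after the conceptual-layer clustering, and only the physical-layer seeds change between rounds. The abstract describes a richer exchange in which both layers keep guiding each other, so the sketch should be read as an illustration of the correction step only.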


