A globally-variant locally-constant model for fusion of labels from multiple diverse experts without using reference labels.

Affiliation

Electrical Engineering Department, University of Southern California, 3740 McClintock Avenue, Los Angeles, CA 90089-2564, USA.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2013 Apr;35(4):769-83. doi: 10.1109/TPAMI.2012.139.

Abstract

Researchers have shown that fusion of categorical labels from multiple experts—humans or machine classifiers—improves the accuracy and generalizability of the overall classification system. Simple plurality is a popular technique for performing this fusion, but it gives equal importance to labels from all experts, who may not be equally reliable or consistent across the dataset. Estimation of expert reliability without knowing the reference labels is, however, a challenging problem. Most previous works deal with these challenges by modeling expert reliability as constant over the entire data (feature) space. This paper presents a model based on the consideration that in dealing with real-world data, expert reliability is variable over the complete feature space but constant over local clusters of homogeneous instances. This model jointly learns a classifier and expert reliability parameters without assuming knowledge of the reference labels using the Expectation-Maximization (EM) algorithm. Classification experiments on simulated data, data from the UCI Machine Learning Repository, and two emotional speech classification datasets show the benefits of the proposed model. Using a metric based on the Jensen-Shannon divergence, we empirically show that the proposed model gives greater benefit for datasets where expert reliability is highly variable over the feature space.
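
The contrast between plurality voting and locally constant reliability can be illustrated with a small sketch. This is not the paper's model: it omits the jointly learned classifier, assumes binary labels, assumes cluster assignments are already given (for example from a separate clustering step), and uses a single symmetric accuracy per cluster and expert. The names `simulate`, `plurality_vote`, and `em_fusion` are hypothetical, and the simulated accuracies are made up for illustration.

```python
import numpy as np

def simulate(n=4000, seed=0):
    """Toy data (illustrative assumption): 5 experts, 2 clusters, binary labels.
    Each expert is reliable in one cluster and weak in the other."""
    rng = np.random.default_rng(seed)
    truth = rng.integers(0, 2, size=n)
    clusters = rng.integers(0, 2, size=n)
    true_acc = np.array([[0.95, 0.90, 0.60, 0.60, 0.60],
                         [0.60, 0.60, 0.60, 0.90, 0.95]])
    correct = rng.random((n, 5)) < true_acc[clusters]
    labels = np.where(correct, truth[:, None], 1 - truth[:, None])
    return labels, clusters, truth

def plurality_vote(labels):
    """Equal-weight fusion: the majority label per instance (odd expert count)."""
    return (labels.mean(axis=1) > 0.5).astype(int)

def em_fusion(labels, clusters, n_iter=50):
    """EM with one accuracy parameter per (cluster, expert); no reference labels.

    labels:   (n, R) array of 0/1 expert labels
    clusters: (n,) array of cluster ids in 0..K-1 (assumed given)
    Returns P(true label = 1) per instance and the (K, R) accuracy table.
    """
    n, R = labels.shape
    K = int(clusters.max()) + 1
    # Initialise the posterior with a softened plurality vote.
    post = np.clip(labels.mean(axis=1), 0.05, 0.95)
    acc = np.full((K, R), 0.5)
    for _ in range(n_iter):
        # M-step: class prior and smoothed per-cluster expert accuracies.
        prior = np.clip(post.mean(), 1e-6, 1 - 1e-6)
        for k in range(K):
            m = clusters == k
            p = post[m][:, None]
            # Expected agreement of each expert with the hidden true label.
            agree = p * labels[m] + (1 - p) * (1 - labels[m])
            acc[k] = (agree.sum(axis=0) + 1.0) / (m.sum() + 2.0)
        # E-step: posterior over the true label given the current parameters.
        a = acc[clusters]
        log1 = np.log(prior) + np.where(labels == 1, np.log(a), np.log(1 - a)).sum(axis=1)
        log0 = np.log(1 - prior) + np.where(labels == 0, np.log(a), np.log(1 - a)).sum(axis=1)
        post = 1.0 / (1.0 + np.exp(np.clip(log0 - log1, -50, 50)))
    return post, acc
```

Calling `em_fusion(*simulate()[:2])` returns the fused posteriors and the estimated accuracy table, which can be compared against `plurality_vote` on the same labels. Estimating a separate accuracy row per cluster is what lets the fusion trust different experts in different regions of the feature space.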
