SyMIL: MinMax Latent SVM for Weakly Labeled Data.

Author information

Durand Thibaut, Thome Nicolas, Cord Matthieu

Publication information

IEEE Trans Neural Netw Learn Syst. 2018 Dec;29(12):6099-6112. doi: 10.1109/TNNLS.2018.2820055. Epub 2018 Apr 23.

Abstract

Designing powerful models able to handle weakly labeled data is a crucial problem in machine learning. In this paper, we propose a new multiple instance learning (MIL) framework. Examples are represented as bags of instances, but we depart from standard MIL assumptions by introducing a symmetric strategy (SyMIL) that seeks discriminative instances in both positive and negative bags. The idea is to classify each bag using the instance most distant from the hyperplane. We provide a theoretical analysis of the generalization properties of our model. We derive a large-margin formulation of our problem, which is cast as a difference of convex functions and optimized using the concave-convex procedure. We provide a primal version optimized with stochastic subgradient descent and a dual version optimized with a one-slack cutting-plane method. Successful experimental results are reported on standard MIL and weakly supervised object detection data sets: SyMIL significantly outperforms competitive methods (mi/MI/Latent-SVM) and performs very competitively with state-of-the-art works. We also analyze the instances selected by symmetric and asymmetric approaches on weakly supervised object detection and text classification tasks. Finally, we show the complementarity of SyMIL with recent works on learning with label proportions on standard MIL data sets.
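To make the bag-level decision rule concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear scorer with weight vector `w` and bias `b`, and every function name here is hypothetical. It contrasts the symmetric rule stated in the abstract (use the instance farthest from the hyperplane, on either side) with the usual asymmetric MIL rule (use the highest-scoring instance).

```python
def instance_scores(bag, w, b=0.0):
    """Signed score w.x + b for each instance in the bag.

    For a fixed w this is proportional to the signed distance
    of the instance to the hyperplane. Linear scorer is an assumption.
    """
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for x in bag]


def symmetric_bag_score(bag, w, b=0.0):
    """Symmetric rule: score of the instance with the largest absolute score."""
    return max(instance_scores(bag, w, b), key=abs)


def max_bag_score(bag, w, b=0.0):
    """Asymmetric (standard MIL) rule: score of the highest-scoring instance."""
    return max(instance_scores(bag, w, b))


def predict(score):
    """Bag label from the sign of the bag score."""
    return 1 if score >= 0 else -1


# Toy bag with one weakly positive and one strongly negative instance.
bag = [[0.5, 0.0], [-2.0, 0.0]]
w = [1.0, 0.0]

print(predict(symmetric_bag_score(bag, w)))  # strongly negative instance decides
print(predict(max_bag_score(bag, w)))        # weakly positive instance decides
```

On this toy bag the two rules disagree: the symmetric rule lets the strongly negative instance determine the label, whereas the max rule labels the bag positive on the strength of a single weakly positive instance.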

