Semi-Supervised Text Classification With Universum Learning.

Publication information

IEEE Trans Cybern. 2016 Feb;46(2):462-73. doi: 10.1109/TCYB.2015.2403573. Epub 2015 Feb 27.

DOI: 10.1109/TCYB.2015.2403573
PMID: 25730839
Abstract

Universum, a collection of nonexamples that do not belong to any class of interest, has become a new research topic in machine learning. This paper devises a semi-supervised learning algorithm with Universum based on the boosting technique, and focuses on situations where only a few labeled examples are available. We also show that the training error of AdaBoost with Universum is bounded by the product of the normalization factors, and that the training error drops exponentially fast when each weak classifier is slightly better than random guessing. Finally, the experiments use four data sets in several combinations. Experimental results indicate that the proposed algorithm can benefit from Universum examples and outperform several alternative methods, particularly when insufficient labeled examples are available. When the number of labeled examples is insufficient to estimate the parameters of classification functions, the Universum can be used to approximate the prior distribution of the classification functions. The experimental results can be explained using the concept of Universum introduced by Vapnik, that is, Universum examples implicitly specify a prior distribution on the set of classification functions.
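
The bound stated in the abstract resembles a well-known property of standard AdaBoost: the training error is at most the product of the per-round normalization factors Z_t. The sketch below illustrates that property only. It is plain binary AdaBoost with threshold stumps on a made-up toy data set, with no Universum examples and no unlabeled data, since the abstract does not give the paper's actual update rule or normalization factors. All function names and the toy data are assumptions introduced for illustration.

```python
import math
import random


def stump_predict(x, feature, threshold, sign):
    """Threshold stump: returns `sign` if x[feature] > threshold, else -sign."""
    return sign if x[feature] > threshold else -sign


def train_adaboost(X, y, rounds):
    """Plain binary AdaBoost (labels in {-1, +1}) with exhaustive stump search.

    Returns the ensemble and the per-round normalization factors Z_t.
    This is standard AdaBoost, not the Universum variant from the paper.
    """
    n = len(X)
    weights = [1.0 / n] * n
    ensemble, z_factors = [], []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted error.
        best = None
        for feature in range(len(X[0])):
            for threshold in sorted({x[feature] for x in X}):
                for sign in (1, -1):
                    err = sum(
                        w for w, x, label in zip(weights, X, y)
                        if stump_predict(x, feature, threshold, sign) != label
                    )
                    if best is None or err < best[0]:
                        best = (err, feature, threshold, sign)
        err, feature, threshold, sign = best
        err = min(max(err, 1e-12), 1.0 - 1e-12)  # keep the log finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        # Reweight, then normalize; Z_t is the normalization factor.
        weights = [
            w * math.exp(-alpha * label * stump_predict(x, feature, threshold, sign))
            for w, x, label in zip(weights, X, y)
        ]
        z = sum(weights)
        weights = [w / z for w in weights]
        ensemble.append((alpha, feature, threshold, sign))
        z_factors.append(z)
    return ensemble, z_factors


def training_error(ensemble, X, y):
    """Fraction of training points the weighted vote misclassifies."""
    wrong = 0
    for x, label in zip(X, y):
        score = sum(a * stump_predict(x, f, t, s) for a, f, t, s in ensemble)
        wrong += (1 if score > 0 else -1) != label
    return wrong / len(X)


def make_toy_data(n=40, seed=0):
    """Hypothetical toy data: 2-D points labeled by a diagonal boundary."""
    rng = random.Random(seed)
    X = [(rng.random(), rng.random()) for _ in range(n)]
    y = [1 if a + b > 1.0 else -1 for a, b in X]
    return X, y


X, y = make_toy_data()
ensemble, z_factors = train_adaboost(X, y, rounds=10)
print("training error:", training_error(ensemble, X, y))
print("product of Z_t:", math.prod(z_factors))
```

In this standard setting, each Z_t equals 2·sqrt(err_t·(1 − err_t)), which is below 1 whenever the weak classifier's weighted error err_t is below 0.5. The product, and with it the bound on the training error, therefore shrinks geometrically with the number of rounds. That is the sense in which the error "drops exponentially fast" when each weak classifier is slightly better than random guessing.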

Similar articles

1
Semi-Supervised Text Classification With Universum Learning.
IEEE Trans Cybern. 2016 Feb;46(2):462-73. doi: 10.1109/TCYB.2015.2403573. Epub 2015 Feb 27.
2
UBoost: boosting with the Universum.
IEEE Trans Pattern Anal Mach Intell. 2012 Apr;34(4):825-32. doi: 10.1109/TPAMI.2011.240.
3
Twin support vector machine with Universum data.
Neural Netw. 2012 Dec;36:112-9. doi: 10.1016/j.neunet.2012.09.004. Epub 2012 Oct 3.
4
Universum based Lagrangian twin bounded support vector machine to classify EEG signals.
Comput Methods Programs Biomed. 2021 Sep;208:106244. doi: 10.1016/j.cmpb.2021.106244. Epub 2021 Jun 24.
5
Semi-supervised linear discriminant clustering.
IEEE Trans Cybern. 2014 Jul;44(7):989-1000. doi: 10.1109/TCYB.2013.2278466. Epub 2013 Aug 27.
6
SemiBoost: boosting for semi-supervised learning.
IEEE Trans Pattern Anal Mach Intell. 2009 Nov;31(11):2000-14. doi: 10.1109/TPAMI.2008.235.
7
Inverse free reduced universum twin support vector machine for imbalanced data classification.
Neural Netw. 2023 Jan;157:125-135. doi: 10.1016/j.neunet.2022.10.003. Epub 2022 Oct 15.
8
Maximum margin semi-supervised learning with irrelevant data.
Neural Netw. 2015 Oct;70:90-102. doi: 10.1016/j.neunet.2015.06.004. Epub 2015 Jul 15.
9
Universum-Inspired Supervised Contrastive Learning.
IEEE Trans Image Process. 2023;32:4275-4286. doi: 10.1109/TIP.2023.3290514. Epub 2023 Jul 27.
10
SEG-SSC: a framework based on synthetic examples generation for self-labeled semi-supervised classification.
IEEE Trans Cybern. 2015 Apr;45(4):622-34. doi: 10.1109/TCYB.2014.2332003. Epub 2014 Jul 1.

Cited by

1
A review of semi-supervised learning for text classification.
Artif Intell Rev. 2023 Jan 31:1-69. doi: 10.1007/s10462-023-10393-8.
2
Construction of English and American Literature Corpus Based on Machine Learning Algorithm.
Comput Intell Neurosci. 2022 Jun 2;2022:9773452. doi: 10.1155/2022/9773452. eCollection 2022.
3
CLAD: A corpus-derived Chinese Lexical Association Database.
Behav Res Methods. 2019 Oct;51(5):2310-2336. doi: 10.3758/s13428-019-01208-2.