
Active Learning From Imbalanced Data: A Solution of Online Weighted Extreme Learning Machine.

Authors

Yu Hualong, Yang Xibei, Zheng Shang, Sun Changyin

Publication information

IEEE Trans Neural Netw Learn Syst. 2019 Apr;30(4):1088-1103. doi: 10.1109/TNNLS.2018.2855446. Epub 2018 Aug 21.

DOI:10.1109/TNNLS.2018.2855446
PMID:30137013
Abstract

It is well known that active learning can simultaneously improve the quality of the classification model and decrease the complexity of training instances. However, several previous studies have indicated that the performance of active learning is easily disrupted by an imbalanced data distribution. Some existing imbalanced active learning approaches also suffer from either low performance or high time consumption. To address these problems, this paper describes an efficient solution based on the extreme learning machine (ELM) classification model, called active online-weighted ELM (AOW-ELM). The main contributions of this paper include: 1) the reasons why active learning can be disrupted by an imbalanced instance distribution and its influencing factors are discussed in detail; 2) the hierarchical clustering technique is adopted to select initially labeled instances in order to avoid the missed cluster effect and cold start phenomenon as much as possible; 3) the weighted ELM (WELM) is selected as the base classifier to guarantee the impartiality of instance selection in the procedure of active learning, and an efficient online updated mode of WELM is deduced in theory; and 4) an early stopping criterion that is similar to but more flexible than the margin exhaustion criterion is presented. The experimental results on 32 binary-class data sets with different imbalance ratios demonstrate that the proposed AOW-ELM algorithm is more effective and efficient than several state-of-the-art active learning algorithms that are specifically designed for the class imbalance scenario.
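The abstract names a weighted ELM base classifier and an active-learning loop but gives no formulas. The sketch below is a minimal illustration under stated assumptions, not the paper's AOW-ELM: it uses the commonly cited batch WELM solution, beta = (I/C + H^T W H)^(-1) H^T W T, with per-instance weights set to the inverse of the class size, and it queries the pool instance whose output is closest to the decision boundary. The function names, the tanh activation, the hyperparameters, and the synthetic data are all hypothetical. The online update mode, the hierarchical-clustering initialization, and the early stopping criterion described in the abstract are not reproduced here.

```python
import numpy as np

def hidden(X, W_in, b):
    # Random-feature hidden layer; tanh is an assumed activation.
    return np.tanh(X @ W_in + b)

def train_welm(X, y, n_hidden=40, C=10.0, seed=0):
    """Batch weighted ELM for labels in {0, 1} (1 = minority class here)."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))  # fixed random input weights
    b = rng.normal(size=n_hidden)                   # fixed random biases
    H = hidden(X, W_in, b)

    # Assumed weighting scheme: each instance gets 1 / (size of its class),
    # so both classes contribute equally to the loss.
    classes, counts = np.unique(y, return_counts=True)
    w = np.zeros(len(y))
    for c, n in zip(classes, counts):
        w[y == c] = 1.0 / n

    T = np.where(y == 1, 1.0, -1.0)  # targets in {-1, +1}
    HtW = H.T * w                    # H^T W with W = diag(w)
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return W_in, b, beta

def decision(model, X):
    W_in, b, beta = model
    return hidden(X, W_in, b) @ beta

def query_most_uncertain(model, X_pool):
    # Uncertainty sampling: pick the pool instance nearest the boundary.
    return int(np.argmin(np.abs(decision(model, X_pool))))

# Synthetic imbalanced example (90 majority vs 10 minority instances).
rng = np.random.default_rng(1)
X_lab = np.vstack([rng.normal(0, 1, (90, 2)), rng.normal(3, 1, (10, 2))])
y_lab = np.array([0] * 90 + [1] * 10)
X_pool = rng.normal(1.5, 1.5, (200, 2))

model = train_welm(X_lab, y_lab)
idx = query_most_uncertain(model, X_pool)
print("next instance to label:", idx)
```

In a full loop, the queried instance would be labeled, moved from the pool to the labeled set, and the model updated. This sketch would simply retrain from scratch at that point, which is the cost an online update is meant to avoid.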


Similar articles

1
Active Learning From Imbalanced Data: A Solution of Online Weighted Extreme Learning Machine.
IEEE Trans Neural Netw Learn Syst. 2019 Apr;30(4):1088-1103. doi: 10.1109/TNNLS.2018.2855446. Epub 2018 Aug 21.
2
Online sequential class-specific extreme learning machine for binary imbalanced learning.
Neural Netw. 2019 Nov;119:235-248. doi: 10.1016/j.neunet.2019.08.018. Epub 2019 Aug 23.
3
Class-specific extreme learning machine for handling binary class imbalance problem.
Neural Netw. 2018 Sep;105:206-217. doi: 10.1016/j.neunet.2018.05.011. Epub 2018 May 22.
4
Classification of imbalanced bioinformatics data by using boundary movement-based ELM.
Biomed Mater Eng. 2015;26 Suppl 1:S1855-62. doi: 10.3233/BME-151488.
5
Meta-cognitive online sequential extreme learning machine for imbalanced and concept-drifting data classification.
Neural Netw. 2016 Aug;80:79-94. doi: 10.1016/j.neunet.2016.04.008. Epub 2016 Apr 28.
6
Research on air pollutant concentration prediction method based on self-adaptive neuro-fuzzy weighted extreme learning machine.
Environ Pollut. 2018 Oct;241:1115-1127. doi: 10.1016/j.envpol.2018.05.072. Epub 2018 Jun 23.
7
Imbalanced biomedical data classification using self-adaptive multilayer ELM combined with dynamic GAN.
Biomed Eng Online. 2018 Dec 4;17(1):181. doi: 10.1186/s12938-018-0604-3.
8
An efficient computational method for predicting drug-target interactions using weighted extreme learning machine and speed up robot features.
BioData Min. 2021 Jan 20;14(1):3. doi: 10.1186/s13040-021-00242-1.
9
[Drug discrimination by near infrared spectroscopy based on summation wavelet extreme learning machine].
Guang Pu Xue Yu Guang Pu Fen Xi. 2014 Oct;34(10):2815-20.
10
Discriminative clustering via extreme learning machine.
Neural Netw. 2015 Oct;70:1-8. doi: 10.1016/j.neunet.2015.06.002. Epub 2015 Jun 19.

Cited by

1
Rethinking Domain-Specific Pretraining by Supervised or Self-Supervised Learning for Chest Radiograph Classification: A Comparative Study Against ImageNet Counterparts in Cold-Start Active Learning.
Health Care Sci. 2025 Apr 6;4(2):110-143. doi: 10.1002/hcs2.70009. eCollection 2025 Apr.
2
Development and validation of prediction models for hypertension risks: A cross-sectional study based on 4,287,407 participants.
Front Cardiovasc Med. 2022 Sep 26;9:928948. doi: 10.3389/fcvm.2022.928948. eCollection 2022.
3
Novel Insights on Establishing Machine Learning-Based Stroke Prediction Models Among Hypertensive Adults.
Front Cardiovasc Med. 2022 May 6;9:901240. doi: 10.3389/fcvm.2022.901240. eCollection 2022.
4
Simulations to Assess the Performance of Multifactor Risk Scores for Predicting Myopia Prevalence in Children and Adolescents in China.
Front Genet. 2022 Apr 11;13:861164. doi: 10.3389/fgene.2022.861164. eCollection 2022.
5
A Machine Learning Based Framework to Identify and Classify Non-alcoholic Fatty Liver Disease in a Large-Scale Population.
Front Public Health. 2022 Apr 4;10:846118. doi: 10.3389/fpubh.2022.846118. eCollection 2022.
6
Machine learning model for predicting acute kidney injury progression in critically ill patients.
BMC Med Inform Decis Mak. 2022 Jan 19;22(1):17. doi: 10.1186/s12911-021-01740-2.
7
Comparison of the Meta-Active Machine Learning Model Applied to Biological Data-Driven Experiments with Other Models.
J Healthc Eng. 2021 Dec 13;2021:8014850. doi: 10.1155/2021/8014850. eCollection 2021.
8
Active Semisupervised Model for Improving the Identification of Anticancer Peptides.
ACS Omega. 2021 Sep 8;6(37):23998-24008. doi: 10.1021/acsomega.1c03132. eCollection 2021 Sep 21.
9
Identification of Potential Type II Diabetes in a Large-Scale Chinese Population Using a Systematic Machine Learning Framework.
J Diabetes Res. 2020 Sep 24;2020:6873891. doi: 10.1155/2020/6873891. eCollection 2020.
10
Active semi-supervised learning for biological data classification.
PLoS One. 2020 Aug 19;15(8):e0237428. doi: 10.1371/journal.pone.0237428. eCollection 2020.