Active learning for classifying data streams with unknown number of classes.

Affiliations

Department of Computing, Bournemouth University, Poole, UK; Institute Mines Telecom Lille Douai, Douai, France.

Institute Mines Telecom Lille Douai, Douai, France.

Publication information

Neural Netw. 2018 Feb;98:1-15. doi: 10.1016/j.neunet.2017.10.004. Epub 2017 Oct 27.

DOI: 10.1016/j.neunet.2017.10.004
PMID: 29145086

Abstract

Classifying data streams is an interesting but challenging problem. A data stream may grow without bound, making it impractical to store before processing and classification. Because the stream is dynamic, its underlying distribution may change over time, resulting in so-called concept drift, or classes may emerge and fade, which is known as concept evolution. In addition, acquiring labels for data samples in a stream is expensive, if not infeasible. In this paper, we propose a novel stream-based active learning algorithm (SAL) that copes with both concept drift and concept evolution by adapting the classification model to the dynamic changes in the stream. SAL is the first active learning (AL) algorithm in the literature to explicitly take account of these concepts. Moreover, SAL queries only the labels of samples that are expected to reduce the expected future error. It does so while tackling the problem of sampling bias, so that samples that induce the change (i.e., drifting samples or samples coming from new classes) are queried. To implement SAL efficiently, the paper proposes applying non-parametric Bayesian models, which make it possible to cope with the lack of prior knowledge about the data stream. In particular, Dirichlet mixture models and the stick-breaking process are adopted and adapted to meet the requirements of online learning. Empirical results obtained on real-world benchmarks demonstrate the superiority of SAL over state-of-the-art methods in terms of classification performance, measured by average accuracy and average class accuracy.
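The abstract names the stick-breaking process but gives no construction details. As a rough illustration only, the sketch below draws mixture weights from the standard truncated stick-breaking construction, where each fraction is sampled from Beta(1, alpha) and applied to the part of the stick that is still unassigned. The function name, the concentration value `alpha`, and the truncation level `num_sticks` are assumptions made for this example; this is not the paper's SAL implementation or its online adaptation.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng=None):
    """Draw the first `num_sticks` weights of a truncated stick-breaking process.

    Hypothetical helper for illustration; not taken from the paper.
    Returns (weights, remaining), where `remaining` is the stick length
    that has not been assigned to any component yet.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = rng or random.Random()
    weights = []
    remaining = 1.0  # unassigned part of the unit-length stick
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, alpha)  # beta_k ~ Beta(1, alpha)
        weights.append(remaining * fraction)    # pi_k = beta_k * prod_{j<k} (1 - beta_j)
        remaining *= 1.0 - fraction             # shrink the stick for the next break
    return weights, remaining


if __name__ == "__main__":
    # Assumed values: alpha=2.0 and 10 components, with a fixed seed for repeatability.
    weights, leftover = stick_breaking_weights(2.0, 10, random.Random(0))
    print([round(w, 3) for w in weights], round(leftover, 3))
```

The weights and the leftover mass always sum to 1. The leftover can be read as the share of the prior still available to components that have not been created yet, which is one reason such priors suit streams where the number of classes is unknown in advance.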

Similar articles

1. Active learning for classifying data streams with unknown number of classes.
   Neural Netw. 2018 Feb;98:1-15. doi: 10.1016/j.neunet.2017.10.004. Epub 2017 Oct 27.
2. A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams.
   IEEE Trans Neural Netw Learn Syst. 2018 Jan;29(1):74-86. doi: 10.1109/TNNLS.2016.2614393. Epub 2016 Oct 21.
3. Active learning from stream data using optimal weight classifier ensemble.
   IEEE Trans Syst Man Cybern B Cybern. 2010 Dec;40(6):1607-21. doi: 10.1109/TSMCB.2010.2042445. Epub 2010 Apr 1.
4. Reacting to different types of concept drift: the Accuracy Updated Ensemble algorithm.
   IEEE Trans Neural Netw Learn Syst. 2014 Jan;25(1):81-94. doi: 10.1109/TNNLS.2013.2251352.
5. Classification of the drifting data streams using heterogeneous diversified dynamic class-weighted ensemble.
   PeerJ Comput Sci. 2021 Apr 1;7:e459. doi: 10.7717/peerj-cs.459. eCollection 2021.
6. A dynamic ensemble framework for mining textual streams with class imbalance.
   ScientificWorldJournal. 2014;2014:497354. doi: 10.1155/2014/497354. Epub 2014 Apr 10.
7. An Adaptive Deep Learning Framework for Dynamic Image Classification in the Internet of Things Environment.
   Sensors (Basel). 2020 Oct 14;20(20):5811. doi: 10.3390/s20205811.
8. Evolving Spiking Neural Networks for online learning over drifting data streams.
   Neural Netw. 2018 Dec;108:1-19. doi: 10.1016/j.neunet.2018.07.014. Epub 2018 Aug 2.
9. Adaptive Chunk-Based Dynamic Weighted Majority for Imbalanced Data Streams With Concept Drift.
   IEEE Trans Neural Netw Learn Syst. 2020 Aug;31(8):2764-2778. doi: 10.1109/TNNLS.2019.2951814. Epub 2019 Dec 5.
10. Online Active Learning Ensemble Framework for Drifted Data Streams.
    IEEE Trans Neural Netw Learn Syst. 2019 Feb;30(2):486-498. doi: 10.1109/TNNLS.2018.2844332. Epub 2018 Jul 2.