
Entropy-based dynamic ensemble classification algorithm for imbalanced data stream with concept drift.

Authors

Gong JiaMing, Dong MingGang

Affiliations

College of Data Science, Guangzhou Huashang College, Guangzhou, Guangdong, China.

St. Paul University Philippines, Province of Cagayan, Tuguegarao City, Philippines.

Publication information

PLoS One. 2024 Dec 13;19(12):e0311133. doi: 10.1371/journal.pone.0311133. eCollection 2024.

DOI:10.1371/journal.pone.0311133
PMID:39671400
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11643253/
Abstract

Online imbalanced learning is an emerging topic that combines the challenges of class imbalance and concept drift. However, most current works address class imbalance and concept drift in isolation, and only a few consider the two issues simultaneously. To this end, this paper proposes an entropy-based dynamic ensemble classification algorithm (EDAC) for data streams that exhibit both class imbalance and concept drift. First, to address imbalanced learning in training data chunks that arrive at different times, EDAC adopts an entropy-based balancing strategy: it divides each data chunk into multiple balanced sample pairs based on the differences in information entropy between the classes in the chunk. Second, to improve classification accuracy, we propose a density-based sampling method that separates minority class samples into high-quality samples and common samples according to the density of similar samples. High-quality and common samples are then randomly selected for training the classifier. Finally, to handle concept drift, EDAC designs and implements an ensemble classifier that uses a self-feedback strategy to determine the initial weight of the classifier by adjusting the weight of each sub-classifier according to its performance on the arrived data chunks. The experimental results demonstrate that EDAC outperforms five state-of-the-art algorithms on four synthetic data streams and one real-world data stream.
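The abstract does not give EDAC's formulas, so the following Python sketch is only an illustration of the three ideas it names, not the authors' implementation. It assumes Shannon entropy of the class distribution as the imbalance signal, mean distance to the k nearest minority neighbours as the density measure, a median cutoff between "high-quality" and "common" samples, and a smoothed accuracy update for sub-classifier weights. All function names, the parameter `k`, the cutoff, and the rate `lr` are hypothetical.

```python
import math
from collections import Counter


def class_entropy(labels):
    """Shannon entropy (in bits) of the class distribution in one data chunk.

    A balanced binary chunk gives 1.0; a heavily skewed chunk gives a value near 0.
    """
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def split_minority_by_density(minority, k=3):
    """Split minority samples into (high_quality, common) by local density.

    Assumption: density is approximated by the mean distance to the k nearest
    minority neighbours, and the median score is used as the cutoff.
    """
    if len(minority) < 2:
        return list(minority), []

    def mean_knn_dist(i):
        dists = sorted(
            math.dist(minority[i], q) for j, q in enumerate(minority) if j != i
        )
        nearest = dists[:k]
        return sum(nearest) / len(nearest)

    scores = [mean_knn_dist(i) for i in range(len(minority))]
    cutoff = sorted(scores)[len(scores) // 2]
    high = [p for p, s in zip(minority, scores) if s <= cutoff]
    common = [p for p, s in zip(minority, scores) if s > cutoff]
    return high, common


def update_weights(weights, accuracies, lr=0.5):
    """Move each sub-classifier weight toward its accuracy on the newest chunk.

    Assumption: a simple smoothed update followed by normalisation stands in
    for the self-feedback strategy mentioned in the abstract.
    """
    raw = [(1 - lr) * w + lr * a for w, a in zip(weights, accuracies)]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across sub-classifiers."""
    scores = {}
    for label, w in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)
```

In a chunk-based loop, one would compute `class_entropy` on each arriving chunk to decide how much rebalancing is needed, draw minority samples from the two groups returned by `split_minority_by_density`, and call `update_weights` after evaluating every sub-classifier on the new chunk before combining predictions with `weighted_vote`.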


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/3a67dda5e26b/pone.0311133.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/da5df32ce002/pone.0311133.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/44f8914feb29/pone.0311133.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/4419bd61bfb7/pone.0311133.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/84bf33aaf999/pone.0311133.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/75bf980ac3ec/pone.0311133.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/778c241e4e55/pone.0311133.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/5bdf5655d515/pone.0311133.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/ba6ffbdb92fe/pone.0311133.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/4163303864eb/pone.0311133.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/14c1173a0fc8/pone.0311133.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96ff/11643253/459e8b140bf1/pone.0311133.g012.jpg

Similar articles

1
Entropy-based dynamic ensemble classification algorithm for imbalanced data stream with concept drift.
PLoS One. 2024 Dec 13;19(12):e0311133. doi: 10.1371/journal.pone.0311133. eCollection 2024.
2
Adaptive Chunk-Based Dynamic Weighted Majority for Imbalanced Data Streams With Concept Drift.
IEEE Trans Neural Netw Learn Syst. 2020 Aug;31(8):2764-2778. doi: 10.1109/TNNLS.2019.2951814. Epub 2019 Dec 5.
3
Dynamic Ensemble Selection for Imbalanced Data Streams With Concept Drift.
IEEE Trans Neural Netw Learn Syst. 2022 Jun 22;PP. doi: 10.1109/TNNLS.2022.3183120.
4
An ensemble learning method with GAN-based sampling and consistency check for anomaly detection of imbalanced data streams with concept drift.
PLoS One. 2024 Jan 26;19(1):e0292140. doi: 10.1371/journal.pone.0292140. eCollection 2024.
5
A dynamic ensemble framework for mining textual streams with class imbalance.
ScientificWorldJournal. 2014;2014:497354. doi: 10.1155/2014/497354. Epub 2014 Apr 10.
6
Graph ensemble boosting for imbalanced noisy graph stream classification.
IEEE Trans Cybern. 2015 May;45(5):940-54. doi: 10.1109/TCYB.2014.2341031. Epub 2014 Aug 27.
7
Cost-Sensitive Classification for Evolving Data Streams with Concept Drift and Class Imbalance.
Comput Intell Neurosci. 2021 Aug 2;2021:8813806. doi: 10.1155/2021/8813806. eCollection 2021.
8
Hybrid Classifier Ensemble for Imbalanced Data.
IEEE Trans Neural Netw Learn Syst. 2020 Apr;31(4):1387-1400. doi: 10.1109/TNNLS.2019.2920246. Epub 2019 Jun 28.
9
Classifying Imbalanced Data Streams via Dynamic Feature Group Weighting with Importance Sampling.
Proc SIAM Int Conf Data Min. 2014 Apr;2014:722-730. doi: 10.1137/1.9781611973440.83.
10
Meta-cognitive online sequential extreme learning machine for imbalanced and concept-drifting data classification.
Neural Netw. 2016 Aug;80:79-94. doi: 10.1016/j.neunet.2016.04.008. Epub 2016 Apr 28.
