Dynamic Sparse Subspace Clustering for Evolving High-Dimensional Data Streams.

Publication information

IEEE Trans Cybern. 2022 Jun;52(6):4173-4186. doi: 10.1109/TCYB.2020.3023973. Epub 2022 Jun 16.

DOI:10.1109/TCYB.2020.3023973
PMID:33232249
Abstract

In an era of ubiquitous large-scale evolving data streams, data stream clustering (DSC) has received considerable attention because the scale of the data streams far exceeds the ability of expert human analysts. It has been observed that high-dimensional data are usually distributed in a union of low-dimensional subspaces. In this article, we propose a novel sparse representation-based DSC algorithm, called evolutionary dynamic sparse subspace clustering (EDSSC). It can cope with the time-varying nature of the subspaces underlying evolving data streams, such as subspace emergence, disappearance, and recurrence. The proposed EDSSC consists of two phases: 1) static learning and 2) online clustering. During the first phase, a data structure for storing the statistical summary of data streams, called the EDSSC summary, is proposed, which can better address the dilemma between two conflicting goals: 1) saving more points for the accuracy of subspace clustering (SC) and 2) discarding more points for the efficiency of DSC. By further proposing an algorithm to estimate the number of subspaces, the proposed EDSSC does not need to know that number in advance. In the second phase, a more suitable index, called the average sparsity concentration index (ASCI), is proposed, which substantially improves clustering accuracy compared with the conventionally used sparsity concentration index (SCI). In addition, a subspace evolution detection model based on the Page-Hinkley test is proposed, with which appearing, disappearing, and recurring subspaces can be detected and adapted to. Extensive experiments on real-world data streams show that EDSSC outperforms the state-of-the-art online SC approaches.
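The abstract names the Page-Hinkley test as the basis of the subspace evolution detector but does not say which statistic is monitored. The sketch below shows only the generic, textbook Page-Hinkley test for an upward shift in the mean of a scalar stream. It is not the EDSSC detection model. The class name `PageHinkley`, the helper `first_alarm`, the default values of `delta` and `threshold`, and the idea of feeding it a per-point score (for example, a representation residual) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PageHinkley:
    """Textbook Page-Hinkley test for an increase in a stream's mean.

    Illustrative sketch only; parameter defaults are assumed values,
    not settings taken from the EDSSC paper.
    """
    delta: float = 0.005     # magnitude of change that is tolerated
    threshold: float = 5.0   # alarm threshold (often written lambda)
    n: int = 0               # number of observations seen so far
    mean: float = 0.0        # running mean of the observations
    cum: float = 0.0         # cumulative deviation m_t
    cum_min: float = 0.0     # running minimum M_t of m_t

    def update(self, x: float) -> bool:
        """Consume one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def first_alarm(stream: Iterable[float], **params: float) -> Optional[int]:
    """Return the 0-based index of the first alarm, or None if none occurs."""
    detector = PageHinkley(**params)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Hypothetical per-point scores: stable at 0.1, then jumping to 1.0.
    scores = [0.1] * 100 + [1.0] * 20
    print("first alarm at index:", first_alarm(scores))
```

In a streaming setting, a detector of this kind would typically be reset after an alarm so that later changes can also be caught. How EDSSC distinguishes emergence, disappearance, and recurrence of subspaces is described in the paper itself and is not reproduced here.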

Similar articles

1
Dynamic Sparse Subspace Clustering for Evolving High-Dimensional Data Streams.
IEEE Trans Cybern. 2022 Jun;52(6):4173-4186. doi: 10.1109/TCYB.2020.3023973. Epub 2022 Jun 16.
2
Sparse subspace clustering: algorithm, theory, and applications.
IEEE Trans Pattern Anal Mach Intell. 2013 Nov;35(11):2765-81. doi: 10.1109/TPAMI.2013.57.
3
Robust auto-weighted multi-view subspace clustering with common subspace representation matrix.
PLoS One. 2017 May 23;12(5):e0176769. doi: 10.1371/journal.pone.0176769. eCollection 2017.
4
Online Sparse Representation Clustering for Evolving Data Streams.
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):525-539. doi: 10.1109/TNNLS.2023.3325556. Epub 2025 Jan 7.
5
An Online Semantic-Enhanced Graphical Model for Evolving Short Text Stream Clustering.
IEEE Trans Cybern. 2022 Dec;52(12):13809-13820. doi: 10.1109/TCYB.2021.3108897. Epub 2022 Nov 18.
6
Structured Sparse Subspace Clustering: A Joint Affinity Learning and Subspace Clustering Framework.
IEEE Trans Image Process. 2017 Jun;26(6):2988-3001. doi: 10.1109/TIP.2017.2691557. Epub 2017 Apr 6.
7
Sparse subspace clustering for data with missing entries and high-rank matrix completion.
Neural Netw. 2017 Sep;93:36-44. doi: 10.1016/j.neunet.2017.04.005. Epub 2017 Apr 25.
8
Self-organizing subspace clustering for high-dimensional and multi-view data.
Neural Netw. 2020 Oct;130:253-268. doi: 10.1016/j.neunet.2020.06.022. Epub 2020 Jul 3.
9
Robust Elastic-Net Subspace Representation.
IEEE Trans Image Process. 2016 Sep;25(9):4245-4259. doi: 10.1109/TIP.2016.2588321. Epub 2016 Jul 7.
10
Exploiting Unsupervised and Supervised Constraints for Subspace Clustering.
IEEE Trans Pattern Anal Mach Intell. 2015 Aug;37(8):1542-57. doi: 10.1109/TPAMI.2014.2377740.

Cited by

1
Hybrid lion and exponential PSO-based metaheuristic clustering approach for efficient dynamic data stream management.
Sci Rep. 2025 Jul 1;15(1):22343. doi: 10.1038/s41598-025-07404-9.
2
Comprehensive analysis of clustering algorithms: exploring limitations and innovative solutions.
PeerJ Comput Sci. 2024 Aug 29;10:e2286. doi: 10.7717/peerj-cs.2286. eCollection 2024.