

Bidimensionally partitioned online sequential broad learning system for large-scale data stream modeling.

Author Information

Guo Wei, Yu Jianjiang, Zhou Caigen, Yuan Xiaofeng, Wang Zhanxiu

Affiliations

Jiangsu Provincial University Key Lab of Child Cognitive Development and Mental Health, Yancheng Teachers University, Yancheng, 224002, China.

College of Information Engineering, Yancheng Teachers University, Yancheng, 224002, China.

Publication Information

Sci Rep. 2024 Dec 30;14(1):32009. doi: 10.1038/s41598-024-83563-5.

Abstract

Incremental broad learning system (IBLS) is an effective and efficient incremental learning method based on the broad learning paradigm. Owing to its streamlined network architecture and flexible dynamic update scheme, IBLS can rapidly reconstruct the model incrementally on the basis of the previous one without retraining from scratch, which makes it adept at handling streaming data. However, two prominent deficiencies persist in IBLS and constrain its further adoption in large-scale data stream scenarios. First, IBLS must retain all historical data and perform the associated calculations during incremental learning, so its computational overhead and storage burden grow over time, which puts the efficacy of the algorithm at risk for massive or unbounded data streams. Second, owing to the random generation rule of its hidden nodes, IBLS generally requires a large network size to guarantee approximation accuracy, and the resulting high-dimensional matrix calculations pose a further challenge to the model's updating efficiency. To address these issues, this paper proposes a novel bidimensionally partitioned online sequential broad learning system (BPOSBLS). The core idea of BPOSBLS is to partition the high-dimensional broad feature matrix bidimensionally, along both the instance dimension and the feature dimension, thereby decomposing one large least-squares problem into multiple smaller ones that can be solved individually. Doing so substantially reduces the scale and computational complexity of the original high-order model, significantly improving its learning efficiency and usability for large-scale complex learning tasks. Meanwhile, an ingenious recursive computation method called partitioned recursive least squares is devised to solve BPOSBLS. This method uses only the current online samples for iterative updating and discards previous historical samples, rendering BPOSBLS a lightweight online sequential learning algorithm with consistently low computational costs and storage requirements. Theoretical analyses and simulation experiments demonstrate the effectiveness and superiority of the proposed algorithm.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7174/11685709/bfad5dfeccce/41598_2024_83563_Fig1_HTML.jpg
