Du Jie, Liu Peng, Vong Chi-Man, Chen Chuangquan, Wang Tianfu, Chen C L Philip
IEEE Trans Neural Netw Learn Syst. 2024 Aug;35(8):11332-11345. doi: 10.1109/TNNLS.2023.3259016. Epub 2024 Aug 5.
Machine learning aims to generate a predictive model from a training dataset of a fixed number of known classes. However, many real-world applications (such as health monitoring and elderly care) are data streams in which new data arrive continually in a short time. Such new data may even belong to previously unknown classes. Hence, class-incremental learning (CIL) is necessary, which incrementally and rapidly updates an existing model with the data of new classes while retaining the existing knowledge of old classes. However, most current CIL methods are designed based on deep models that require a computationally expensive training and update process. In addition, deep learning based CIL (DCIL) methods typically employ stochastic gradient descent (SGD) as an optimizer, which forgets the old knowledge to a certain extent. In this article, a broad learning system-based CIL (BLS-CIL) method with fast updates and high retainability of old class knowledge is proposed. The traditional BLS is a fast and effective shallow neural network, but it does not work well on CIL tasks. Our proposed BLS-CIL overcomes these issues and provides the following: 1) high accuracy due to our novel class-correlation loss function that considers the correlations between old and new classes; 2) significantly shorter training/update times due to the newly derived closed-form solution for our class-correlation loss, which requires no iterative optimization; and 3) high retainability of old class knowledge due to our newly derived recursive update rule for CIL (RULL), which does not replay the exemplars of all old classes, in contrast to exemplar-replaying methods with the SGD optimizer. The proposed BLS-CIL has been evaluated over 12 real-world datasets, including seven tabular/numerical datasets and six image datasets, and the compared methods include one shallow network and seven classical or state-of-the-art DCIL methods.
Experimental results show that our BLS-CIL significantly improves classification performance over the shallow network (by 8.80%-48.42%). It also achieves comparable or even higher accuracy than the DCIL methods, while greatly reducing training time from hours to minutes and update time from minutes to seconds.
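The core mechanism the abstract describes, a shallow network whose output weights have a closed-form ridge-style solution that can be updated recursively from accumulated statistics when a new class arrives, without replaying old exemplars, can be illustrated with a minimal sketch. This is not the paper's actual BLS-CIL algorithm (the class-correlation loss and RULL update are not reproduced here); it is a generic random-feature network with a ridge solution and a sufficient-statistics update, and all names and dimensions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(X, W, b):
    # Fixed random feature mapping, standing in for the BLS feature/enhancement nodes.
    return np.tanh(X @ W + b)

# --- Initial training on two "old" classes (closed-form ridge solution) ---
d, m, lam = 4, 32, 1e-2                  # input dim, feature dim, ridge strength
W, b = rng.normal(size=(d, m)), rng.normal(size=m)

X0 = rng.normal(size=(200, d))
y0 = (X0[:, 0] > 0).astype(int)          # toy labels for classes 0 and 1
Y0 = np.eye(2)[y0]                       # one-hot targets

H0 = features(X0, W, b)
K = H0.T @ H0 + lam * np.eye(m)          # running Gram matrix (sufficient statistic)
P = H0.T @ Y0                            # running feature-target cross term
Wout = np.linalg.solve(K, P)             # closed-form output weights, no SGD

# --- Incremental update with a new class, touching only the new data ---
X1 = rng.normal(size=(80, d))
Y1 = np.tile([0.0, 0.0, 1.0], (80, 1))   # all new samples belong to class 2

H1 = features(X1, W, b)
K += H1.T @ H1                           # accumulate statistics of the new batch only
P = np.hstack([P, np.zeros((m, 1))])     # widen targets for the new class column
P += H1.T @ Y1
Wout = np.linalg.solve(K, P)             # refreshed weights; old exemplars never replayed

# Old-class knowledge is retained without storing X0/Y0 beyond their statistics.
pred_old = features(X0, W, b) @ Wout
acc_old = (pred_old.argmax(axis=1) == y0).mean()
```

Because the closed-form solve depends on the data only through `K` and `P`, the update cost scales with the new batch rather than the full history, which mirrors the "update time from minutes to seconds" claim in the abstract.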