IEEE Trans Neural Netw Learn Syst. 2020 Dec;31(12):5192-5203. doi: 10.1109/TNNLS.2020.2964737. Epub 2020 Nov 30.
Oblique random forests (ObRFs) have attracted increasing attention recently. Their popularity is mainly driven by learning oblique hyperplanes instead of expensively searching for axis-aligned hyperplanes, as the standard random forest does. However, most existing methods are trained off-line, which assumes that the training data are given as a single batch. Efficient dual-incremental learning (DIL) strategies for ObRFs, which handle new inputs arriving from both existing classes and previously unseen classes, have rarely been explored. The goal of this article is to provide an ObRF with DIL capacity so that it can perform classification on the fly. First, we propose a batch multiclass ObRF (ObRF-BM) algorithm that uses a broad learning system and a multi-to-binary method to obtain an optimal oblique hyperplane in a higher dimensional space and then separates the samples into two supervised clusters at each node, which provides the basis for the subsequent incremental learning strategy. Then, the DIL strategy for ObRF-BM, termed ObRF-DIL, is developed by analytically updating the parameters of all nodes on the classification route of newly arriving samples and newly arriving classes, so that the ObRF-BM model can be updated effectively without laborious retraining from scratch. Experimental results on several public data sets demonstrate the superiority of the proposed approach over several state-of-the-art methods.
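The idea of an oblique split that can be updated analytically, without retraining, can be illustrated with a minimal sketch. It is not the authors' ObRF-BM or ObRF-DIL algorithm, whose details the abstract does not give. It assumes a single tree node that maps inputs through a fixed random nonlinear expansion (a loose stand-in for the broad learning system), reduces the multiclass labels to a ±1 target (a stand-in for the multi-to-binary step), and fits the hyperplane by ridge regression. The names `ObliqueNode`, `binary_targets`, `right_classes`, `n_hidden`, and `lam` are all hypothetical.

```python
import numpy as np


def binary_targets(y, right_classes):
    """Map class labels to +1 (right child) or -1 (left child).

    The grouping of classes into two sides is supplied by the caller here;
    how the paper chooses it is not stated in the abstract.
    """
    return np.where(np.isin(y, list(right_classes)), 1.0, -1.0)


class ObliqueNode:
    """One oblique split: random feature expansion plus ridge regression."""

    def __init__(self, n_features, n_hidden=20, lam=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random projection into a higher-dimensional space (assumption).
        self.W = rng.normal(size=(n_features, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # Sufficient statistics of the ridge problem:
        # A = H^T H + lam * I and c = H^T t.
        self.A = lam * np.eye(n_hidden + 1)
        self.c = np.zeros(n_hidden + 1)
        self.beta = np.zeros(n_hidden + 1)

    def _expand(self, X):
        H = np.tanh(X @ self.W + self.b)
        # Append a constant column so the hyperplane has an offset.
        return np.hstack([H, np.ones((X.shape[0], 1))])

    def partial_fit(self, X, t):
        """Fold a batch (X, t), t in {-1, +1}, into the statistics and re-solve."""
        H = self._expand(X)
        self.A += H.T @ H
        self.c += H.T @ t
        self.beta = np.linalg.solve(self.A, self.c)
        return self

    def side(self, X):
        """Return True for samples routed to the right child."""
        return self._expand(X) @ self.beta > 0
```

Because the ridge solution depends on the data only through `A` and `c`, calling `partial_fit` on successive batches gives the same hyperplane as a single fit on all the data, up to floating-point error. In this sketch, samples from a previously unseen class would be handled by adding that class to one side in `right_classes` before computing the targets for the new batch; this is an illustrative choice, not the rule used in the paper.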