IEEE Trans Cybern. 2017 Oct;47(10):3266-3279. doi: 10.1109/TCYB.2017.2707463. Epub 2017 Jun 5.
One-class classification (OCC) models a set of target data from a single class in order to detect outliers. OCC approaches such as the one-class support vector machine (OCSVM) and support vector data description (SVDD) have wide practical applications. Recently, the one-class extreme learning machine (OCELM), which inherits the fast learning speed of the original ELM and achieves equivalent or higher data description performance than OCSVM and SVDD, has been proposed as a promising alternative. However, OCELM faces the same thorny parameter selection problem as OCSVM and SVDD. Parameter selection significantly affects the performance of OCELM, yet it remains under-explored. This paper proposes minimal spanning tree (MST)-GEN, an automatic way to select proper parameters for OCELM. Specifically, we first build an n-round MST to model the structure and distribution of the given target set. Using information from the n-round MST, a controllable number of pseudo outliers are generated by edge pattern detection and a novel "repelling" process, which readily overcomes two fundamental problems in previous outlier generation methods: where pseudo outliers should be generated and how many. Unlike previous methods, which generate only pseudo outliers, we further exploit the n-round MST to generate pseudo target data, so as to avoid the time-consuming cross-validation process and accelerate parameter selection. Extensive experiments on various datasets suggest that the proposed method selects parameters for OCELM more efficiently and accurately than existing methods, which enables OCELM to achieve better performance in OCC applications. Furthermore, our experiments show that MST-GEN can also be favorably applied to other prevalent OCC methods such as OCSVM and SVDD.
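The abstract does not spell out how the n-round MST is built, so the following is only a minimal sketch under a stated assumption: each round computes a minimum spanning tree over the complete Euclidean graph on the target points while skipping every edge already used in an earlier round, and the n-round MST is the collection of those per-round edge sets. The function names `kruskal_forest` and `n_round_mst` are hypothetical and do not come from the paper. Edge pattern detection, the "repelling" process, and pseudo target generation are not sketched here.

```python
import math
from itertools import combinations


def kruskal_forest(points, banned):
    """Kruskal's algorithm on the complete Euclidean graph over `points`,
    ignoring every edge listed in `banned`.

    Returns a list of (i, j) index pairs with i < j. If the graph without
    the banned edges is disconnected, the result is a spanning forest
    rather than a single tree.
    """
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # All candidate edges sorted by length; ties are broken by index order.
    candidates = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
        if (i, j) not in banned
    )

    edges = []
    for _, i, j in candidates:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # The edge joins two components, so it cannot create a cycle.
            parent[root_i] = root_j
            edges.append((i, j))
    return edges


def n_round_mst(points, n_rounds):
    """Assumed construction: round k builds an MST using only edges that
    no earlier round has used. Returns one edge list per round."""
    used = set()
    rounds = []
    for _ in range(n_rounds):
        edges = kruskal_forest(points, used)
        rounds.append(edges)
        used.update(edges)
    return rounds
```

Calling `n_round_mst(points, 2)` on a list of coordinate tuples returns two edge lists that share no edge. Under this assumption, later rounds add progressively longer edges, which gives a denser picture of the target set's structure than a single MST. The brute-force edge enumeration is quadratic in the number of points and is meant for illustration only.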