Xu Yuhong, Yu Zhiwen, Chen C L Philip
IEEE Trans Neural Netw Learn Syst. 2022 Jun 3;PP. doi: 10.1109/TNNLS.2022.3177695.
High-dimensional, class-imbalanced data seriously degrade the performance of classification algorithms. Because such data contain many redundant or invalid features and suffer from class imbalance, it is difficult to construct an optimal classifier for them. Classifier ensembles have attracted intensive attention because they can achieve better performance than an individual classifier. In this work, we propose a multiview optimization (MVO) method to learn more effective and robust features from high-dimensional imbalanced data, on which an accurate and robust ensemble system is built. Specifically, an optimized subview generation (OSG) step in MVO first generates multiple optimized subviews from different scenarios, which strengthens the classification ability of the features and increases the diversity of ensemble members at the same time. Second, a new evaluation criterion that considers the distribution of data in each optimized subview is developed, and a selective ensemble of optimized subviews (SEOS) based on this criterion performs the subview selection. Finally, an oversampling approach is applied to the optimized view to obtain a new class-rebalanced subset for the classifier. Experimental results on 25 high-dimensional class-imbalanced datasets indicate that the proposed method outperforms other mainstream classifier ensemble methods.
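The three stages named in the abstract (subview generation, selective ensemble, oversampling) can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify how OSG optimizes subviews or what the evaluation criterion is, so every function below is a hypothetical stand-in. Random feature subsets replace OSG, a simple between-class over within-class separation score replaces the paper's criterion, plain random duplication replaces the oversampling approach, and a nearest-centroid rule is used as the base classifier purely for brevity. Binary labels are assumed.

```python
import random
from collections import Counter

def make_subviews(n_features, n_views, view_size, rng):
    # Stand-in for OSG: each subview is a random subset of feature indices.
    return [sorted(rng.sample(range(n_features), view_size)) for _ in range(n_views)]

def project(X, view):
    # Keep only the columns that belong to one subview.
    return [[row[j] for j in view] for row in X]

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def centroids(X, y):
    # Per-class mean vector.
    counts, sums = Counter(y), {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

def separation_score(X, y):
    # Hypothetical criterion (binary labels assumed): squared distance between
    # the two class centroids divided by the mean within-class scatter.
    cents = centroids(X, y)
    a, b = sorted(cents)[:2]
    within = sum(sq_dist(r, cents[l]) for r, l in zip(X, y)) / len(X)
    return sq_dist(cents[a], cents[b]) / (within + 1e-12)

def random_oversample(X, y, rng):
    # Duplicate randomly chosen minority rows until all classes match the largest.
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [row for row, l in zip(X, y) if l == label]
        for _ in range(target - n):
            X_out.append(list(rng.choice(pool)))
            y_out.append(label)
    return X_out, y_out

def fit_ensemble(X, y, n_views=20, view_size=5, keep=5, seed=0):
    rng = random.Random(seed)
    views = make_subviews(len(X[0]), n_views, view_size, rng)
    # Stand-in for SEOS: rank subviews by the score and keep the best ones.
    ranked = sorted(views, key=lambda v: separation_score(project(X, v), y), reverse=True)
    members = []
    for view in ranked[:keep]:
        Xb, yb = random_oversample(project(X, view), y, rng)
        members.append((view, centroids(Xb, yb)))  # one nearest-centroid model per subview
    return members

def predict(members, row):
    # Majority vote over the selected subviews.
    votes = Counter()
    for view, cents in members:
        sub = [row[j] for j in view]
        votes[min(cents, key=lambda c: sq_dist(sub, cents[c]))] += 1
    return votes.most_common(1)[0][0]
```

Calling `fit_ensemble(X, y)` on a list of feature rows and binary labels returns the selected `(view, centroids)` pairs, and `predict(members, row)` classifies one row by vote. The parameters `n_views`, `view_size`, and `keep` are illustrative defaults, not values taken from the paper.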