IEEE Trans Cybern. 2018 Jan;48(1):357-370. doi: 10.1109/TCYB.2016.2636370. Epub 2016 Dec 22.
A challenging problem in object recognition is training a robust classifier on a small and imbalanced data set. In such cases, the learned classifier tends to overfit the training data and has low prediction accuracy on the minority class. In this paper, we address class-imbalanced object recognition by combining the synthetic minority over-sampling technique (SMOTE) with instance-based transfer boosting to rebalance the skewed class distribution. We present ways of generating synthetic instances under the learning framework of transfer AdaBoost. A novel weighted SMOTE technique (WSMOTE) is proposed that generates weighted synthetic instances from weighted source and target instances at each boosting round. Based on WSMOTE, we propose a novel class-imbalanced transfer boosting algorithm called WSMOTE-TrAdaboost and experimentally demonstrate its effectiveness on four data sets (Office, Caltech256, SUN2012, and VOC2012) for object recognition. A bag-of-words model with SURF features and histogram of oriented gradients (HOG) features are used separately to represent an image. We demonstrate the effectiveness and robustness of our approaches by comparing them with several baseline boosting algorithms for class-imbalanced learning.
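The abstract does not spell out how WSMOTE assigns weights to synthetic instances, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It uses standard SMOTE interpolation between a minority instance and one of its k nearest minority neighbors, and it assumes (purely for illustration) that the weight of a synthetic instance is interpolated from its two parents with the same coefficient. The function name `weighted_smote` and its parameters are hypothetical.

```python
import numpy as np


def weighted_smote(X, w, n_synthetic, k=5, seed=0):
    """Sketch of SMOTE with per-instance weights (illustrative only).

    X : (n, d) array of minority-class feature vectors, n >= 2
    w : (n,) array of nonnegative instance weights, e.g. boosting weights
    Returns (X_syn, w_syn): synthetic features and their weights.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    w = np.asarray(w, dtype=float)
    n = X.shape[0]
    k = min(k, n - 1)  # cannot have more neighbors than other points

    # Pairwise squared Euclidean distances; exclude each point from its own neighbors.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    neighbors = np.argsort(d2, axis=1)[:, :k]  # shape (n, k)

    # Pick a base instance and one of its k nearest neighbors for each synthetic point.
    base = rng.integers(0, n, size=n_synthetic)
    nb = neighbors[base, rng.integers(0, k, size=n_synthetic)]

    # Standard SMOTE step: interpolate at a random position on the segment.
    gap = rng.random(n_synthetic)
    X_syn = X[base] + gap[:, None] * (X[nb] - X[base])

    # ASSUMPTION (not from the source): interpolate the weights the same way.
    w_syn = w[base] + gap * (w[nb] - w[base])
    return X_syn, w_syn
```

In a boosting loop, such a routine would presumably be called once per round with the current instance weights, so that the synthetic minority instances and their weights track the reweighting of the source and target data; how the paper combines the two domains is not stated in the abstract.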