Liu Xin, He Wenqing
Department of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, People's Republic of China.
Department of Statistical and Actuarial Sciences, University of Western Ontario, London, ON, Canada.
J Appl Stat. 2021 Jan 8;49(6):1465-1484. doi: 10.1080/02664763.2020.1870669. eCollection 2022.
The support vector machine (SVM) is a widely used classifier in applications such as pattern recognition, texture mining, and image retrieval, owing to its flexibility and interpretability. However, its performance deteriorates when the response classes are imbalanced. To improve the performance of the SVM classifier in imbalanced cases, we investigate a new two-stage method that adaptively scales the kernel function. Using the information obtained from the standard SVM in the first stage, we conformally rescale the kernel function in a data-adaptive fashion in the second stage, so that the separation between the two classes is effectively enlarged while accounting for the imbalance in the observations. The proposed method takes into account the location of the support vectors in the feature space and is therefore especially appealing when the response classes are imbalanced. The resulting algorithm improves classification accuracy, as confirmed by extensive numerical studies and an application to real prostate cancer imaging data.
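The second-stage idea can be illustrated with a minimal sketch. A conformal rescaling replaces a kernel K(x, y) with c(x) K(x, y) c(y) for some positive function c. The abstract does not give the authors' scaling function, so the c(x) below is an assumption for illustration only: a weighted sum of Gaussian bumps centred on the first-stage support vectors, with hypothetical larger weights for minority-class support vectors. The support vectors, weights, and the parameters `gamma` and `tau` are made-up values, not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def conformal_factor(x, support_vectors, weights, tau=1.0):
    """Assumed scaling function c(x); NOT the authors' choice.

    It is large near the first-stage support vectors and decays away
    from them. weights[i] lets some support vectors (for example those
    of the minority class) contribute more.
    """
    total = 0.0
    for sv, w in zip(support_vectors, weights):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, sv))
        total += w * math.exp(-sq_dist / (2.0 * tau ** 2))
    return total


def rescaled_kernel(x, y, support_vectors, weights, gamma=0.5, tau=1.0):
    """Conformally rescaled kernel: c(x) * K(x, y) * c(y)."""
    cx = conformal_factor(x, support_vectors, weights, tau)
    cy = conformal_factor(y, support_vectors, weights, tau)
    return cx * rbf_kernel(x, y, gamma) * cy


# Hypothetical first-stage output: two support vectors, the second one
# assumed to belong to the minority class and therefore up-weighted.
support_vectors = [(0.0, 0.0), (1.0, 1.0)]
sv_weights = [1.0, 3.0]

# Gram matrix of the rescaled kernel on a few illustrative points.
points = [(0.1, 0.0), (1.0, 0.9), (4.0, 4.0)]
gram = [
    [rescaled_kernel(p, q, support_vectors, sv_weights) for q in points]
    for p in points
]
```

In a full two-stage procedure, a Gram matrix like `gram` would be passed to an SVM solver that accepts precomputed kernels for the second fit. Because the rescaled matrix equals D K D with D a diagonal matrix of the c(x) values, it stays positive semidefinite whenever K is.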