Key Laboratory of Electromagnetic Wave Information Technology and Metrology of Zhejiang Province, College of Information Engineering, China Jiliang University, Hangzhou, China.
College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China.
BMC Bioinformatics. 2019 Dec 24;20(Suppl 25):681. doi: 10.1186/s12859-019-3255-x.
Cost-sensitive learning is an effective strategy for solving imbalanced classification problems. However, the misclassification costs are usually set empirically from user expertise, which makes the performance of cost-sensitive classification unstable. An efficient and accurate method is therefore needed to calculate the optimal cost weights.
In this paper, two approaches are proposed to search for the optimal cost weights, with the aim of maximizing the weighted classification accuracy (WCA). One is a grid search over the cost weights and the other is function fitting. The two approaches are compared with each other. In the experiments, we classify imbalanced gene expression data with an extreme learning machine to test the cost weights obtained by the two approaches.
Comprehensive experimental results show that the function fitting method is generally more efficient and can find the optimal cost weights with acceptable WCA.
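The two search strategies can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: WCA is taken here as a weighted sum of the per-class accuracies with equal class weights, the classifier is a simple cost-weighted regularized extreme learning machine written in NumPy, the data are synthetic rather than gene expression data, and "function fitting" is approximated by a quadratic fit of WCA against the log of the cost weight. All function names (`train_weighted_elm`, `wca`, `grid_search_cost`, `fit_cost`) are hypothetical.

```python
import numpy as np

def make_data(n_neg, n_pos, seed):
    """Synthetic imbalanced two-class data (a stand-in for gene expression data)."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(0.0, 1.0, (n_neg, 5)),
                   rng.normal(1.0, 1.0, (n_pos, 5))])
    y = np.concatenate([np.zeros(n_neg, dtype=int), np.ones(n_pos, dtype=int)])
    return X, y

def train_weighted_elm(X, y, cost_pos, n_hidden=40, reg=1.0, seed=0):
    """Cost-weighted ELM; cost_pos is the cost weight of the minority class (label 1)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix
    t = np.where(y == 1, 1.0, -1.0)              # targets in {-1, +1}
    c = np.where(y == 1, cost_pos, 1.0)          # per-sample misclassification cost
    HtC = H.T * c                                # H^T C with C = diag(c)
    # Weighted ridge solution: beta = (H^T C H + reg * I)^-1 H^T C t
    beta = np.linalg.solve(HtC @ H + reg * np.eye(n_hidden), HtC @ t)
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return (np.tanh(X @ W + b) @ beta > 0).astype(int)

def wca(y_true, y_pred, w_pos=0.5):
    """Assumed WCA: weighted sum of minority- and majority-class accuracy."""
    acc_pos = np.mean(y_pred[y_true == 1] == 1)
    acc_neg = np.mean(y_pred[y_true == 0] == 0)
    return w_pos * acc_pos + (1.0 - w_pos) * acc_neg

def grid_search_cost(X_tr, y_tr, X_val, y_val, grid):
    """Evaluate every candidate cost weight and keep the one with the highest WCA."""
    scores = np.array([
        wca(y_val, predict_elm(train_weighted_elm(X_tr, y_tr, c), X_val))
        for c in grid
    ])
    best = int(np.argmax(scores))
    return grid[best], scores[best], scores

def fit_cost(coarse_grid, coarse_scores, dense_grid):
    """Fit a quadratic to WCA vs. log2(cost) and return the maximizer on dense_grid."""
    coeffs = np.polyfit(np.log2(coarse_grid), coarse_scores, deg=2)
    fitted = np.polyval(coeffs, np.log2(dense_grid))
    return dense_grid[int(np.argmax(fitted))]

X_tr, y_tr = make_data(300, 30, seed=1)
X_val, y_val = make_data(300, 30, seed=2)
grid = 2.0 ** np.arange(7)                       # candidate cost weights 1, 2, ..., 64

best_cost, best_score, scores = grid_search_cost(X_tr, y_tr, X_val, y_val, grid)
# Function fitting uses only every other grid point, so it trains fewer models.
fitted_cost = fit_cost(grid[::2], scores[::2], np.linspace(1.0, 64.0, 200))

print("grid search:", best_cost, round(float(best_score), 3))
print("function fitting:", round(float(fitted_cost), 2))
```

In this sketch the grid search trains one model per candidate weight, while the fitting step trains only four models and reads the maximizer off the fitted curve, which is the kind of efficiency trade-off the abstract describes.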