Cai Weiling, Chen Songcan, Zhang Daoqiang
Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China.
IEEE Trans Neural Netw. 2010 Feb;21(2):185-200. doi: 10.1109/TNN.2009.2034741. Epub 2009 Dec 18.
Traditional pattern recognition involves two tasks: clustering learning and classification learning. Clustering results can enhance the generalization ability of classification learning, while class information can improve the accuracy of clustering learning; hence, the two learning methods can complement each other. To fuse the advantages of both, many existing algorithms have been developed in a sequential manner, first optimizing the clustering criterion and then the classification criterion associated with the obtained clustering results. However, such algorithms naturally fail to achieve simultaneous optimality for the two criteria, and thus must sacrifice either clustering performance or classification performance. To overcome this problem, in this paper, we present a multiobjective simultaneous learning framework (MSCC) for both clustering and classification learning. MSCC utilizes multiple objective functions to formulate the clustering and classification problems, respectively, and, more importantly, it employs Bayesian theory to make these functions all depend only on one shared set of parameters, namely the cluster centers, which serve as a bridge connecting clustering and classification learning. By simultaneously optimizing the cluster centers embedded in these functions, both effective clustering performance and promising classification performance can be attained at the same time. Furthermore, from the multiple Pareto-optimal solutions obtained by MSCC, we make an interesting observation: there is a great degree of complementarity between the clustering and classification learning processes. Empirical results on both synthetic and real data sets demonstrate the effectiveness and potential of MSCC.
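To make the core idea concrete, the following is a minimal illustrative sketch, not the paper's actual formulation: two objectives (a clustering cost and a classification error) that both depend only on the same set of cluster centers, with a naive Pareto filter over randomly sampled candidate center sets standing in for a proper multiobjective optimizer. The toy data, the nearest-center classifier, and the random-search procedure are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: two Gaussian blobs.
X0 = rng.normal(loc=[-2.0, 0.0], scale=0.7, size=(60, 2))
X1 = rng.normal(loc=[2.0, 0.0], scale=0.7, size=(60, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 60 + [1] * 60)

def clustering_cost(centers):
    """Within-cluster sum of squared distances to the nearest center (lower is better)."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return float((d.min(axis=1) ** 2).sum())

def classification_error(centers):
    """Error rate of a nearest-center classifier whose centers take majority labels."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    assign = d.argmin(axis=1)
    labels = np.array([
        np.bincount(y[assign == k], minlength=2).argmax() if np.any(assign == k) else 0
        for k in range(len(centers))
    ])
    return float(np.mean(labels[assign] != y))

# Sample candidate center sets (pairs of data points) and score both
# objectives on each; both objectives are functions of the same centers.
cands = [X[rng.choice(len(X), size=2, replace=False)] for _ in range(200)]
scores = np.array([[clustering_cost(c), classification_error(c)] for c in cands])

# Keep the Pareto-optimal candidates: those not dominated by any other
# candidate on both objectives simultaneously.
pareto = [
    i for i in range(len(cands))
    if not any(
        (scores[j] <= scores[i]).all() and (scores[j] < scores[i]).any()
        for j in range(len(cands))
    )
]
```

In MSCC proper, the objectives are derived via Bayesian theory and optimized with a multiobjective algorithm rather than random search, but the sketch shows the key structural point: because both criteria share the cluster centers as their only parameters, improving one set of centers can benefit both tasks at once, and the Pareto front exposes the trade-off between them.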