Civil Engineering Department, University of Burgos, Spain.
Int J Neural Syst. 2011 Dec;21(6):505-25. doi: 10.1142/S0129065711003012.
This paper presents a novel model for classifying and visualizing high-dimensional data that combines two enhancing techniques. The first is semi-supervised learning, an extension of supervised learning that incorporates unlabeled information into the learning process. The second is ensemble learning, which replicates the analysis and then applies a fusion mechanism that combines the results of the previously performed analyses in order to improve on the result of a single model. The proposed learning schema, termed S(2)-Ensemble, is applied to several unsupervised learning algorithms within the family of topology maps, such as Self-Organizing Maps and Neural Gas. The study also includes a thorough investigation of the characteristics of these novel schemes by means of quality measures, which allow a complete analysis of the resulting classifiers from various perspectives on the different ways in which these classifiers are used. The study conducts empirical evaluations and comparisons on various real-world datasets from the UCI repository, which exhibit different characteristics, so as to cover an extensive selection of situations in which the presented algorithms can be applied.
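The combination described in the abstract (topology maps trained with unlabeled data, labeled with the few known classes, then replicated and fused) can be illustrated with a minimal sketch. This is not the S(2)-Ensemble algorithm itself, whose details the abstract does not give. It assumes a tiny one-dimensional Self-Organizing Map, bootstrap resampling to build the ensemble, and plain majority voting as a stand-in for the fusion mechanism. All function names and parameters (`train_som`, `label_units`, `fit_ensemble`, `ensemble_predict`, `n_units`, `n_models`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def train_som(data, n_units=6, epochs=20, lr0=0.5, seed=0):
    """Train a tiny 1-D SOM (a chain of units) on feature vectors only."""
    rng = np.random.default_rng(seed)
    # Initialize unit weights from randomly chosen samples.
    weights = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    positions = np.arange(n_units)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                          # decaying learning rate
            sigma = max(n_units / 2.0 * (1.0 - frac), 0.5)   # shrinking neighborhood
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def best_units(weights, X):
    """Index of the best-matching unit for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


def label_units(weights, X_lab, y_lab, n_classes):
    """Give each unit the majority class of the labeled samples it wins.

    Assumes at least one labeled sample and integer class labels.
    """
    votes = np.zeros((len(weights), n_classes), dtype=int)
    np.add.at(votes, (best_units(weights, X_lab), y_lab), 1)
    labels = votes.argmax(axis=1)
    hit = votes.sum(axis=1) > 0
    # Units that won no labeled sample inherit the label of the
    # closest unit (in weight space) that did.
    for u in np.where(~hit)[0]:
        d = np.linalg.norm(weights[hit] - weights[u], axis=1)
        labels[u] = labels[hit][np.argmin(d)]
    return labels


def fit_ensemble(X_lab, y_lab, X_unlab, n_classes, n_models=5, seed=0):
    """Train several maps on bootstrap resamples of labeled + unlabeled data."""
    rng = np.random.default_rng(seed)
    X_all = np.vstack([X_lab, X_unlab])
    models = []
    for m in range(n_models):
        idx = rng.integers(0, len(X_all), len(X_all))  # bootstrap resample
        w = train_som(X_all[idx], seed=seed + m)
        models.append((w, label_units(w, X_lab, y_lab, n_classes)))
    return models


def ensemble_predict(models, X, n_classes):
    """Fuse the individual maps by majority vote (an assumed fusion rule)."""
    counts = np.zeros((X.shape[0], n_classes), dtype=int)
    rows = np.arange(X.shape[0])
    for weights, unit_labels in models:
        counts[rows, unit_labels[best_units(weights, X)]] += 1
    return counts.argmax(axis=1)
```

In this sketch the unlabeled samples influence only where the map units settle, while the labeled samples decide which class each unit represents. That is one simple way to read "incorporating unlabeled information into the learning process"; the paper may fuse the maps themselves rather than their votes.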