Przybyła-Kasperek Małgorzata, Marfo Kwabena Frimpong
Institute of Computer Science, Faculty of Science and Technology, University of Silesia in Katowice, Będzińska 39, 41-200 Sosnowiec, Poland.
Entropy (Basel). 2021 Nov 25;23(12):1568. doi: 10.3390/e23121568.
The article concerns the problem of classification based on independent data sets (local decision tables). The aim of the paper is to propose a classification model for dispersed data using a modified k-nearest neighbors algorithm and a neural network. A neural network, more specifically a multilayer perceptron, is used to combine the prediction results obtained from the local tables. The prediction results are stored at the measurement level and are generated using a modified k-nearest neighbors algorithm. The task of the neural network is to combine these results and provide a common prediction. Various neural network structures (different numbers of neurons in the hidden layer) are studied, and the results are compared with those generated by other fusion methods: majority voting, the Borda count method, the sum rule, the method based on decision templates, and the method based on the theory of evidence. The results show that the neural network always generates unambiguous decisions, which is a great advantage, as most of the other fusion methods generate ties. Moreover, when only unambiguous results are considered, the neural network gives much better results than the other fusion methods. If ambiguity is allowed, some fusion methods are slightly better, but only because they may then generate several decisions for a single test object.
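To illustrate how fusion at the measurement level can lead to ties, the following is a minimal sketch of three of the baseline methods named above (majority voting, the Borda count, and the sum rule). It is not the authors' implementation. The function names, the `supports` data, and the exact tie-handling and Borda scoring conventions are assumptions made for illustration. Each row of `supports` is a hypothetical vector of class supports produced from one local table, and each fusion function returns every class that shares the top score, so a result with more than one class is an ambiguous decision.

```python
def winners(scores):
    """Return the indices of all classes sharing the top score (ties are kept)."""
    top = max(scores)
    return [c for c, s in enumerate(scores) if s == top]


def sum_rule(supports):
    """Add the support vectors class by class and pick the largest total."""
    totals = [sum(column) for column in zip(*supports)]
    return winners(totals)


def majority_voting(supports):
    """Each local table votes for the class(es) with its highest support."""
    votes = [0] * len(supports[0])
    for vector in supports:
        for c in winners(vector):
            votes[c] += 1
    return winners(votes)


def borda_count(supports):
    """One common Borda variant (an assumption): within each local table,
    a class earns one point for every class with strictly lower support."""
    points = [0] * len(supports[0])
    for vector in supports:
        for c, s in enumerate(vector):
            points[c] += sum(1 for other in vector if other < s)
    return winners(points)


# Hypothetical measurement-level outputs: 4 local tables, 3 decision classes.
supports = [
    [0.75, 0.25, 0.00],
    [0.25, 0.50, 0.25],
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
]

print("majority voting:", majority_voting(supports))
print("sum rule:       ", sum_rule(supports))
print("Borda count:    ", borda_count(supports))
```

On this made-up input, majority voting returns two classes (a tie), while the sum rule and the Borda count each return a single class, and not the same one. A trained multilayer perceptron that takes the concatenated support vectors as input and outputs the class with the highest activation would instead yield one decision per test object, which matches the advantage described in the abstract.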