Faculty of Computing, Federal University of Uberlândia, Uberlândia, MG, 38400-902, Brazil.
Shenzhen Key Laboratory of Computational Intelligence, University Key Laboratory of Evolving Intelligent Systems of Guangdong Province, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China.
Neural Netw. 2019 Feb;110:243-255. doi: 10.1016/j.neunet.2018.12.003. Epub 2018 Dec 14.
Complex networks provide a powerful tool for data representation due to their ability to describe the interplay between topological, functional, and dynamical properties of the input data. A fundamental process in network-based (graph-based) data analysis techniques is the construction of a network from the original data, which is usually in vector form. Here, a natural question is: how can one construct an "optimal" network with respect to a given processing goal? This paper investigates structural optimization in the context of network-based data classification tasks. To be specific, we propose a particle swarm optimization framework that builds a network from a vector-based data set while optimizing a quality function driven by the classification accuracy. The classification process considers both topological and physical features of the training and test data and employs the PageRank measure to classify a test instance according to its importance to each class. Results on artificial and real-world problems reveal that data networks generated using structural optimization generally provide better results than those generated by classical network formation methods. Moreover, this investigation suggests that other kinds of network-based machine learning and data mining tasks, such as dimensionality reduction and data clustering, can benefit from the proposed structural optimization method.
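To make the idea of PageRank-based classification on a constructed network concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain k-nearest-neighbor network per class (one of the classical formation methods the abstract compares against), a fixed k, and a score defined as the PageRank of the test node inside each class network scaled by the network size. The function names `knn_graph`, `pagerank`, and `classify` are hypothetical. The particle swarm search over network structure is not reproduced here; in the proposed framework, structural choices such as the edges would be tuned by the optimizer rather than fixed by hand.

```python
import math


def knn_graph(points, k):
    """Undirected k-nearest-neighbor graph as {node index: set of neighbors}."""
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        # Sort the other nodes by Euclidean distance; ties keep index order.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        for j in order[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank on an undirected graph given as adjacency sets."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in adj}
        for v, nbrs in adj.items():
            if nbrs:
                share = damping * rank[v] / len(nbrs)
                for u in nbrs:
                    new[u] += share
            else:
                # Isolated node: spread its rank uniformly over all nodes.
                for u in adj:
                    new[u] += damping * rank[v] / n
        rank = new
    return rank


def classify(train_x, train_y, x, k=2):
    """Assign x to the class whose network gives it the highest importance.

    Assumption of this sketch: importance is the PageRank of the test node
    after inserting it into the class network, multiplied by the number of
    nodes so that networks of different sizes are comparable.
    """
    scores = {}
    for label in sorted(set(train_y)):
        pts = [p for p, y in zip(train_x, train_y) if y == label] + [x]
        adj = knn_graph(pts, min(k, len(pts) - 1))
        rank = pagerank(adj)
        scores[label] = rank[len(pts) - 1] * len(pts)
    return max(scores, key=scores.get), scores


# Toy data: two well-separated square clusters (illustrative values only).
train_x = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = ["A", "A", "A", "A", "B", "B", "B", "B"]

label, scores = classify(train_x, train_y, (0.5, 0.5), k=2)
print(label, scores)
```

A test point lying inside a class is chosen as a neighbor by many members of that class, so it gains edges and a high PageRank there, whereas in a distant class it keeps only the k edges it creates itself. A structural optimizer could replace the fixed k with a searched edge set and use validation accuracy as the quality function.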