A non-linear optimization based robust attribute weighting model for the two-class classification problems.

Author information

Alhudhaif Adi

Affiliation

Department of Computer Science, College of Computer Engineering and Sciences in Al-kharj, Prince Sattam bin Abdulaziz University, Al-kharj, Saudi Arabia.

Publication information

PeerJ Comput Sci. 2023 Sep 25;9:e1598. doi: 10.7717/peerj-cs.1598. eCollection 2023.

Abstract

BACKGROUND

This article aims to determine attribute weighting coefficients that reduce the within-class distance and increase the between-class distance. The coefficients are found with metaheuristic optimization algorithms and gather the data around the cluster centers, thereby improving classification performance.
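The abstract does not give the formula, so the following is only a minimal sketch of what such an objective could look like, assuming a simple ratio of mean within-class distance to the distance between the two class centers. The function name `weighting_fitness` and the ratio itself are illustrative assumptions, not the article's model.

```python
import numpy as np

def weighting_fitness(weights, X, y):
    """Hypothetical fitness (an assumption, not the article's formula):
    mean within-class distance divided by the distance between the two
    class centers, computed on the weighted attributes. Lower is better."""
    Xw = X * weights  # scale each attribute by its coefficient
    classes = np.unique(y)
    if len(classes) != 2:
        raise ValueError("this sketch only handles two-class problems")
    centers = [Xw[y == c].mean(axis=0) for c in classes]
    # Average distance of the samples to their own class center.
    within = np.mean([
        np.linalg.norm(Xw[y == c] - m, axis=1).mean()
        for c, m in zip(classes, centers)
    ])
    between = np.linalg.norm(centers[0] - centers[1])
    if between == 0.0:
        return float("inf")  # collapsed centers: worst possible score
    return float(within / between)

# Toy data: attribute 0 separates the classes, attribute 1 only adds spread.
X = np.array([[0.0, 0.0], [0.0, 10.0], [1.0, 0.0], [1.0, 10.0]])
y = np.array([0, 0, 1, 1])
unweighted = weighting_fitness(np.array([1.0, 1.0]), X, y)  # 5.0
weighted = weighting_fitness(np.array([1.0, 0.0]), X, y)    # 0.0
print(unweighted, weighted)
```

Down-weighting the uninformative attribute lowers the score, which is the behavior the stated objective asks for. Any real use would need the exact model from the full article.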

METHODS

The proposed mathematical model is based on simple mathematical calculations and serves as the fitness function of the optimization algorithms. Compared with methods in the literature, this makes it easier to run the optimization algorithms and obtain results quickly. Determining the weights by optimization also yields results that are more sensitive to the structure of the dataset. In the study, the proposed model was used as the fitness function of metaheuristic optimization algorithms to determine the weighting coefficients. Four different algorithms were used to test whether the results are independent of the choice of algorithm: the particle swarm algorithm (PSO), the bat algorithm (BAT), the gravitational search algorithm (GSA), and the flower pollination algorithm (FPA).
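To illustrate how a metaheuristic can use a fitness function to pick weighting coefficients, here is a minimal sketch of a basic PSO loop. It assumes class labels 0 and 1, weights bounded to [0, 1], and a hypothetical ratio fitness (within-class distance over between-center distance). The names `fitness` and `pso_weights` and all hyperparameter values are illustrative assumptions, not the article's settings.

```python
import numpy as np

def fitness(w, X, y):
    """Hypothetical stand-in objective (lower is better); labels must be 0/1."""
    Xw = X * w
    c0, c1 = Xw[y == 0].mean(axis=0), Xw[y == 1].mean(axis=0)
    within = 0.5 * (np.linalg.norm(Xw[y == 0] - c0, axis=1).mean()
                    + np.linalg.norm(Xw[y == 1] - c1, axis=1).mean())
    between = np.linalg.norm(c0 - c1)
    return float("inf") if between == 0.0 else float(within / between)

def pso_weights(X, y, n_particles=20, n_iter=50, seed=0,
                inertia=0.7, c_personal=1.5, c_global=1.5):
    """Basic global-best PSO over attribute weights in [0, 1]."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.0, 1.0, size=(n_particles, X.shape[1]))
    pos[0] = 1.0  # one particle starts at the unweighted baseline
    vel = np.zeros_like(pos)
    best_pos = pos.copy()                                  # personal bests
    best_val = np.array([fitness(p, X, y) for p in pos])
    g = best_pos[best_val.argmin()].copy()                 # global best
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = (inertia * vel
               + c_personal * r1 * (best_pos - pos)
               + c_global * r2 * (g - pos))
        pos = np.clip(pos + vel, 0.0, 1.0)
        val = np.array([fitness(p, X, y) for p in pos])
        improved = val < best_val
        best_pos[improved] = pos[improved]
        best_val[improved] = val[improved]
        g = best_pos[best_val.argmin()].copy()
    return g, float(best_val.min())

# Toy data: attribute 0 separates the classes, attribute 1 is mostly noise.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([0, 0], [0.3, 3.0], size=(30, 2)),
               rng.normal([2, 0], [0.3, 3.0], size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)
w, val = pso_weights(X, y)
baseline = fitness(np.ones(2), X, y)
print("weights:", w, "fitness:", val, "unweighted fitness:", baseline)
```

Because one particle starts at the all-ones vector, the returned fitness can never be worse than the unweighted baseline. BAT, GSA, or FPA could be slotted in the same way, since each only needs to evaluate the fitness function.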

RESULTS

As a result of these processes, a control group built from the unweighted attributes and four experimental groups built from the weighted attributes were obtained for each dataset. Classification performance improved on every dataset to which the weights obtained by the proposed method were applied. Accuracy rates of 100% were obtained on the Iris and Liver Disorders datasets. On the synthetic datasets, accuracy rose from 66.9% (SVM classifier) to 96.4% (GSA weighting + SVM) on the Full Chain dataset and from 64.6% (LDA classifier) to 80.2% (BAT weighting + LDA) on the Two Spiral dataset. The study shows that the proposed method succeeds in moving the attributes of the datasets onto a linear plane; in particular, an accuracy of 100% was achieved with classifiers such as SVM and LDA, which have difficulty with non-linear problems.
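The control-versus-experimental comparison can be pictured with a small sketch. It swaps in a leave-one-out 1-nearest-neighbour classifier as a stand-in for the SVM and LDA classifiers used in the study, and uses hand-picked weights rather than optimized ones. The data, the weights, and the function name `loo_1nn_accuracy` are illustrative assumptions and say nothing about the reported figures.

```python
import numpy as np

def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    (a stand-in classifier, not the SVM/LDA used in the study)."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a sample may not vote for itself
    predicted = y[dists.argmin(axis=1)]
    return float((predicted == y).mean())

# Toy data: attribute 0 carries the class, attribute 1 only adds spread.
X = np.array([[0.0, 0.0], [0.0, 10.0], [1.0, 0.0], [1.0, 10.0]])
y = np.array([0, 0, 1, 1])
weights = np.array([1.0, 0.0])  # hand-picked for illustration

control = loo_1nn_accuracy(X, y)                 # unweighted attributes
experimental = loo_1nn_accuracy(X * weights, y)  # weighted attributes
print(control, experimental)  # 0.0 1.0
```

On this toy set the unweighted attributes mislead the classifier, because each point's nearest neighbour belongs to the other class, while the weighted attributes make every point's nearest neighbour a same-class point.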

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7fc7/10557515/5c36d6cc4094/peerj-cs-09-1598-g001.jpg
