Nonlinear Feature Selection Neural Network via Structured Sparse Regularization.

Author information

Wang Rong, Bian Jintang, Nie Feiping, Li Xuelong

Publication information

IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):9493-9505. doi: 10.1109/TNNLS.2022.3209716. Epub 2023 Oct 27.

Abstract

Feature selection is an important and effective data preprocessing method that removes noisy and redundant features while retaining the relevant and discriminative features in high-dimensional data. In real-world applications, the relationships between data samples and their labels are usually nonlinear. However, most existing feature selection models focus on learning a linear transformation matrix, which cannot capture such nonlinear structure and therefore degrades the performance of downstream tasks. To address this issue, we propose a novel nonlinear feature selection method that selects the most relevant and discriminative features in high-dimensional datasets. Specifically, our method learns the nonlinear structure of high-dimensional data with a neural network trained under a cross-entropy loss, and uses a structured sparsity norm such as the $\ell_{2,p}$-norm to regularize the weight matrix connecting the input layer and the first hidden layer, so that the weight of each feature is learned. A structured sparse weight matrix is thus obtained through nonlinear learning with a structured-sparsity-regularized neural network. We then use gradient descent to solve the proposed model. Experimental results on several synthetic and real-world datasets show the effectiveness and superiority of the proposed nonlinear feature selection model.
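The idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a single tanh hidden layer, a smoothed row-wise sparsity penalty of the form sum_i (||w_i||_2^2 + eps)^(p/2) with p = 1 by default, full-batch gradient descent, and a toy dataset in which only features 0 and 1 determine the label. The names `l2p_penalty_grad`, `feature_scores`, and `make_toy_data`, and all hyperparameter values, are hypothetical choices made for illustration.

```python
import numpy as np


def l2p_penalty_grad(W, p=1.0, eps=1e-8):
    """Gradient of the smoothed row-wise penalty sum_i (||w_i||_2^2 + eps)^(p/2)."""
    sq = np.sum(W ** 2, axis=1, keepdims=True) + eps
    return p * sq ** (p / 2.0 - 1.0) * W


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def feature_scores(X, y, n_hidden=16, lam=0.02, p=1.0, lr=0.2, n_steps=1000, seed=0):
    """Train a one-hidden-layer network and return one score per input feature."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = int(y.max()) + 1
    # Row j of W1 holds every weight leaving input feature j.
    W1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, k))
    b2 = np.zeros(k)
    Y = np.eye(k)[y]  # one-hot labels
    for _ in range(n_steps):
        # Forward pass
        H = np.tanh(X @ W1 + b1)
        P = softmax(H @ W2 + b2)
        # Backward pass for the mean cross-entropy loss
        dZ2 = (P - Y) / n
        dW2 = H.T @ dZ2
        db2 = dZ2.sum(axis=0)
        dZ1 = (dZ2 @ W2.T) * (1.0 - H ** 2)
        dW1 = X.T @ dZ1
        db1 = dZ1.sum(axis=0)
        # Gradient step; the sparsity penalty is applied to W1 only.
        W1 -= lr * (dW1 + lam * l2p_penalty_grad(W1, p))
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    # The l2 norm of each row of W1 serves as the importance of that feature.
    return np.linalg.norm(W1, axis=1)


def make_toy_data(n=500, d=10, seed=0):
    """Toy data: the label depends nonlinearly on features 0 and 1; the rest are noise."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    y = (np.tanh(2.0 * X[:, 0]) + X[:, 1] ** 3 > 0).astype(int)
    return X, y


X, y = make_toy_data()
scores = feature_scores(X, y)
ranking = np.argsort(scores)[::-1]  # feature indices, most important first
print(ranking)
```

Because the penalty acts on whole rows of the first-layer weight matrix, it tends to shrink all outgoing weights of an uninformative feature together, which is what makes the row norms usable as a feature ranking. Selecting the top-ranked features and retraining a downstream classifier on them would be the natural next step.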
