Topic-Based Instance and Feature Selection in Multilabel Classification.

Author information

Ma Jianghong, Chow Tommy W S

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 Jan;33(1):315-329. doi: 10.1109/TNNLS.2020.3027745. Epub 2022 Jan 5.

Abstract

Multilabel learning has been extensively studied in recent years because of its many applications across different domains. It aims to annotate unseen data with labels on the basis of training data, which are often high dimensional at both the instance and feature levels and often contain noisy and redundant information at both levels. As an effective data preprocessing step, instance selection and feature selection should both be performed, to find relevant training instances for each testing instance and relevant features for each label, respectively. However, most existing methods overlook the input-output correlation in each kind of selection, which degrades performance. This article presents a formulation of multilabel learning from a topic view that exploits the dependence between features and labels in a topic space. Because the relationship between the input and output spaces is well captured in this latent topic space, instance and feature selection can be performed effectively there. Results from extensive experiments on various benchmarks demonstrate the effectiveness of the proposed framework.

