Unsupervised Adaptive Feature Selection With Binary Hashing.

Author information

Shi Dan, Zhu Lei, Li Jingjing, Zhang Zheng, Chang Xiaojun

Publication information

IEEE Trans Image Process. 2023;32:838-853. doi: 10.1109/TIP.2023.3234497. Epub 2023 Jan 18.

Abstract

Unsupervised feature selection chooses a subset of discriminative features to reduce feature dimensionality under the unsupervised learning paradigm. Despite considerable effort, existing solutions perform feature selection either without any label guidance or with guidance from only a single pseudo label. This may cause significant information loss and leave the selected features semantically impoverished, because many real-world data, such as images and videos, are generally annotated with multiple labels. In this paper, we propose a new Unsupervised Adaptive Feature Selection with Binary Hashing (UAFS-BH) model, which learns binary hash codes as weakly-supervised multi-labels and simultaneously exploits the learned labels to guide feature selection. Specifically, to exploit discriminative information in the unsupervised scenario, the weakly-supervised multi-labels are learned automatically by imposing binary hash constraints on the spectral embedding process, and they guide the final feature selection. The number of weakly-supervised multi-labels (the number of 1s in the binary hash codes) is determined adaptively according to the data content. Further, to enhance the discriminative capability of the binary labels, we model the intrinsic data structure by adaptively constructing a dynamic similarity graph. Finally, we extend UAFS-BH to the multi-view setting as Multi-view Feature Selection with Binary Hashing (MVFS-BH) to handle the multi-view feature selection problem. An effective binary optimization method based on the Augmented Lagrange Multiplier (ALM) method is derived to iteratively solve the formulated problem. Extensive experiments on widely tested benchmarks demonstrate the state-of-the-art performance of the proposed method on both single-view and multi-view feature selection tasks. For reproducibility, we provide the source code and testing datasets at https://github.com/shidan0122/UMFS.git.
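To make the idea of binary hash codes acting as multi-labels more concrete, here is a minimal Python sketch. It is not the UAFS-BH optimization: the thresholding rule in `binarize`, the projection matrix `weights`, and the row-norm ranking in `rank_features` are all assumptions chosen for illustration (row-norm ranking is a common convention in sparse-regression feature selection, and the abstract does not confirm it is used here). All numbers are made up.

```python
# Illustrative sketch only; names, data, and rules below are hypothetical
# and are not taken from the paper or its released code.

def binarize(embedding, threshold=0.0):
    """Map a real-valued embedding to 0/1 codes by thresholding (assumed rule)."""
    return [[1 if value > threshold else 0 for value in row] for row in embedding]

def label_counts(codes):
    """Count the 1s in each code, i.e. how many pseudo-labels a sample carries."""
    return [sum(row) for row in codes]

def rank_features(weights):
    """Rank features by the l2 norm of their row in a (features x bits) matrix."""
    norms = [sum(w * w for w in row) ** 0.5 for row in weights]
    return sorted(range(len(norms)), key=lambda i: norms[i], reverse=True)

# Three samples embedded into three dimensions (made-up values).
embedding = [[0.7, -0.2, 0.1],
             [-0.5, -0.3, 0.9],
             [0.4, 0.6, 0.2]]
codes = binarize(embedding)
counts = label_counts(codes)   # differs per sample: the label count adapts to the data

# Four features projected onto the three bits (made-up values).
weights = [[0.1, 0.0, 0.1],
           [0.9, 0.8, 0.7],
           [0.0, 0.0, 0.0],
           [0.5, 0.4, 0.3]]
ranking = rank_features(weights)
selected = ranking[:2]         # keep the two highest-scoring features
```

In this toy setup each sample ends up with a different number of active bits, which mirrors the abstract's point that the number of pseudo-labels is not fixed in advance.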

