Heuristic Attention Representation Learning for Self-Supervised Pretraining.

Affiliations

Department of Computer Science and Information Engineering, National Central University, Taoyuan 3200, Taiwan.

AI Research Center, Hon Hai Research Institute, Taipei 114699, Taiwan.

Publication information

Sensors (Basel). 2022 Jul 10;22(14):5169. doi: 10.3390/s22145169.

DOI: 10.3390/s22145169
PMID: 35890847
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9320898/
Abstract

Recently, self-supervised learning methods have been shown to be powerful and efficient at learning robust representations by maximizing the similarity across different augmented views in the embedding vector space. However, the main challenge lies in generating different views with random cropping: semantic features may differ across views, so the similarity objective can be maximized inappropriately. We tackle this problem by introducing Heuristic Attention Representation Learning (HARL). This self-supervised framework relies on the joint embedding architecture, in which two neural networks are trained to produce similar embeddings for different augmented views of the same image. The HARL framework adopts prior visual object-level attention by generating a heuristic mask proposal for each training image and maximizes the similarity of abstract object-level embeddings in vector space, instead of the whole-image representations used in previous works. As a result, HARL extracts high-quality semantic representations from each training sample and outperforms self-supervised baselines on several downstream tasks. In addition, we provide efficient techniques based on conventional computer vision and deep learning methods for generating heuristic mask proposals on natural image datasets. HARL achieves a +1.3% improvement on the ImageNet semi-supervised learning benchmark and a +0.9% improvement in AP on the COCO object detection task over the previous state-of-the-art method BYOL. Our code implementation is available for both TensorFlow and PyTorch frameworks.
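
The object-level objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that an "object-level embedding" can be approximated by averaging feature vectors inside the heuristic mask, and that the two views are compared with the normalized mean-squared-error loss used by BYOL (equivalent to 2 − 2 × cosine similarity). The function names `masked_average_pool`, `l2_normalize`, and `byol_style_loss` are hypothetical, and the encoder, projector, and predictor networks are omitted.

```python
import math


def masked_average_pool(feature_map, mask):
    """Average the C-dim feature vectors at positions where mask is 1.

    feature_map: H x W x C nested lists of floats.
    mask:        H x W nested lists of 0/1 (assumed heuristic object mask).
    """
    channels = len(feature_map[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(feature_map, mask):
        for vec, keep in zip(feat_row, mask_row):
            if keep:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no positions")
    return [t / count for t in total]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


def byol_style_loss(online_pred, target_proj):
    """Squared distance between unit vectors: 2 - 2 * cosine similarity."""
    p = l2_normalize(online_pred)
    z = l2_normalize(target_proj)
    return 2.0 - 2.0 * sum(a * b for a, b in zip(p, z))


# Toy example: two 2x2 feature maps (2 channels) standing in for two
# augmented views, each with its own mask marking the "object" pixels.
view_a = [[[1.0, 0.0], [3.0, 0.0]], [[0.0, 5.0], [0.0, 7.0]]]
view_b = [[[2.0, 0.0], [0.0, 9.0]], [[4.0, 0.0], [0.0, 1.0]]]
mask_a = [[1, 1], [0, 0]]
mask_b = [[1, 0], [1, 0]]

obj_a = masked_average_pool(view_a, mask_a)
obj_b = masked_average_pool(view_b, mask_b)
loss = byol_style_loss(obj_a, obj_b)
```

In this toy case the masked regions of both views point in the same feature direction, so the loss is close to zero; pooling over the whole image instead would mix in background features and increase it.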


Similar articles

1
Heuristic Attention Representation Learning for Self-Supervised Pretraining.
Sensors (Basel). 2022 Jul 10;22(14):5169. doi: 10.3390/s22145169.
2
RepCo: Replenish sample views with better consistency for contrastive learning.
Neural Netw. 2023 Nov;168:171-179. doi: 10.1016/j.neunet.2023.09.004. Epub 2023 Sep 11.
3
Linear semantic transformation for semi-supervised medical image segmentation.
Comput Biol Med. 2024 May;173:108331. doi: 10.1016/j.compbiomed.2024.108331. Epub 2024 Mar 21.
4
Self-supervised driven consistency training for annotation efficient histopathology image analysis.
Med Image Anal. 2022 Jan;75:102256. doi: 10.1016/j.media.2021.102256. Epub 2021 Oct 13.
5
A knowledge-based learning framework for self-supervised pre-training towards enhanced recognition of biomedical microscopy images.
Neural Netw. 2023 Oct;167:810-826. doi: 10.1016/j.neunet.2023.09.001. Epub 2023 Sep 12.
6
MsVRL: Self-Supervised Multiscale Visual Representation Learning via Cross-Level Consistency for Medical Image Segmentation.
IEEE Trans Med Imaging. 2023 Jan;42(1):91-102. doi: 10.1109/TMI.2022.3204551. Epub 2022 Dec 29.
7
Self-supervised learning for medical image analysis: Discriminative, restorative, or adversarial?
Med Image Anal. 2024 May;94:103086. doi: 10.1016/j.media.2024.103086. Epub 2024 Jan 28.
8
Adaptive self-supervised learning for sequential recommendation.
Neural Netw. 2024 Nov;179:106570. doi: 10.1016/j.neunet.2024.106570. Epub 2024 Jul 24.
9
Deep semi-supervised learning via dynamic anchor graph embedding in latent space.
Neural Netw. 2022 Feb;146:350-360. doi: 10.1016/j.neunet.2021.11.026. Epub 2021 Dec 1.
10
Any region can be perceived equally and effectively on rotation pretext task using full rotation and weighted-region mixture.
Neural Netw. 2024 Aug;176:106350. doi: 10.1016/j.neunet.2024.106350. Epub 2024 Apr 30.

Cited by

1
An optimized multi-task contrastive learning framework for HIFU lesion detection and segmentation.
Sci Rep. 2025 Aug 13;15(1):29666. doi: 10.1038/s41598-025-99783-2.
2
Computer Vision and Machine Learning for Intelligent Sensing Systems.
Sensors (Basel). 2023 Apr 23;23(9):4214. doi: 10.3390/s23094214.
