Image interpretation above and below the object level.

Authors

Ben-Yosef Guy, Ullman Shimon

Affiliations

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Publication information

Interface Focus. 2018 Aug 6;8(4):20180020. doi: 10.1098/rsfs.2018.0020. Epub 2018 Jun 15.

DOI: 10.1098/rsfs.2018.0020
PMID: 29951197
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6015807/

Abstract

Computational models of vision have advanced in recent years at a rapid rate, rivalling in some areas human-level performance. Much of the progress to date has focused on analysing the visual scene at the object level: the recognition and localization of objects in the scene. Human understanding of images reaches a richer and deeper image understanding both 'below' the object level, such as identifying and localizing object parts and sub-parts, as well as 'above' the object level, such as identifying object relations, and agents with their actions and interactions. In both cases, understanding depends on recovering meaningful structures in the image, and their components, properties and inter-relations, a process referred to here as 'image interpretation'. In this paper, we describe recent directions, based on human and computer vision studies, towards human-like image interpretation, beyond the reach of current schemes, both below the object level, as well as some aspects of image interpretation at the level of meaningful configurations beyond the recognition of individual objects, and in particular, interactions between two people in close contact. In both cases the recognition process depends on the detailed interpretation of so-called 'minimal images', and at both levels recognition depends on combining 'bottom-up' processing, proceeding from low to higher levels of a processing hierarchy, together with 'top-down' processing, proceeding from high-level to lower-level stages of visual analysis.

Similar articles

1
Image interpretation above and below the object level.
Interface Focus. 2018 Aug 6;8(4):20180020. doi: 10.1098/rsfs.2018.0020. Epub 2018 Jun 15.
2
Full interpretation of minimal images.
Cognition. 2018 Feb;171:65-84. doi: 10.1016/j.cognition.2017.10.006. Epub 2017 Nov 4.
3
Automatic anatomy recognition in whole-body PET/CT images.
Med Phys. 2016 Jan;43(1):613. doi: 10.1118/1.4939127.
4
Visual recognition based on temporal cortex cells: viewer-centred processing of pattern configuration.
Z Naturforsch C J Biosci. 1998 Jul-Aug;53(7-8):518-41. doi: 10.1515/znc-1998-7-807.
5
Image interpretation by a single bottom-up top-down cycle.
Proc Natl Acad Sci U S A. 2008 Sep 23;105(38):14298-303. doi: 10.1073/pnas.0800968105. Epub 2008 Sep 16.
6
What are the Visual Features Underlying Rapid Object Recognition?
Front Psychol. 2011 Nov 15;2:326. doi: 10.3389/fpsyg.2011.00326. eCollection 2011.
7
Image processing strategies based on saliency segmentation for object recognition under simulated prosthetic vision.
Artif Intell Med. 2018 Jan;84:64-78. doi: 10.1016/j.artmed.2017.11.001. Epub 2017 Nov 10.
8
The representational hierarchy in human and artificial visual systems in the presence of object-scene regularities.
PLoS Comput Biol. 2023 Apr 28;19(4):e1011086. doi: 10.1371/journal.pcbi.1011086. eCollection 2023 Apr.
9
Causal Evidence for a Double Dissociation between Object- and Scene-Selective Regions of Visual Cortex: A Preregistered TMS Replication Study.
J Neurosci. 2021 Jan 27;41(4):751-756. doi: 10.1523/JNEUROSCI.2162-20.2020. Epub 2020 Dec 1.
10
Combined top-down/bottom-up segmentation.
IEEE Trans Pattern Anal Mach Intell. 2008 Dec;30(12):2109-25. doi: 10.1109/TPAMI.2007.70840.

Cited by

1
What Does a Language-And-Vision Transformer See: The Impact of Semantic Information on Visual Representations.
Front Artif Intell. 2021 Dec 3;4:767971. doi: 10.3389/frai.2021.767971. eCollection 2021.
2
Minimal videos: Trade-off between spatial and temporal information in human and machine vision.
Cognition. 2020 Aug;201:104263. doi: 10.1016/j.cognition.2020.104263. Epub 2020 Apr 20.

References

1
OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields.
IEEE Trans Pattern Anal Mach Intell. 2021 Jan;43(1):172-186. doi: 10.1109/TPAMI.2019.2929257. Epub 2020 Dec 4.
2
Full interpretation of minimal images.
Cognition. 2018 Feb;171:65-84. doi: 10.1016/j.cognition.2017.10.006. Epub 2017 Nov 4.
3
Perceiving social interactions in the posterior superior temporal sulcus.
Proc Natl Acad Sci U S A. 2017 Oct 24;114(43):E9145-E9152. doi: 10.1073/pnas.1714471114. Epub 2017 Oct 9.
4
A dedicated network for social interaction processing in the primate brain.
Science. 2017 May 19;356(6339):745-749. doi: 10.1126/science.aam6383.
5
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.
IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):834-848. doi: 10.1109/TPAMI.2017.2699184. Epub 2017 Apr 27.
6
Show and Tell: Lessons Learned from the 2015 MSCOCO Image Captioning Challenge.
IEEE Trans Pattern Anal Mach Intell. 2017 Apr;39(4):652-663. doi: 10.1109/TPAMI.2016.2587640. Epub 2016 Jul 7.
7
Fully Convolutional Networks for Semantic Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2017 Apr;39(4):640-651. doi: 10.1109/TPAMI.2016.2572683. Epub 2016 May 24.
8
Using goal-driven deep learning models to understand sensory cortex.
Nat Neurosci. 2016 Mar;19(3):356-65. doi: 10.1038/nn.4244.
9
Atoms of recognition in human and computer vision.
Proc Natl Acad Sci U S A. 2016 Mar 8;113(10):2744-9. doi: 10.1073/pnas.1513198113. Epub 2016 Feb 16.
10
Close Human Interaction Recognition Using Patch-Aware Models.
IEEE Trans Image Process. 2016 Jan;25(1):167-78. doi: 10.1109/TIP.2015.2498410. Epub 2015 Nov 5.