An audiovisual cognitive optimization strategy guided by salient object ranking for intelligent visual prothesis systems.

Affiliations

School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China.

Graduate School of Biomedical Engineering, UNSW, Sydney, NSW 2052, Australia.

Publication information

J Neural Eng. 2024 Nov 29;21(6). doi: 10.1088/1741-2552/ad94a4.

DOI:10.1088/1741-2552/ad94a4
PMID:39569905
Abstract

Visual prostheses are effective tools for restoring vision, yet real-world complexities pose ongoing challenges. The progress in AI has led to the emergence of the concept of intelligent visual prosthetics with auditory support, leveraging deep learning to create practical artificial vision perception beyond merely restoring natural sight for the blind.

This study introduces an object-based attention mechanism that simulates human gaze points when observing the external world to descriptions of physical regions. By transforming this mechanism into a ranking problem of salient entity regions, we introduce prior visual attention cues to build a new salient object ranking (SaOR) dataset, and propose a SaOR network aimed at providing depth perception for prosthetic vision. Furthermore, we propose a SaOR-guided image description method to align with human observation patterns, toward providing additional visual information by auditory feedback. Finally, the integration of the two aforementioned algorithms constitutes an audiovisual cognitive optimization strategy for prosthetic vision.

Through conducting psychophysical experiments based on scene description tasks under simulated prosthetic vision, we verify that the SaOR method improves the subjects' performance in terms of object identification and understanding the correlation among objects. Additionally, the cognitive optimization strategy incorporating image description further enhances their prosthetic visual cognition.

This offers valuable technical insights for designing next-generation intelligent visual prostheses and establishes a theoretical groundwork for developing their visual information processing strategies. Code will be made publicly available.
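To make the idea of "a ranking problem of salient entity regions" concrete, the toy sketch below orders object regions by the average attention value that falls inside each region. It is an illustration only and not the authors' SaOR network: the attention map, the object masks, and the function names `mean_saliency` and `rank_objects` are all hypothetical, and the mean-inside-mask score is an assumed stand-in for whatever scoring the published method uses.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]   # per-pixel attention values (hypothetical input)
Mask = List[List[bool]]    # True where a pixel belongs to the object


def mean_saliency(saliency: Grid, mask: Mask) -> float:
    """Average attention over pixels inside the mask (0.0 for an empty mask)."""
    total, count = 0.0, 0
    for sal_row, mask_row in zip(saliency, mask):
        for value, inside in zip(sal_row, mask_row):
            if inside:
                total += value
                count += 1
    return total / count if count else 0.0


def rank_objects(saliency: Grid, masks: Dict[str, Mask]) -> List[Tuple[str, float]]:
    """Return (object name, score) pairs sorted from most to least salient."""
    scores = [(name, mean_saliency(saliency, mask)) for name, mask in masks.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 3x4 attention map; the values are made up and stand in for gaze density.
saliency = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.3, 0.4],
]

# Two made-up object regions: "cup" in the top-left, "table" in the bottom-right.
masks = {
    "cup": [
        [True, True, False, False],
        [True, True, False, False],
        [False, False, False, False],
    ],
    "table": [
        [False, False, False, False],
        [False, False, True, True],
        [False, False, True, True],
    ],
}

ranking = rank_objects(saliency, masks)
for position, (name, score) in enumerate(ranking, start=1):
    print(position, name, round(score, 3))
```

In a pipeline like the one the abstract describes, such an ordering could decide which objects are rendered first in the prosthetic view or mentioned first in the spoken description; how the published system actually uses the ranking is not specified in this abstract.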

Similar articles

1
An audiovisual cognitive optimization strategy guided by salient object ranking for intelligent visual prothesis systems.
J Neural Eng. 2024 Nov 29;21(6). doi: 10.1088/1741-2552/ad94a4.
2
Image processing strategies based on saliency segmentation for object recognition under simulated prosthetic vision.
Artif Intell Med. 2018 Jan;84:64-78. doi: 10.1016/j.artmed.2017.11.001. Epub 2017 Nov 10.
3
An optimized content-aware image retargeting method: toward expanding the perceived visual field of the high-density retinal prosthesis recipients.
J Neural Eng. 2018 Apr;15(2):026025. doi: 10.1088/1741-2552/aa966d.
4
Image Processing Strategies Based on a Visual Saliency Model for Object Recognition Under Simulated Prosthetic Vision.
Artif Organs. 2016 Jan;40(1):94-100. doi: 10.1111/aor.12498. Epub 2015 May 15.
5
Sound effects: Multimodal input helps infants find displaced objects.
Br J Dev Psychol. 2017 Sep;35(3):317-333. doi: 10.1111/bjdp.12165. Epub 2016 Nov 21.
6
Semantic and structural image segmentation for prosthetic vision.
PLoS One. 2020 Jan 29;15(1):e0227677. doi: 10.1371/journal.pone.0227677. eCollection 2020.
7
Clinical Progress and Optimization of Information Processing in Artificial Visual Prostheses.
Sensors (Basel). 2022 Aug 30;22(17):6544. doi: 10.3390/s22176544.
8
Pre-processing visual scenes for retinal prosthesis systems: A comprehensive review.
Artif Organs. 2024 Nov;48(11):1223-1250. doi: 10.1111/aor.14824. Epub 2024 Jul 18.
9
PVGAN: a generative adversarial network for object simplification in prosthetic vision.
J Neural Eng. 2022 Sep 7;19(5). doi: 10.1088/1741-2552/ac8acf.
10
Congruent audiovisual speech enhances auditory attention decoding with EEG.
J Neural Eng. 2019 Nov 6;16(6):066033. doi: 10.1088/1741-2552/ab4340.

Cited by

1
Advancements in Ocular Neuro-Prosthetics: Bridging Neuroscience and Information and Communication Technology for Vision Restoration.
Biology (Basel). 2025 Jan 28;14(2):134. doi: 10.3390/biology14020134.