A developmental model of audio-visual attention (MAVA) for bimodal language learning in infants and robots.

Affiliations

ETIS, UMR 8051, ENSEA, CY Cergy Paris Université, CNRS, Cergy-Pontoise, France.

Service de Psychiatrie de l'Enfant et de l'Adolescent, Hôpital Pitié-Salpêtrière, AP-HP, Paris, France.

Publication information

Sci Rep. 2024 Sep 3;14(1):20492. doi: 10.1038/s41598-024-69245-2.

DOI: 10.1038/s41598-024-69245-2
PMID: 39242623
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11379723/
Abstract

A social individual must manage the large amount of complex information in their environment, relative to their own goals, in order to extract what is relevant. This paper presents a neural architecture that aims to reproduce in robots the attention mechanisms (alerting, orienting, and selecting) that are efficient in humans during audiovisual tasks. We propose a developmental model of audio-visual attention (MAVA) that combines Hebbian learning with a competition between saliency maps based on visual movement and audio energy. We evaluated the system on its ability to identify relevant sources of information on the faces of subjects emitting vowels. MAVA effectively combines bottom-up and top-down information to orient the system toward pertinent areas. The system has several advantages, including online and autonomous learning, low computation time, and robustness to environmental noise. MAVA outperforms other artificial models for detecting speech sources under various noise conditions.
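
The abstract names three ingredients: saliency from visual movement, audio energy, and Hebbian learning followed by a competition. The toy sketch below shows one way such pieces could fit together on a one-dimensional "image". It is not the MAVA architecture from the paper: every function name, the learning rate, the multiplicative top-down gain, and the training scenario are assumptions made purely for illustration.

```python
# Toy illustration only; NOT the authors' MAVA implementation.
# All names, constants, and the 1-D "image" are hypothetical.

def motion_saliency(prev_frame, frame):
    """Bottom-up visual saliency: absolute frame difference per location."""
    return [abs(b - a) for a, b in zip(prev_frame, frame)]

def audio_energy(samples):
    """Mean squared amplitude of one audio window."""
    return sum(s * s for s in samples) / len(samples)

def hebbian_update(weights, saliency, energy, rate=0.1):
    """Hebbian rule: a weight grows when motion and sound co-occur.
    (Unbounded growth; a real model would normalise or decay weights.)"""
    return [w + rate * energy * s for w, s in zip(weights, saliency)]

def select_focus(weights, saliency):
    """Winner-take-all over saliency scaled by learned (top-down) weights."""
    scores = [(1.0 + w) * s for w, s in zip(weights, saliency)]
    return max(range(len(scores)), key=scores.__getitem__)

def demo():
    still = [0.0, 0.0, 0.0, 0.0]
    mouth_moves = [0.0, 0.5, 0.0, 0.0]       # location 1 moves with sound
    distractor_moves = [0.0, 0.0, 0.0, 1.0]  # location 3 moves in silence
    vowel = [1.0, -1.0, 1.0, -1.0]
    silence = [0.0, 0.0, 0.0, 0.0]

    # Test scene: both locations move, the distractor slightly more.
    test_saliency = motion_saliency(still, [0.0, 0.5, 0.0, 0.6])

    weights = [0.0] * 4
    before = select_focus(weights, test_saliency)

    for _ in range(10):
        for frame, audio in ((mouth_moves, vowel), (distractor_moves, silence)):
            sal = motion_saliency(still, frame)
            weights = hebbian_update(weights, sal, audio_energy(audio))

    after = select_focus(weights, test_saliency)
    return before, after

if __name__ == "__main__":
    print(demo())  # (3, 1)
```

Before training, the purely bottom-up competition picks the stronger mover at location 3. After ten audio-visual pairings, the learned weight at location 1 biases the competition toward the location whose movement co-occurred with sound.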

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f86e/11379723/e63143da2a05/41598_2024_69245_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f86e/11379723/1566d4ef4c3a/41598_2024_69245_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f86e/11379723/71abd47cbc62/41598_2024_69245_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f86e/11379723/4d6d8433c043/41598_2024_69245_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f86e/11379723/c47461d6937f/41598_2024_69245_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f86e/11379723/2cbf5601d6f0/41598_2024_69245_Fig6_HTML.jpg

Similar articles

1
A developmental model of audio-visual attention (MAVA) for bimodal language learning in infants and robots.
Sci Rep. 2024 Sep 3;14(1):20492. doi: 10.1038/s41598-024-69245-2.
2
Learning to match auditory and visual speech cues: social influences on acquisition of phonological categories.
Child Dev. 2015 Mar-Apr;86(2):362-78. doi: 10.1111/cdev.12320. Epub 2014 Nov 18.
3
Bimodal emotion congruency is critical to preverbal infants' abstract rule learning.
Dev Sci. 2016 May;19(3):382-93. doi: 10.1111/desc.12319. Epub 2015 Aug 17.
4
Cross-modal matching of audio-visual German and French fluent speech in infancy.
PLoS One. 2014 Feb 20;9(2):e89275. doi: 10.1371/journal.pone.0089275. eCollection 2014.
5
Learning bimodal structure in audio-visual data.
IEEE Trans Neural Netw. 2009 Dec;20(12):1898-910. doi: 10.1109/TNN.2009.2032182.
6
An object-based visual attention model for robotic applications.
IEEE Trans Syst Man Cybern B Cybern. 2010 Oct;40(5):1398-412. doi: 10.1109/TSMCB.2009.2038895. Epub 2010 Feb 2.
7
Language experience influences audiovisual speech integration in unimodal and bimodal bilingual infants.
Dev Sci. 2019 Jan;22(1):e12701. doi: 10.1111/desc.12701. Epub 2018 Jul 16.
8
Individual differences in the acquisition of non-linguistic audio-visual associations in 5 year olds.
Dev Sci. 2020 Jul;23(4):e12913. doi: 10.1111/desc.12913. Epub 2019 Nov 3.
9
Audio-visual perception system for a humanoid robotic head.
Sensors (Basel). 2014 May 28;14(6):9522-45. doi: 10.3390/s140609522.
10
Intersensory redundancy promotes infant detection of prosody in infant-directed speech.
J Exp Child Psychol. 2019 Jul;183:295-309. doi: 10.1016/j.jecp.2019.02.008. Epub 2019 Apr 4.
