
Similar Articles

1. Multimodal Attention Network for Trauma Activity Recognition from Spoken Language and Environmental Sound.
Proc (IEEE Int Conf Healthc Inform). 2019 Jun;2019. doi: 10.1109/ichi.2019.8904713. Epub 2019 Nov 21.
2. Speech-Based Activity Recognition for Trauma Resuscitation.
Proc (IEEE Int Conf Healthc Inform). 2020 Nov-Dec;2020. doi: 10.1109/ichi48887.2020.9374372. Epub 2021 Mar 12.
3. Language-Based Process Phase Detection in the Trauma Resuscitation.
Proc (IEEE Int Conf Healthc Inform). 2017 Aug;2017:239-247. doi: 10.1109/ICHI.2017.50. Epub 2017 Sep 14.
4. Hybrid Attention based Multimodal Network for Spoken Language Classification.
Proc Conf Assoc Comput Linguist Meet. 2018 Aug;2018:2379-2390.
5. LGCCT: A Light Gated and Crossed Complementation Transformer for Multimodal Speech Emotion Recognition.
Entropy (Basel). 2022 Jul 21;24(7):1010. doi: 10.3390/e24071010.
6. Deep Multimodal Learning for Emotion Recognition in Spoken Language.
Proc IEEE Int Conf Acoust Speech Signal Process. 2018 Apr;2018:5079-5083. doi: 10.1109/ICASSP.2018.8462440. Epub 2018 Sep 13.
7. Real-time Context-Aware Multimodal Network for Activity and Activity-Stage Recognition from Team Communication in Dynamic Clinical Settings.
Proc ACM Interact Mob Wearable Ubiquitous Technol. 2023 Mar;7(1). doi: 10.1145/3580798. Epub 2023 Mar 28.
8. Speech Intention Classification with Multimodal Deep Learning.
Adv Artif Intell. 2017 May;10233:260-271. doi: 10.1007/978-3-319-57351-9_30. Epub 2017 Apr 11.
9. MLNet: a multi-level multimodal named entity recognition architecture.
Front Neurorobot. 2023 Jun 20;17:1181143. doi: 10.3389/fnbot.2023.1181143. eCollection 2023.
10. Hierarchical Attention-Based Multimodal Fusion Network for Video Emotion Recognition.
Comput Intell Neurosci. 2021 Sep 25;2021:5585041. doi: 10.1155/2021/5585041. eCollection 2021.

Cited By

1. Focusing on What Matters: Fine-grained Medical Activity Recognition for Trauma Resuscitation via Actor Tracking.
Conf Comput Vis Pattern Recognit Workshops. 2024 Jun;2024:4950-4958. doi: 10.1109/cvprw63382.2024.00500. Epub 2024 Sep 27.
2. Real-time Context-Aware Multimodal Network for Activity and Activity-Stage Recognition from Team Communication in Dynamic Clinical Settings.
Proc ACM Interact Mob Wearable Ubiquitous Technol. 2023 Mar;7(1). doi: 10.1145/3580798. Epub 2023 Mar 28.
3. Multi-dimensional task recognition for human-robot teaming: literature review.
Front Robot AI. 2023 Aug 7;10:1123374. doi: 10.3389/frobt.2023.1123374. eCollection 2023.
4. Video-based Concurrent Activity Recognition for Trauma Resuscitation.
Proc (IEEE Int Conf Healthc Inform). 2020 Nov-Dec;2020. doi: 10.1109/ichi48887.2020.9374399. Epub 2021 Mar 12.
5. Speech-Based Activity Recognition for Trauma Resuscitation.
Proc (IEEE Int Conf Healthc Inform). 2020 Nov-Dec;2020. doi: 10.1109/ichi48887.2020.9374372. Epub 2021 Mar 12.

References

1. Human Conversation Analysis Using Attentive Multimodal Networks with Hierarchical Encoder-Decoder.
Proc ACM Int Conf Multimed. 2018 Oct;2018:537-545. doi: 10.1145/3240508.3240714.
2. Speech Intention Classification with Multimodal Deep Learning.
Adv Artif Intell. 2017 May;10233:260-271. doi: 10.1007/978-3-319-57351-9_30. Epub 2017 Apr 11.
3. Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment.
Proc Conf Assoc Comput Linguist Meet. 2018 Jul;2018:2225-2235.
4. Hybrid Attention based Multimodal Network for Spoken Language Classification.
Proc Conf Assoc Comput Linguist Meet. 2018 Aug;2018:2379-2390.
5. Deep Learning for RFID-Based Activity Recognition.
Proc Int Conf Embed Netw Sens Syst. 2016 Nov;2016:164-175. doi: 10.1145/2994551.2994569.
6. Activity Recognition for Medical Teamwork Based on Passive RFID.
IEEE Int Conf RFID. 2016 May;2016. doi: 10.1109/RFID.2016.7488002. Epub 2016 Jun 9.
7. Language-Based Process Phase Detection in the Trauma Resuscitation.
Proc (IEEE Int Conf Healthc Inform). 2017 Aug;2017:239-247. doi: 10.1109/ICHI.2017.50. Epub 2017 Sep 14.
8. Online Process Phase Detection Using Multimodal Deep Learning.
Ubiquitous Comput Electron Mob Commun Conf (UEMCON) IEEE Annu. 2016 Oct;2016. doi: 10.1109/UEMCON.2016.7777912. Epub 2016 Dec 12.
9. Statistical modeling and recognition of surgical workflow.
Med Image Anal. 2012 Apr;16(3):632-41. doi: 10.1016/j.media.2010.10.001. Epub 2010 Dec 8.
10. Face recognition: a convolutional neural-network approach.
IEEE Trans Neural Netw. 1997;8(1):98-113. doi: 10.1109/72.554195.

Multimodal Attention Network for Trauma Activity Recognition from Spoken Language and Environmental Sound.

Author Information

Gu Yue, Zhang Ruiyu, Zhao Xinwei, Chen Shuhong, Abdulbaqi Jalal, Marsic Ivan, Cheng Megan, Burd Randall S

Affiliations

Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ, USA.

Trauma and Burn Surgery, Children's National Medical Center, Washington, DC, USA.

Publication Information

Proc (IEEE Int Conf Healthc Inform). 2019 Jun;2019. doi: 10.1109/ichi.2019.8904713. Epub 2019 Nov 21.

DOI: 10.1109/ichi.2019.8904713
PMID: 32201857
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7085888/
Abstract

Trauma activity recognition aims to detect, recognize, and predict the activities (or tasks) during a trauma resuscitation. Previous work has mainly focused on using various sensor data including image, RFID, and vital signals to generate the trauma event log. However, spoken language and environmental sound, which contain rich communication and contextual information necessary for trauma team cooperation, are still largely ignored. In this paper, we propose a multimodal attention network (MAN) that uses both verbal transcripts and environmental audio stream as input; the model extracts textual and acoustic features using a multi-level multi-head attention module, and forms a final shared representation for trauma activity classification. We evaluated the proposed architecture on 75 actual trauma resuscitation cases collected from a hospital. We achieved 72.4% accuracy with 0.705 F1 score, demonstrating that our proposed architecture is useful and efficient. These results also show that using spoken language and environmental audio indeed helps identify hard-to-recognize activities, compared to previous approaches. We also provide a detailed analysis of the performance and generalization of the proposed multimodal attention network.
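The pipeline the abstract describes (extract textual and acoustic features, fuse them with attention, and classify the shared representation) can be sketched in minimal NumPy. This is an illustrative sketch only, not the paper's MAN architecture: the single attention head, the feature dimensions, the mean-pooling fusion, and the random classifier weights are all placeholder assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: each query attends over all keys."""
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_q, n_k) similarity scores
    return softmax(scores, axis=-1) @ v       # (n_q, d) weighted values

d = 8                                          # placeholder feature dimension
rng = np.random.default_rng(0)
text = rng.normal(size=(5, d))                 # 5 transcript-token embeddings
audio = rng.normal(size=(20, d))               # 20 acoustic-frame features

# Cross-modal attention: text tokens attend over the audio frames,
# so each token picks up the acoustic context relevant to it.
fused_text = attention(text, audio, audio)     # (5, d)

# Pool each modality and concatenate into a shared representation.
shared = np.concatenate([fused_text.mean(axis=0), audio.mean(axis=0)])  # (2d,)

# Linear classifier over activity classes (weights are untrained placeholders).
num_classes = 4
W = rng.normal(size=(2 * d, num_classes))
probs = softmax(shared @ W)                    # class probabilities
pred = int(np.argmax(probs))                   # predicted activity index
```

In the real model a multi-level multi-head attention module replaces the single `attention` call, and the classifier is trained end-to-end; the sketch only shows how the two modalities meet in one shared vector.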
