Cross-modal guiding and reweighting network for multi-modal RSVP-based target detection.

Author information

Mao Jiayu, Qiu Shuang, Wei Wei, He Huiguang

Affiliations

Laboratory of Brain Atlas and Brain-Inspired Intelligence, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China.

Laboratory of Brain Atlas and Brain-Inspired Intelligence, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China.

Publication information

Neural Netw. 2023 Apr;161:65-82. doi: 10.1016/j.neunet.2023.01.009. Epub 2023 Jan 16.

DOI:10.1016/j.neunet.2023.01.009

PMID:36736001

Abstract

Rapid Serial Visual Presentation (RSVP)-based Brain-Computer Interfaces (BCIs) facilitate the high-throughput detection of rare target images by detecting evoked event-related potentials (ERPs). At present, the decoding accuracy of RSVP-based BCI systems limits their practical application. This study introduces eye movements (gaze and pupil information), referred to as the EYE modality, as an additional source of information to combine with the EEG-based BCI, forming a novel system for detecting target images in RSVP tasks. We performed an RSVP experiment, recorded EEG signals and eye movements simultaneously during a target detection task, and constructed a multi-modal dataset of 20 subjects. We also proposed a cross-modal guiding and fusion network that exploits both the EEG and EYE modalities and fuses them for better RSVP decoding performance. In this network, a two-branch backbone extracts features from the two modalities. A Cross-Modal Feature Guiding (CMFG) module guides EYE modality features to complement the EEG modality for better feature extraction. A Multi-scale Multi-modal Reweighting (MMR) module enhances the multi-modal features by exploring intra- and inter-modal interactions. Finally, Dual Activation Fusion (DAF) modulates the enhanced multi-modal features for effective fusion. The proposed network achieved a balanced accuracy of 88.00% (±2.29) on the collected dataset. Ablation studies and visualizations demonstrated the effectiveness of the proposed modules. This work indicates that introducing the EYE modality is effective in RSVP tasks, and that the proposed network is a promising method for RSVP decoding that further improves the performance of RSVP-based target detection systems.
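The abstract reports performance as balanced accuracy, a metric that suits RSVP tasks because target images are rare and plain accuracy can look high for a classifier that never detects a target. The following is a minimal sketch of the metric, assuming the standard binary definition (the mean of sensitivity and specificity); the paper's exact evaluation protocol is not given in this record, and the function name and toy labels below are hypothetical.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall for binary labels (1 = target, 0 = non-target).

    Assumes the standard definition; not taken from the paper itself.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # targets detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # targets missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-targets rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    sensitivity = tp / (tp + fn)  # recall on the target class
    specificity = tn / (tn + fp)  # recall on the non-target class
    return (sensitivity + specificity) / 2


# Hypothetical toy data: 2 targets among 10 presented images.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_nontarget = [0] * 10                    # never reports a target
some_model = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]    # one hit, one miss, one false alarm

print(balanced_accuracy(y_true, always_nontarget))  # 0.5
print(balanced_accuracy(y_true, some_model))        # 0.6875
```

In this toy case both predictors have a plain accuracy of 0.8, yet the one that never reports a target scores only 0.5 in balanced accuracy, which is the chance level for a binary task.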

