LLM-Enhanced Multi-Teacher Knowledge Distillation for Modality-Incomplete Emotion Recognition in Daily Healthcare.

Author Information

Zhang Yuzhe, Liu Huan, Xiao Yang, Amoon Mohammed, Zhang Dalin, Wang Di, Yang Shusen, Quek Chai

Publication Information

IEEE J Biomed Health Inform. 2024 Sep 30;PP. doi: 10.1109/JBHI.2024.3470338.

Abstract

The critical importance of monitoring and recognizing human emotional states in healthcare has led to a surge in proposals for EEG-based multimodal emotion recognition in recent years. However, practical challenges arise in acquiring EEG signals in daily healthcare settings due to stringent data acquisition conditions, resulting in the issue of incomplete modalities. Existing studies have turned to knowledge distillation as a means to mitigate this problem by transferring knowledge from multimodal networks to unimodal ones. However, these methods are constrained by the use of a single teacher model to transfer integrated feature extraction knowledge, particularly concerning spatial and temporal features in EEG data. To address this limitation, we propose a multi-teacher knowledge distillation framework enhanced with a Large Language Model (LLM), aimed at facilitating effective feature learning in the student network by transferring knowledge of extracting integrated features. Specifically, we employ an LLM as the teacher for extracting temporal features and a graph convolutional neural network for extracting spatial features. To further enhance knowledge distillation, we introduce causal masking and a confidence indicator into the LLM to facilitate the transfer of the most discriminative features. Extensive testing on the DEAP and MAHNOB-HCI datasets demonstrates that our model outperforms existing methods in the modality-incomplete scenario. This study underscores the potential application of large models in this field. The code is publicly available at https://github.com/yuzhezhangEEG/LM-KD.
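The core idea above — a student network learning from two specialized teachers, with a confidence indicator weighting how much each teacher contributes — can be illustrated with a minimal sketch. This is not the paper's actual loss (see the linked repository for that); the function names, the temperature-scaled KL objective, and the simple convex combination of teacher distributions are illustrative assumptions in the spirit of standard knowledge distillation.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T yields softer distributions."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def multi_teacher_kd_loss(student_logits, temporal_logits, spatial_logits,
                          confidence=0.5, T=2.0):
    """Sketch of a multi-teacher distillation objective.

    `temporal_logits` stands in for the LLM (temporal) teacher and
    `spatial_logits` for the GCN (spatial) teacher; `confidence` is a
    hypothetical scalar indicator weighting the temporal teacher.
    """
    p_temporal = softmax(temporal_logits, T)
    p_spatial = softmax(spatial_logits, T)
    # Confidence-weighted mixture of the two teachers' soft targets.
    target = confidence * p_temporal + (1.0 - confidence) * p_spatial
    q_student = softmax(student_logits, T)
    # T^2 rescaling keeps gradient magnitudes comparable across temperatures.
    return (T ** 2) * kl_divergence(target, q_student)
```

When the student's logits already match both teachers, the loss is zero; as the student's distribution drifts from the confidence-weighted target, the loss grows, which is the signal driving the unimodal student toward the multimodal teachers' integrated features.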
