A Hybrid Learning-Architecture for Mental Disorder Detection Using Emotion Recognition.

Authors

Aina Joseph, Akinniyi Oluwatunmise, Rahman Md Mahmudur, Odero-Marah Valerie, Khalifa Fahmi

Affiliations

Electrical and Computer Engineering Department, School of Engineering, Morgan State University, Baltimore, MD 21251, USA.

Department of Computer Science, School of Computer, Mathematical and Natural Sciences, Morgan State University, Baltimore, MD 21251, USA.

Publication information

IEEE Access. 2024;12:91410-91425. doi: 10.1109/access.2024.3421376. Epub 2024 Jul 1.

DOI:10.1109/access.2024.3421376

PMID:39054996

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11270886/

Abstract

Mental illness has become a prevalent global health concern that affects individuals across demographics. Timely detection and accurate diagnosis of mental disorders are crucial for effective treatment and support, as late diagnosis can lead to suicidal or otherwise harmful behaviors and, ultimately, death. To this end, the present study introduces a novel pipeline for the analysis of facial expressions, leveraging both the AffectNet and 2013 Facial Emotion Recognition (FER) datasets. This research goes beyond traditional diagnostic methods by contributing a system capable of generating a comprehensive mental disorder dataset and concurrently predicting mental disorders from facial emotional cues. In particular, we introduce a hybrid architecture for mental disorder detection that uses the state-of-the-art object detection algorithm YOLOv8 to detect and classify visual cues associated with specific mental disorders. To achieve accurate predictions, an integrated learning architecture based on the fusion of Convolutional Neural Network (CNN) and Vision Transformer (ViT) models is developed to form an ensemble classifier that predicts the presence of mental illness (e.g., depression, anxiety, and other mental disorders). The proposed ensemble technique improves the overall accuracy to about 81%. To ensure transparency and interpretability, we integrate techniques such as Gradient-weighted Class Activation Mapping (Grad-CAM) and saliency maps to highlight the regions of the input image that contribute most to the model's predictions. This gives healthcare professionals a clear understanding of the features influencing the system's decisions, thereby enhancing trust and supporting a more informed diagnostic process.
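
The abstract does not state how the CNN and ViT outputs are fused, so the following is only a minimal sketch of one common option, soft voting, in which the two models' class probabilities are averaged before taking the most likely class. The label set `CLASSES`, the function names `softmax` and `soft_vote`, the `cnn_weight` parameter, and the logit values are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import numpy as np

# Hypothetical label set; the abstract only names depression, anxiety, and "other".
CLASSES = ["depression", "anxiety", "other"]


def softmax(logits):
    """Row-wise softmax, subtracting the row maximum for numerical stability."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def soft_vote(cnn_logits, vit_logits, cnn_weight=0.5):
    """Weighted average of two models' class probabilities (assumed fusion rule).

    Returns the fused probabilities and the index of the top class per sample.
    """
    if not 0.0 <= cnn_weight <= 1.0:
        raise ValueError("cnn_weight must lie in [0, 1]")
    fused = cnn_weight * softmax(cnn_logits) + (1.0 - cnn_weight) * softmax(vit_logits)
    return fused, fused.argmax(axis=-1)


# Made-up logits for two face crops, standing in for real CNN and ViT outputs.
cnn_logits = np.array([[2.0, 0.5, 0.1], [0.2, 0.3, 1.5]])
vit_logits = np.array([[1.0, 1.2, 0.0], [0.1, 2.0, 0.4]])

probs, pred = soft_vote(cnn_logits, vit_logits)
labels = [CLASSES[i] for i in pred]
print(labels)
```

With equal weights, a confident prediction from one model can outvote a weaker, conflicting prediction from the other; `cnn_weight` could instead be tuned on a validation set if one model is known to be more reliable.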