Cross-Silo, Privacy-Preserving, and Lightweight Federated Multimodal System for the Identification of Major Depressive Disorder Using Audio and Electroencephalogram.

Authors

Gupta Chetna, Khullar Vikas, Goyal Nitin, Saini Kirti, Baniwal Ritu, Kumar Sushil, Rastogi Rashi

Affiliations

Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura 140417, Punjab, India.

Department of Computer Science and Engineering, School of Engineering and Technology, Central University of Haryana, Mahendergarh 123031, Haryana, India.

Publication information

Diagnostics (Basel). 2023 Dec 25;14(1):43. doi: 10.3390/diagnostics14010043.

DOI:10.3390/diagnostics14010043

PMID:38201350

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10795654/

Abstract

Depression remains one of the most serious health problems worldwide. If left untreated, it can lead to suicidal thoughts and attempts. Major Depressive Disorder (MDD) needs to be diagnosed properly and evaluated in its early stages to limit its adverse effects. Early detection is critical to identify a variety of serious conditions. To provide safe and effective protection to MDD patients, it is crucial to automate diagnosis and make decision-making tools widely available. Although there are various classification systems for the diagnosis of MDD, no reliable, secure method that meets these requirements has been established to date. In this paper, a federated deep learning-based multimodal system for MDD classification using electroencephalography (EEG) and audio datasets is presented that also meets data privacy requirements. The performance of the federated learning (FL) model was tested on independent and identically distributed (IID) and non-IID data. The study began by extracting features from several pre-trained models and ultimately selected bidirectional long short-term memory (Bi-LSTM) as the base model, because it had the highest validation accuracy on audio data: 91%, compared with 85% for a convolutional neural network and 89% for an LSTM. The Bi-LSTM model also achieved a validation accuracy of 98.9% on EEG data. The FL method was then used to perform experiments on IID and non-IID datasets. The FL-based multimodal model achieved a training and validation accuracy of 99.9% when trained and evaluated on both IID and non-IID datasets. These results show that the FL multimodal system performs almost as well as the Bi-LSTM multimodal system and emphasize its suitability for processing IID and non-IID data. Several clients were found to perform better than conventional pre-trained models in a multimodal framework for federated learning using EEG and audio datasets.
The proposed framework stands out from other classification techniques for MDD due to its special features, such as multimodality and data privacy for edge machines with limited resources. Due to these additional features, the framework concept is the most suitable alternative approach for the early classification of MDD patients.
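
As a rough illustration of the two ideas the abstract relies on, the sketch below shows how IID and non-IID client splits might be simulated and how client models might be aggregated in a federated setting. The abstract does not say which partitioning scheme or aggregation rule the authors used, so the label-sorted split and the FedAvg-style weighted average here are assumptions, and the function names `partition_iid`, `partition_non_iid`, and `fed_avg` are hypothetical.

```python
import random
from collections import Counter


def partition_iid(labels, num_clients, seed=0):
    """Shuffle sample indices and deal them round-robin, so every client
    sees roughly the same label mix."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return [indices[c::num_clients] for c in range(num_clients)]


def partition_non_iid(labels, num_clients):
    """Sort indices by label and cut them into contiguous shards, so each
    client sees a skewed label mix (one common way to simulate non-IID data;
    assumed here, not taken from the paper)."""
    indices = sorted(range(len(labels)), key=lambda i: labels[i])
    shard = len(indices) // num_clients
    parts = [indices[c * shard:(c + 1) * shard] for c in range(num_clients)]
    parts[-1].extend(indices[num_clients * shard:])  # leftover samples
    return parts


def fed_avg(client_weights, client_sizes):
    """Average client parameter vectors, weighting each client by its number
    of samples (FedAvg-style aggregation; assumed, not confirmed by the source)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


if __name__ == "__main__":
    labels = [0] * 6 + [1] * 6  # toy labels: 0 = control, 1 = MDD
    for name, parts in [("IID", partition_iid(labels, 3)),
                        ("non-IID", partition_non_iid(labels, 3))]:
        # Print the per-client label counts for each split.
        print(name, [dict(Counter(labels[i] for i in p)) for p in parts])
    # Two clients with 1 and 3 samples; the larger client dominates the average.
    print(fed_avg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this toy setup only parameter vectors leave each client, never the raw audio or EEG samples, which is the property the abstract refers to as data privacy. A real system would replace the plain lists with the weights of the trained Bi-LSTM model.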
