Luo Jie, Wu Yuanzhen, Liu Mengqi, Li Zhaojun, Wang Zhuo, Zheng Yi, Feng Lihui, Lu Jihua, He Fan
National Clinical Research Center for Mental Disorders, Beijing Key Laboratory of Mental Disorders, Beijing Anding Hospital, Beijing Institute for Brain Disorders Capital Medical University, De Sheng Men Wai An Kang Hu Tong 5 Hao, Xi Cheng Qu, Beijing, 100088, People's Republic of China.
Beijing Institute of Technology, School of Integrated Circuits and Electronics, Zhongguancun South Street 5 Hao, Hai Dian Qu, Beijing, 100081, China.
Child Adolesc Psychiatry Ment Health. 2024 Jan 29;18(1):19. doi: 10.1186/s13034-024-00708-0.
Major depressive disorder (MDD) and bipolar disorder (BD) are serious, chronic, disabling mental and emotional disorders whose symptoms often manifest atypically in children and adolescents, which makes diagnosis difficult in the absence of objective physiological indicators. We therefore aimed to identify MDD and BD objectively in children and adolescents by examining their voiceprint features.
This study included 150 participants aged 6 to 16 years: 50 patients with MDD, 50 patients with BD, and 50 healthy controls. After voiceprint data were collected, a chi-square test was used to screen and extract voiceprint features specific to emotional disorders in children and adolescents. The selected features were then used to build training and testing datasets at a ratio of 7:3. Various machine learning and deep learning algorithms were compared on the training dataset, and the best-performing algorithm was used to classify the testing dataset, for which sensitivity, specificity, accuracy, and the ROC curve were calculated.
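The screening and splitting steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline. The abstract does not say how the chi-square test was applied or whether the 7:3 split was stratified by group. The sketch assumes a feature is binarized (for example, above or below its median) before a Pearson chi-square statistic is computed, and that each group is split 7:3 separately. The function names and the example counts are hypothetical.

```python
import random
from collections import defaultdict


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each group separately (an assumption)."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for index, label in enumerate(labels):
        by_group[label].append(index)
    train_idx, test_idx = [], []
    for label in sorted(by_group):
        indices = by_group[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical counts: rows are MDD, BD, HC; columns are
# "feature below median" and "feature above median".
example_table = [[10, 40], [40, 10], [25, 25]]
score = chi_square_statistic(example_table)  # larger score = stronger group dependence

# 50 participants per group, as in the study design.
labels = ["MDD"] * 50 + ["BD"] * 50 + ["HC"] * 50
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With 150 participants, a 7:3 split yields 105 training and 45 testing samples. If the split is stratified, that is 35 and 15 per group.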
The three groups differed in their clustering centers for several voice features, including root mean square energy, power spectral slope, low-frequency percentile energy level, high-frequency spectral slope, spectral harmonic gain, and audio signal energy level. A linear SVM model performed best on the training dataset and achieved an overall accuracy of 95.6% in classifying the three groups in the testing dataset, with a sensitivity of 93.3% for MDD and 100% for BD, a specificity of 93.3%, and an AUC of 0.967 for MDD and 1 for BD.
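To show how accuracy and sensitivity relate to classification counts, the sketch below computes them from a three-class confusion matrix. The abstract does not report a confusion matrix, so the counts here are hypothetical. They were chosen only to be arithmetically consistent with the reported figures, assuming a 45-sample test set with 15 participants per group. Reading the reported specificity as the fraction of healthy controls classified as healthy is also an assumption.

```python
CLASSES = ["MDD", "BD", "HC"]

# Hypothetical counts (not from the paper): rows = true group, columns = predicted group.
confusion = [
    [14, 0, 1],   # true MDD
    [0, 15, 0],   # true BD
    [1, 0, 14],   # true HC
]


def accuracy(matrix):
    """Fraction of all samples that fall on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall(matrix, k):
    """Fraction of true class-k samples predicted as class k (sensitivity for class k)."""
    return matrix[k][k] / sum(matrix[k])


overall_accuracy = accuracy(confusion)
sensitivity_mdd = recall(confusion, CLASSES.index("MDD"))
sensitivity_bd = recall(confusion, CLASSES.index("BD"))
# Assumed reading of "specificity": healthy controls correctly labelled healthy.
control_specificity = recall(confusion, CLASSES.index("HC"))

print(f"accuracy={overall_accuracy:.3f}")
print(f"sensitivity MDD={sensitivity_mdd:.3f}, BD={sensitivity_bd:.3f}")
print(f"control specificity={control_specificity:.3f}")
```

Under these assumptions, 43 of 45 correct classifications gives 95.6% accuracy, and 14 of 15 gives 93.3%.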
Machine learning applied to the voice features of children and adolescents can effectively differentiate between MDD and BD in a population, and voice features hold promise as an objective physiological indicator for the auxiliary diagnosis of mood disorders in clinical practice.