
BACKGROUND

This study aims to develop an AI-powered platform that uses Mel-spectrogram analysis and convolutional neural networks (CNNs) to automate severity assessment of unilateral vocal cord paralysis (UVCP) from voice recordings, providing an objective basis for individualized clinical treatment plans.

METHODS

To identify the severity of UVCP, this study developed a CNN model, TripleConvNet. Voice samples were collected from 131 healthy individuals and 292 patients with confirmed UVCP at the Eye and ENT Hospital of Fudan University. Based on the degree of vocal fold compensation, the patients were divided into three groups: decompensated (84 cases), partially compensated (98 cases), and fully compensated (110 cases). TripleConvNet took Mel-spectrograms and their first- and second-order differential features as inputs, classified patients by severity, and was systematically evaluated on the UVCP severity grading task.
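The abstract does not say how the first- and second-order differential features were computed. As an illustration only, the sketch below assumes the common regression-based "delta" convention from speech processing and stacks a log-Mel spectrogram with its delta and delta-delta into a three-channel input. The function names `delta` and `stack_three_channels`, the window `width=2`, and the edge-clamping rule are assumptions made for this example, not details taken from the study. In practice a library such as librosa (`librosa.feature.melspectrogram`, `librosa.feature.delta`) would typically be used instead.

```python
def delta(frames, width=2):
    """Regression-based time derivative of a spectrogram.

    frames: list of time frames, each a list of Mel-band values.
    width:  number of neighbouring frames used on each side (assumed value).
    Frames beyond either end are clamped to the first/last frame.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for b in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                nxt = frames[min(t + n, n_frames - 1)][b]
                prv = frames[max(t - n, 0)][b]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_three_channels(log_mel):
    """Return [static, first-order delta, second-order delta] channels."""
    d1 = delta(log_mel)
    d2 = delta(d1)  # delta of the delta
    return [log_mel, d1, d2]


# Toy input: 10 frames, 2 bands, each band rising linearly over time.
ramp = [[float(t), 2.0 * t] for t in range(10)]
channels = stack_three_channels(ramp)
```

Away from the edges, the first-order channel recovers each band's slope and the second-order channel is zero, which is the expected behaviour for a linear ramp. Near the first and last frames the clamping distorts the values, so a real pipeline would need to choose its padding rule deliberately.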

RESULTS

TripleConvNet achieved a classification accuracy of 74.3% across four classes: healthy voices and the decompensated, partially compensated, and fully compensated UVCP groups.

CONCLUSION

This study demonstrates the potential of deep learning-based non-invasive voice analysis for precise grading of UVCP severity. The proposed method offers a promising clinical tool to assist physicians in disease assessment and personalized treatment planning.

Research on automatic assessment of the severity of unilateral vocal cord paralysis based on Mel-spectrogram and convolutional neural networks.

AUTHOR INFORMATION

Ma Shuaichi, Liao Wenwen, Zhang Yi, Zhang Fan, Wang Yimiao, Lu Zhiyan, Zhao Chen, Yu Jianbo, He Peijie

AFFILIATIONS

School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China.

ENT Institute and Department of Otorhinolaryngology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, People's Republic of China.

PUBLICATION INFORMATION

Biomed Eng Online. 2025 Jun 21;24(1):76. doi: 10.1186/s12938-025-01401-9.
