Research on automatic assessment of the severity of unilateral vocal cord paralysis based on Mel-spectrogram and convolutional neural networks.

Author information

Ma Shuaichi, Liao Wenwen, Zhang Yi, Zhang Fan, Wang Yimiao, Lu Zhiyan, Zhao Chen, Yu Jianbo, He Peijie

Affiliations

School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China.

ENT Institute and Department of Otorhinolaryngology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, People's Republic of China.

Publication information

Biomed Eng Online. 2025 Jun 21;24(1):76. doi: 10.1186/s12938-025-01401-9.

Abstract

BACKGROUND

This study aims to develop an AI-powered platform that uses Mel-spectrogram analysis and a convolutional neural network (CNN) to automate severity assessment of unilateral vocal cord paralysis (UVCP) from voice recordings, providing an objective basis for individualized clinical treatment plans.

METHODS

To identify the severity of UVCP, this study developed a CNN model, TripleConvNet. Voice samples were collected from 131 healthy individuals and 292 patients with confirmed UVCP at the Eye and ENT Hospital of Fudan University. Based on vocal fold compensatory function, the patients were divided into three groups: decompensated (84 cases), partially compensated (98 cases), and fully compensated (110 cases). TripleConvNet took Mel-spectrograms and their first- and second-order differential features as inputs, classified the samples by severity, and was systematically evaluated on the UVCP severity grading task.
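The first- and second-order differential (delta and delta-delta) features mentioned above can be illustrated with a minimal sketch. It assumes a log-Mel spectrogram has already been computed and is stored as a `[time][mel_band]` list of lists. The abstract does not give the delta window or say how the three feature maps are combined, so the regression window of 2 frames, the edge padding, and the three-channel stacking below are assumptions, and the function names are hypothetical rather than taken from TripleConvNet.

```python
def delta(frames, width=2):
    """Regression-based time derivative of a [time][mel_band] matrix.

    d[t] = sum_{n=1..width} n * (x[t+n] - x[t-n]) / (2 * sum_{n} n^2),
    with the first and last frames repeated at the edges.
    The window width is an assumption, not a value from the paper.
    """
    n_frames = len(frames)
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for b in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][b]
                behind = frames[max(t - n, 0)][b]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_three_channels(log_mel):
    """Return [static, delta, delta-delta] as three same-shaped channels."""
    d1 = delta(log_mel)   # first-order difference over time
    d2 = delta(d1)        # second-order difference over time
    return [log_mel, d1, d2]


# Toy input: 6 frames x 2 Mel bands, energy rising by 1 per frame.
ramp = [[float(t), float(t)] for t in range(6)]
channels = stack_three_channels(ramp)
```

For the toy ramp, the first-order channel equals the slope (1.0) in the interior frames and shrinks toward the edges because of the padding. A CNN input built this way has the same layout as a three-channel image.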

RESULTS

TripleConvNet achieved a classification accuracy of 74.3% in the four-class task of distinguishing healthy voices from the decompensated, partially compensated, and fully compensated UVCP groups.
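As a rough reference point for the 74.3% figure, the group sizes reported in the Methods give the accuracy of a trivial classifier that always predicts the largest group. This short calculation assumes the evaluation data follow the same proportions as the full cohort, which the abstract does not state. The variable names are illustrative.

```python
# Group sizes as reported in the Methods paragraph.
counts = {
    "healthy": 131,
    "decompensated": 84,
    "partially compensated": 98,
    "fully compensated": 110,
}

total = sum(counts.values())             # all participants
patients = total - counts["healthy"]     # UVCP patients only

# Share of each group in the cohort.
shares = {name: n / total for name, n in counts.items()}

# Accuracy of always predicting the largest group (here: healthy).
majority_baseline = max(counts.values()) / total

print(f"total={total}, patients={patients}")
print(f"majority-class baseline = {majority_baseline:.1%}")
```

Under that assumption the majority-class baseline is about 31%, and uniform guessing over four groups would give 25%.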

CONCLUSION

This study demonstrates the potential of deep learning-based non-invasive voice analysis for precise grading of UVCP severity. The proposed method offers a promising clinical tool to assist physicians in disease assessment and personalized treatment planning.
