Research on automatic assessment of the severity of unilateral vocal cord paralysis based on Mel-spectrogram and convolutional neural networks.

Authors

Ma Shuaichi, Liao Wenwen, Zhang Yi, Zhang Fan, Wang Yimiao, Lu Zhiyan, Zhao Chen, Yu Jianbo, He Peijie

Affiliations

School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai, 200093, China.

ENT Institute and Department of Otorhinolaryngology, Eye & ENT Hospital, Fudan University, 83 Fenyang Road, Shanghai, 200031, People's Republic of China.

Publication

Biomed Eng Online. 2025 Jun 21;24(1):76. doi: 10.1186/s12938-025-01401-9.

DOI:10.1186/s12938-025-01401-9
PMID:40544236
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12181906/
Abstract

BACKGROUND

This study aims to develop an AI-powered platform that uses Mel-spectrogram analysis and convolutional neural networks (CNNs) to automate the severity assessment of unilateral vocal cord paralysis (UVCP) from voice recordings, providing an objective basis for individualized clinical treatment plans.

METHODS

To accurately identify the severity of UVCP, this study developed the CNN model TripleConvNet. Voice samples were collected from 131 healthy individuals and 292 confirmed UVCP patients from the Eye and ENT Hospital of Fudan University. Based on vocal fold compensation function, the patients were divided into three groups: decompensated (84 cases), partially compensated (98 cases), and fully compensated (110 cases). Using Mel-spectrograms and their first- and second-order differential features as inputs, the TripleConvNet model classified patients by severity and was systematically evaluated for its performance in UVCP severity grading tasks.
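The input described above (a Mel-spectrogram plus its first- and second-order differences) can be pictured as a three-channel array. The following is a minimal sketch only: the abstract does not say how the differences were computed, so the regression-style delta, the `width` parameter, and the function names `delta` and `stack_with_deltas` are assumptions made for illustration and are not taken from TripleConvNet.

```python
from typing import List

# Rows are Mel bands, columns are time frames (an assumed layout).
Matrix = List[List[float]]


def delta(features: Matrix, width: int = 2) -> Matrix:
    """Regression-style difference along the time axis.

    Frames past either edge are replaced by the nearest edge frame.
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    result: Matrix = []
    for row in features:
        last = len(row) - 1
        new_row = []
        for t in range(len(row)):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = row[min(t + n, last)]
                behind = row[max(t - n, 0)]
                acc += n * (ahead - behind)
            new_row.append(acc / denom)
        result.append(new_row)
    return result


def stack_with_deltas(log_mel: Matrix) -> List[Matrix]:
    """Return [static, first-order, second-order] as three channels."""
    first = delta(log_mel)
    second = delta(first)  # second order = difference of the difference
    return [log_mel, first, second]


if __name__ == "__main__":
    # Toy 2-band, 6-frame "spectrogram": one rising band, one flat band.
    toy = [[0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [7.0] * 6]
    channels = stack_with_deltas(toy)
    print(len(channels), len(channels[0]), len(channels[0][0]))
```

Each channel keeps the shape of the original spectrogram, so a CNN can read the stack the way it would read a three-channel image.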

RESULTS

TripleConvNet achieved a classification accuracy of 74.3% in distinguishing between healthy voices and the UVCP decompensated, partially compensated, and fully compensated groups.
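For context, the group sizes given in METHODS can be turned into simple reference baselines, if the task is read as four-way classification. This is a rough sketch: the counts are numbers of participants, and the abstract does not report how recordings were split or balanced, so these baselines are illustrative and not a reported comparison.

```python
# Participant counts as stated in the abstract's METHODS section.
counts = {
    "healthy": 131,
    "decompensated": 84,
    "partially compensated": 98,
    "fully compensated": 110,
}

total = sum(counts.values())
patients = total - counts["healthy"]

# Accuracy of always predicting the largest group (assumes the evaluation
# set mirrors these proportions, which the abstract does not confirm).
majority_baseline = max(counts.values()) / total

# Accuracy of uniform random guessing over four classes.
uniform_baseline = 1 / len(counts)

print(f"participants: {total}, UVCP patients: {patients}")
print(f"majority-class baseline: {majority_baseline:.1%}")
print(f"uniform-guess baseline:  {uniform_baseline:.1%}")
```

Under these assumptions the three patient groups sum to the stated 292, and both baselines (about 31% and 25%) sit well below the reported 74.3%.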

CONCLUSION

This study demonstrates the potential of deep learning-based non-invasive voice analysis for precise grading of UVCP severity. The proposed method offers a promising clinical tool to assist physicians in disease assessment and personalized treatment planning.


Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd4d/12181906/36426c69d284/12938_2025_1401_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd4d/12181906/ae4e475955a0/12938_2025_1401_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd4d/12181906/cd9578effac6/12938_2025_1401_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd4d/12181906/03a9ac6909f5/12938_2025_1401_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd4d/12181906/106e6bd9724d/12938_2025_1401_Fig5_HTML.jpg
Fig. 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd4d/12181906/1ff28e29dded/12938_2025_1401_Fig6_HTML.jpg
Fig. 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd4d/12181906/c3e25ffbbb03/12938_2025_1401_Fig7_HTML.jpg
Fig. 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fd4d/12181906/b7fabc118cb2/12938_2025_1401_Fig8_HTML.jpg
