
A Voice Disease Detection Method Based on MFCCs and Shallow CNN.

Authors

Xie Xiaoping, Cai Hao, Li Can, Wu Yu, Ding Fei

Affiliations

The State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China; Shenzhen Research Institute of Hunan University, Shenzhen, China.

The State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University, Changsha, China.

Publication information

J Voice. 2023 Oct 25. doi: 10.1016/j.jvoice.2023.09.024.

DOI: 10.1016/j.jvoice.2023.09.024
PMID: 37891129
Abstract

The incidence of voice diseases is rising year by year. Software-based remote diagnosis is a growing technical trend with important practical value. Common voice diseases that cause hoarseness include spasmodic dysphonia, vocal cord paralysis, vocal nodules, and vocal cord polyps. This paper presents a voice disease detection method intended for use in a wide range of clinical settings. In cooperation with Xiangya Hospital of Central South University, we collected voice samples from 352 different patients. Mel-frequency cepstral coefficients (MFCCs) are extracted as input features that describe the voice numerically. We propose a model that combines MFCC features with a CNN containing a single convolutional layer, for fast computation and classification. The highest accuracy we achieved was 92%, which surpasses the earlier research results and is at an internationally advanced level. We also evaluated the generalization ability of the proposed method on the Advanced Voice Function Assessment Databases (AVFAD), where it reached an accuracy of 98%. Experiments on clinical and standard datasets show that, for the pathological detection of voice diseases, our method substantially improves both accuracy and computational efficiency.
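The pipeline summarized in the abstract (MFCC features fed to a CNN with one convolutional layer) can be illustrated with a small, dependency-free sketch. Everything specific here is an assumption for illustration and not taken from the paper: the frame length, hop size, filter count, number of coefficients, kernel width, the ReLU and global max pooling steps, and the random, untrained weights. The function names (`mfcc`, `shallow_cnn_forward`) are hypothetical, and a practical system would likely use an audio library and a deep learning framework instead of pure Python.

```python
import math
import random

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        # Log mel-band energies; the floor avoids log(0) for empty bands.
        log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                 for row in bank]
        # Unnormalized DCT-II decorrelates the log energies into cepstral coefficients.
        frames.append([sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                           for j, e in enumerate(log_e))
                       for c in range(n_coeffs)])
    return frames

def shallow_cnn_forward(features, kernels, dense_w, dense_b):
    """Single 1-D convolution over time, ReLU, global max pool, linear layer, softmax."""
    pooled = []
    for kernel in kernels:
        width = len(kernel)
        responses = []
        for t in range(len(features) - width + 1):
            acc = sum(kernel[w][c] * features[t + w][c]
                      for w in range(width) for c in range(len(kernel[w])))
            responses.append(max(acc, 0.0))   # ReLU
        pooled.append(max(responses))         # global max pooling over time
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(dense_w, dense_b)]
    peak = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Usage on a synthetic tone (a stand-in for a recorded voice sample).
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(1024)]
feats = mfcc(tone, 8000)
kernels = [[[rng.uniform(-0.1, 0.1) for _ in range(13)] for _ in range(3)]
           for _ in range(4)]                 # 4 untrained kernels, 3 frames wide
dense_w = [[rng.uniform(-0.1, 0.1) for _ in range(4)] for _ in range(2)]
probs = shallow_cnn_forward(feats, kernels, dense_w, [0.0, 0.0])
print(len(feats), len(feats[0]), probs)
```

Because the weights are random, the two output probabilities carry no diagnostic meaning. The sketch only shows how per-frame MFCC vectors flow through one convolutional layer into a two-class (for example healthy versus pathological) output.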

