Arias-Londoño Julián David, Godino-Llorente Juan I, Markaki Maria, Stylianou Yannis
Circuits & Systems Engineering, EUIT de Telecomunicación, Universidad Politécnica de Madrid, Ctra. Valencia, km 7, Madrid 28031, Spain.
Logoped Phoniatr Vocol. 2011 Jul;36(2):60-9. doi: 10.3109/14015439.2010.528788. Epub 2010 Nov 12.
This work presents a novel approach for the automatic detection of pathological voices based on fusing the information extracted by means of mel-frequency cepstral coefficients (MFCC) with features derived from the modulation spectra (MS). The proposed system uses a two-step classification scheme. First, the MFCC and MS features were used to feed two different, independent classifiers; then the outputs of each classifier were used in a second classification stage. In order to establish the configuration that provides the highest detection accuracy, the fusion of information was carried out employing different classifier combination strategies. The experiments were carried out using two different databases: one developed by the Massachusetts Eye and Ear Infirmary Voice Laboratory, and one recorded by the Universidad Politécnica de Madrid. The results show that combining MFCC and MS features with the proposed approach yields an improvement in detection accuracy, demonstrating that the two parameterization methods are complementary.
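For illustration, the following is a minimal sketch of the two-step fusion scheme described above, written in Python with scikit-learn (an assumption; the paper does not specify an implementation). The feature matrices are random placeholders standing in for MFCC and MS features extracted from voice recordings, and the classifier choices (first-stage SVMs, a logistic-regression fusion stage) are illustrative rather than the authors' exact configuration.

# Sketch of a two-step fusion scheme: one classifier per feature set,
# then a second-stage classifier trained on their outputs.
# Feature values are random placeholders, not real MFCC/MS measurements.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200                                          # number of voice recordings
y = rng.integers(0, 2, n)                        # 0 = normal, 1 = pathological
X_mfcc = rng.normal(size=(n, 13)) + y[:, None]   # placeholder MFCC features
X_ms = rng.normal(size=(n, 20)) + y[:, None]     # placeholder MS features

idx_tr, idx_te = train_test_split(np.arange(n), test_size=0.3, random_state=0)

# Step 1: two independent classifiers, one per parameterization.
clf_mfcc = SVC(probability=True).fit(X_mfcc[idx_tr], y[idx_tr])
clf_ms = SVC(probability=True).fit(X_ms[idx_tr], y[idx_tr])

# Step 2: fuse the first-stage outputs (class posteriors) with a second classifier.
def stage1_outputs(idx):
    return np.column_stack([
        clf_mfcc.predict_proba(X_mfcc[idx])[:, 1],
        clf_ms.predict_proba(X_ms[idx])[:, 1],
    ])

fusion = LogisticRegression().fit(stage1_outputs(idx_tr), y[idx_tr])
acc = fusion.score(stage1_outputs(idx_te), y[idx_te])
print(f"Fused detection accuracy on held-out data: {acc:.2f}")

In practice the fusion stage would be trained on out-of-fold first-stage outputs to avoid an optimistic bias; the abstract indicates that several classifier combination strategies were compared to find the best-performing configuration.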