An Experimental Analysis on Multicepstral Projection Representation Strategies for Dysphonia Detection.

Affiliations

Department of Computer Science and Statistics, Institute of Biosciences, Letters and Exact Sciences, São Paulo State University, São José do Rio Preto 15054-000, SP, Brazil.

Federal Institute of São Paulo, São José do Rio Preto 15030-070, SP, Brazil.

Publication information

Sensors (Basel). 2023 May 30;23(11):5196. doi: 10.3390/s23115196.

DOI:10.3390/s23115196

PMID:37299922

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10256083/

Abstract

Biometrics-based authentication has become the most well-established form of user recognition in systems that demand a certain level of security. Commonplace examples include access to the workplace or to one's own bank account. Among all biometrics, voice receives special attention due to factors such as ease of collection, the low cost of reading devices, and the large body of literature and software packages available. However, the ability of this biometric to represent the individual may be impaired by dysphonia, a change in the sound signal caused by a disease acting on the vocal apparatus. As a consequence, a user with the flu, for example, may not be properly authenticated by the recognition system. It is therefore important to develop automatic voice dysphonia detection techniques. In this work, we propose a new framework that represents the voice signal through multiple projections of cepstral coefficients to support the detection of dysphonic alterations in the voice using machine learning techniques. Most of the best-known cepstral coefficient extraction techniques in the literature are mapped and analyzed, separately and together with measures related to the fundamental frequency of the voice signal, and their representation capacity is evaluated on three classifiers. Finally, experiments on a subset of the Saarbruecken Voice Database show the effectiveness of the proposed approach in detecting the presence of dysphonia in the voice.
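The abstract does not say which cepstral extractors, projections, or fundamental-frequency measures the framework uses, so the following is only a rough sketch of the general idea of pairing cepstral coefficients with an F0 measure for one voice frame. It assumes a plain real cepstrum and an autocorrelation-based F0 estimate; the function names, the 13-coefficient cutoff, the 60-400 Hz search range, and the synthetic test tone are all hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np


def real_cepstrum(frame: np.ndarray, n_coeffs: int = 13) -> np.ndarray:
    """First n_coeffs real-cepstrum coefficients of one frame (illustrative)."""
    windowed = frame * np.hamming(len(frame))
    magnitude = np.abs(np.fft.rfft(windowed))
    # Small offset avoids log(0) on empty frequency bins.
    log_magnitude = np.log(magnitude + 1e-10)
    cepstrum = np.fft.irfft(log_magnitude)
    return cepstrum[:n_coeffs]


def autocorr_f0(frame: np.ndarray, sample_rate: int,
                fmin: float = 60.0, fmax: float = 400.0) -> float:
    """Crude F0 estimate: strongest autocorrelation peak in [fmin, fmax]."""
    centered = frame - frame.mean()
    # Keep only non-negative lags of the full autocorrelation.
    ac = np.correlate(centered, centered, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return sample_rate / lag


def frame_features(frame: np.ndarray, sample_rate: int) -> np.ndarray:
    """Concatenate cepstral coefficients with the F0 estimate."""
    return np.concatenate([real_cepstrum(frame), [autocorr_f0(frame, sample_rate)]])


# Hypothetical input: a synthetic 200 Hz tone with one harmonic, 0.1 s at 16 kHz.
sample_rate = 16000
t = np.arange(1600) / sample_rate
frame = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)
features = frame_features(frame, sample_rate)
print(features.shape, features[-1])
```

A vector like `features` could then be passed to any standard classifier. Real recordings would need framing, voicing detection, and aggregation across frames, none of which this sketch attempts.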
