A Two-Stage Cepstral Analysis Procedure for the Classification of Rough Voices.

Affiliations

Department of Communication Sciences & Disorders, Bloomsburg University of Pennsylvania, Bloomsburg, Pennsylvania.

Department of Statistics, Penn State University, University Park, Pennsylvania.

Publication information

J Voice. 2020 Jan;34(1):9-19. doi: 10.1016/j.jvoice.2018.07.003. Epub 2018 Nov 1.

DOI:10.1016/j.jvoice.2018.07.003

PMID:30391019

Abstract

OBJECTIVES

The objective of this study was to investigate whether a two-stage method of cepstral peak identification can effectively discriminate rough vs breathy vs typical voice in sustained vowel productions. It was hypothesized that a dual-stage search for cepstral peak prominences (CPPs) above and below specified quefrency/F0 cutoffs would yield a CPP difference characteristic of the rough, diplophonic voice type.

METHODOLOGY

Central one-second portions of sustained vowel /a/ productions were obtained from 90 subjects (rough, breathy, and normophonic voices). All voice samples were analyzed using a two-stage cepstral analysis process in which a CPP difference value was obtained by identifying cepstral peaks above and below a lower limit for expected F0 (150 Hz for females and 90 Hz for males), called CPP and CPP, respectively.
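The two-stage search described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`cepstrum_db`, `cpp_in_band`, `cpp_difference`), the labels `cpp_f0` and `cpp_sub`, the Hann window, the regression-line fit range, and the reuse of the 60 Hz and 300 Hz limits as outer search bounds are all hypothetical choices. Only the sex-specific cutoffs (150 Hz and 90 Hz) come from the abstract.

```python
import numpy as np


def cepstrum_db(signal, fs):
    """Return (quefrency in s, cepstrum in dB) for one frame (assumed recipe)."""
    x = np.asarray(signal, dtype=float)
    x = (x - x.mean()) * np.hanning(len(x))          # remove DC, taper edges
    n_fft = int(2 ** np.ceil(np.log2(len(x))))       # zero-pad to a power of two
    log_spec = 20.0 * np.log10(np.abs(np.fft.fft(x, n_fft)) + 1e-12)
    ceps = np.abs(np.fft.ifft(log_spec))             # spectrum of the log spectrum
    half = n_fft // 2
    return np.arange(half) / fs, 20.0 * np.log10(ceps[:half] + 1e-12)


def cpp_in_band(quef, ceps_db, f_lo, f_hi, fit_range=(0.001, 1.0 / 60.0)):
    """Height (dB) of the largest cepstral peak with f_lo <= 1/quefrency < f_hi,
    measured above a straight line fitted to the cepstrum over fit_range."""
    fit = (quef >= fit_range[0]) & (quef <= fit_range[1])
    slope, intercept = np.polyfit(quef[fit], ceps_db[fit], 1)
    band = np.flatnonzero((quef > 1.0 / f_hi) & (quef <= 1.0 / f_lo))
    peak = band[np.argmax(ceps_db[band])]
    return ceps_db[peak] - (slope * quef[peak] + intercept)


def cpp_difference(quef, ceps_db, sex, f_min=60.0, f_max=300.0):
    """CPP in the expected-F0 region minus CPP in the region below the cutoff."""
    cutoff = 150.0 if sex == "female" else 90.0      # cutoffs from the abstract
    cpp_f0 = cpp_in_band(quef, ceps_db, cutoff, f_max)   # stage 1: expected F0
    cpp_sub = cpp_in_band(quef, ceps_db, f_min, cutoff)  # stage 2: subharmonic
    return cpp_f0 - cpp_sub
```

Under this sign convention, calling `cpp_difference(*cepstrum_db(frame, fs), sex="female")` on a one-second vowel frame would return a negative value whenever the peak below the cutoff is more prominent than the peak in the expected F0 region.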

RESULTS

The CPP difference value was observed to be a highly significant predictor, with negative values for this parameter characteristic of a dominant subharmonic in the voice signal and the perception of diplophonic, rough voice. Correct classification of rough vs nonrough voice samples was 82.2% (sensitivity 0.80 and specificity 0.833). For three-group classification (breathy vs normophonic vs rough), models incorporating two predictors (the CPP obtained from a single search through a 60 to 300 Hz frequency range (CPP) and the CPP difference value) correctly classified 78.88% of the voice samples.
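As a rough consistency check on these figures, the short calculation below relates sensitivity, specificity, and overall accuracy. The group sizes (30 rough and 60 nonrough samples out of 90) are an assumption made for illustration; the abstract gives only the total of 90 subjects.

```python
# Assumed group sizes (NOT stated in the abstract): 30 rough, 60 nonrough.
n_rough, n_nonrough = 30, 60
sensitivity, specificity = 0.80, 0.833  # values reported in the abstract

true_pos = round(sensitivity * n_rough)      # rough samples classified as rough
true_neg = round(specificity * n_nonrough)   # nonrough samples classified as nonrough
accuracy = (true_pos + true_neg) / (n_rough + n_nonrough)

print(f"{true_pos} + {true_neg} of {n_rough + n_nonrough} correct -> {accuracy:.1%}")
```

Under that assumed split, the reported sensitivity and specificity reproduce the reported 82.2% overall correct classification.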

CONCLUSIONS

Rough, diplophonic voices were consistently observed to have a subharmonic peak that was greater in amplitude than the cepstral peak obtained within the region of the expected F0, resulting in a negative value for the CPP difference. The two-stage cepstral analysis process described herein is visually intuitive from the graphical display of a cepstrum and is a simple extended calculation derived from cepstral analysis procedures that have been recommended as essential in the acoustic description of vocal quality.

Similar articles

Predictive value and discriminant capacity of cepstral- and spectral-based measures during continuous speech.

J Voice. 2013 Jul;27(4):393-400. doi: 10.1016/j.jvoice.2013.02.005. Epub 2013 May 16.

Investigating the cepstral acoustic characteristics of voice in healthy children.

Int J Pediatr Otorhinolaryngol. 2021 Sep;148:110815. doi: 10.1016/j.ijporl.2021.110815. Epub 2021 Jun 29.

Auditory Perception of Roughness and Breathiness by Dysphonic Women.

J Voice. 2024 Sep;38(5):1249.e1-1249.e18. doi: 10.1016/j.jvoice.2022.01.005. Epub 2022 Jan 23.

Use of cepstral analysis for differentiating dysphonic from normal voices in children.

Int J Pediatr Otorhinolaryngol. 2019 Jan;116:107-113. doi: 10.1016/j.ijporl.2018.10.029. Epub 2018 Oct 23.

Perceptual and Quantitative Assessment of Dysphonia Across Vowel Categories.

J Voice. 2019 Jul;33(4):473-481. doi: 10.1016/j.jvoice.2017.12.018. Epub 2018 May 24.

Acoustic and Perceptual Classification of Within-sample Normal, Intermittently Dysphonic, and Consistently Dysphonic Voice Types.

J Voice. 2017 Mar;31(2):218-228. doi: 10.1016/j.jvoice.2016.04.016. Epub 2016 May 27.

Validation of the Cepstral Spectral Index of Dysphonia (CSID) as a Screening Tool for Voice Disorders: Development of Clinical Cutoff Scores.

J Voice. 2016 Mar;30(2):130-44. doi: 10.1016/j.jvoice.2015.04.009. Epub 2015 Sep 8.

A Comparison of Cepstral Peak Prominence Measures From Two Acoustic Analysis Programs.

J Voice. 2017 May;31(3):387.e1-387.e10. doi: 10.1016/j.jvoice.2016.09.012. Epub 2016 Oct 15.

Exploring the relationship between spectral and cepstral measures of voice and the Voice Handicap Index (VHI).

J Voice. 2014 Jul;28(4):430-9. doi: 10.1016/j.jvoice.2013.12.008. Epub 2014 Mar 31.

Cited by

A multivariate model incorporating subharmonic measurements for evaluating vocal roughness.

NPJ Digit Med. 2025 May 20;8(1):295. doi: 10.1038/s41746-025-01702-2.

Acoustic estimation of voice roughness.

Atten Percept Psychophys. 2025 Apr 28. doi: 10.3758/s13414-025-03060-3.

Breathy Vocal Quality, Background Noise, and Hearing Loss: How Do These Adverse Conditions Affect Speech Perception by Older Adults?

Ear Hear. 2025;46(2):474-482. doi: 10.1097/AUD.0000000000001599. Epub 2024 Nov 4.

[Current methods of acoustic analysis of voice: a review].

Lin Chuang Er Bi Yan Hou Tou Jing Wai Ke Za Zhi. 2022 Dec;36(12):966-970;976. doi: 10.13201/j.issn.2096-7993.2022.12.016.

Interactions Between Breathy and Rough Voice Qualities and Their Contributions to Overall Dysphonia Severity.

J Speech Lang Hear Res. 2022 Nov 17;65(11):4071-4084. doi: 10.1044/2022_JSLHR-22-00012. Epub 2022 Oct 19.

Predicting Perceived Vocal Roughness Using a Bio-Inspired Computational Model of Auditory Temporal Envelope Processing.

J Speech Lang Hear Res. 2022 Aug 17;65(8):2748-2758. doi: 10.1044/2022_JSLHR-22-00101. Epub 2022 Jul 22.

[Detection of speech pathology based on parameters of analysis of dysphonia in speech and voice].

Lin Chuang Er Bi Yan Hou Tou Jing Wai Ke Za Zhi. 2022 Jul;36(7):492-496. doi: 10.13201/j.issn.2096-7993.2022.07.002.

Perceptual and Acoustic Assessment of Strain Using Synthetically Modified Voice Samples.

J Speech Lang Hear Res. 2020 Dec 14;63(12):3897-3908. doi: 10.1044/2020_JSLHR-20-00294. Epub 2020 Nov 5.

Cepstral Peak Prominence Values for Clinical Voice Evaluation.

Am J Speech Lang Pathol. 2020 Aug 4;29(3):1596-1607. doi: 10.1044/2020_AJSLP-20-00001. Epub 2020 Jul 13.
