Cepstral Peak Sensitivity: A Theoretic Analysis and Comparison of Several Implementations.

Authors

Skowronski Mark D, Shrivastav Rahul, Hunter Eric J

Affiliation

Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan.

Publication

J Voice. 2015 Nov;29(6):670-81. doi: 10.1016/j.jvoice.2014.11.005. Epub 2015 May 2.

DOI:10.1016/j.jvoice.2014.11.005

PMID:25944288

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4630216/

Abstract

OBJECTIVE

The aim of this study was to develop a theoretic analysis of the cepstral peak (CP), to compare several CP software programs, and to propose methods for reducing variability in CP estimation.

STUDY DESIGN

Descriptive, experimental study.

METHODS

The theoretic CP value of a pulse train was derived and compared with estimates computed for pulse train WAV files using available CP software programs: (1) Hillenbrand's CP prominence (CPP) software (Western Michigan University, Kalamazoo, MI), (2) KayPENTAX (Montvale, NJ) Multi-Speech implementation of CPP, and (3) a MATLAB (The Mathworks, Natick, MA, version R2014a) implementation using cepstral interpolation. The CP variation was also investigated for synthetic breathy vowels.

RESULTS

For pulse trains with period T samples, the theoretic CP is 1/2 + ε/T, where |ε| < 0.1 for all pulse trains (ε = 0 for integer T). For fundamental frequencies between 70 and 230 Hz, the CP mean ± standard deviation was 0.496 ± 0.002 using cepstral interpolation and 0.29 ± 0.03 using Hillenbrand's software, whereas CPP was 35.0 ± 3.8 dB using Hillenbrand's software and 20.5 ± 2.7 dB using KayPENTAX's software. The CP and CPP versus signal-to-noise ratio for synthetic breathy vowels were fit to a logistic model for the Hillenbrand (R² = 0.92) and KayPENTAX (R² = 0.82) estimators as well as an ideal estimator (R² = 0.98), which used a period-synchronous analysis.
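The value 1/2 for integer T can be checked numerically. The sketch below is not the authors' derivation or any of the compared programs. It assumes a frame of M >= 2 unit pulses spaced T samples apart with no tapering window, a natural-log magnitude spectrum, and a continuous-frequency real cepstrum. Under those assumptions the magnitude spectrum is |sin(Mφ/2) / sin(φ/2)| with φ = ωT, and the cepstrum at quefrency T reduces to (1/2π) ∫ ln|sin(Mφ/2) / sin(φ/2)| cos φ dφ over one period of φ, which no longer depends on T. The function name and the grid size are illustrative choices.

```python
import math


def pulse_train_cp(num_pulses: int, grid_points: int = 1 << 17) -> float:
    """Approximate the real cepstrum at quefrency T for a train of unit pulses.

    Assumptions (not taken from the paper): rectangular frame holding
    `num_pulses` pulses with integer period T, natural logarithm, and a
    midpoint-rule integral standing in for the continuous-frequency cepstrum.
    """
    if num_pulses < 2:
        raise ValueError("need at least two pulses to define a period")
    total = 0.0
    for k in range(grid_points):
        # Midpoint grid over one period of phi = omega * T. With a
        # power-of-two grid size the points avoid the spectral zeros.
        phi = 2.0 * math.pi * (k + 0.5) / grid_points
        magnitude = abs(math.sin(num_pulses * phi / 2.0) / math.sin(phi / 2.0))
        total += math.log(magnitude) * math.cos(phi)
    # (1 / 2 pi) * integral  ~=  mean of the integrand over the grid
    return total / grid_points


if __name__ == "__main__":
    for m in (2, 4, 5):
        print(m, round(pulse_train_cp(m), 4))
```

Under these assumptions the result should sit very close to 0.5 for any M >= 2. Practical estimators add windowing, a finite DFT length, and peak picking on a sampled quefrency axis, which is where implementations can diverge from this value.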

CONCLUSIONS

The findings indicate that several variables unrelated to the signal itself impact CP values, with some factors introducing large variability in CP values that would otherwise be attributed to the signal (e.g., voice quality). Variability may be reduced by using a period-synchronous analysis with Hann windows.
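One common reading of "period-synchronous analysis with Hann windows" is that each analysis frame spans a whole number of pitch periods before the Hann taper is applied. The sketch below illustrates only that framing step. It assumes the fundamental frequency is already known from a separate estimator, and the function name, the default of four periods per frame, and the periodic (DFT-even) Hann variant are all hypothetical choices that the abstract does not specify.

```python
import math
from typing import List


def period_synchronous_hann(sample_rate_hz: float, f0_hz: float,
                            num_periods: int = 4) -> List[float]:
    """Return a Hann window whose length is a whole number of pitch periods.

    Hypothetical helper: `num_periods` and the periodic Hann form are
    illustrative assumptions, not values taken from the study.
    """
    if f0_hz <= 0 or sample_rate_hz <= 0 or num_periods < 1:
        raise ValueError("sample rate, f0 and num_periods must be positive")
    period_samples = sample_rate_hz / f0_hz           # may be non-integer
    frame_len = round(num_periods * period_samples)   # nearest whole sample
    # Periodic Hann: w[n] = 0.5 * (1 - cos(2*pi*n / N)), n = 0 .. N-1
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / frame_len))
            for n in range(frame_len)]


if __name__ == "__main__":
    window = period_synchronous_hann(44100.0, 100.0)
    print(len(window))  # 4 periods of 441 samples each
```

The frame length then follows the speaker's pitch instead of a fixed duration, so the number of periods under the window stays constant across fundamental frequencies.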
