Cepstral Peak Sensitivity: A Theoretic Analysis and Comparison of Several Implementations.

Author information

Skowronski Mark D, Shrivastav Rahul, Hunter Eric J

Affiliations

Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan.

Publication information

J Voice. 2015 Nov;29(6):670-81. doi: 10.1016/j.jvoice.2014.11.005. Epub 2015 May 2.

Abstract

OBJECTIVE

The aim of this study was to develop a theoretic analysis of the cepstral peak (CP), to compare several CP software programs, and to propose methods for reducing variability in CP estimation.

STUDY DESIGN

Descriptive, experimental study.

METHODS

The theoretic CP value of a pulse train was derived and compared with estimates computed for pulse train WAV files using available CP software programs: (1) Hillenbrand's CP prominence (CPP) software (Western Michigan University, Kalamazoo, MI), (2) KayPENTAX (Montvale, NJ) Multi-Speech implementation of CPP, and (3) a MATLAB (The Mathworks, Natick, MA, version R2014a) implementation using cepstral interpolation. The CP variation was also investigated for synthetic breathy vowels.

RESULTS

For pulse trains with a period of T samples, the theoretic CP is 1/2 + ε/T, with |ε| < 0.1 for all pulse trains (ε = 0 for integer T). For fundamental frequencies between 70 and 230 Hz, the CP mean ± standard deviation was 0.496 ± 0.002 using cepstral interpolation and 0.29 ± 0.03 using Hillenbrand's software, whereas CPP was 35.0 ± 3.8 dB using Hillenbrand's software and 20.5 ± 2.7 dB using KayPENTAX's software. The CP and CPP versus signal-to-noise ratio for synthetic breathy vowels were fit to a logistic model for the Hillenbrand (R² = 0.92) and KayPENTAX (R² = 0.82) estimators as well as for an ideal estimator (R² = 0.98), which used a period-synchronous analysis.
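The limiting value of 1/2 for integer T can be checked numerically. The abstract does not give the derivation or the exact cepstrum definition, so the following is only a sketch under stated assumptions: a real cepstrum computed from the natural-log magnitude spectrum, and a pulse train whose pulses decay geometrically by a factor `decay` per period so that the log spectrum stays finite. For such a signal the real cepstrum at quefrency T equals `decay`/2, which approaches 1/2 as `decay` approaches 1. The function names and parameters are illustrative and are not taken from the paper or from any of the programs compared.

```python
import numpy as np


def real_cepstrum(x, nfft):
    """Real cepstrum: inverse FFT of the natural-log magnitude spectrum."""
    spectrum = np.fft.fft(x, nfft)
    return np.fft.ifft(np.log(np.abs(spectrum))).real


def damped_pulse_train(period, decay, n_pulses):
    """Unit pulse every `period` samples, scaled by decay**k (assumed model)."""
    x = np.zeros(period * n_pulses)
    x[::period] = decay ** np.arange(n_pulses)
    return x


def cepstral_peak(period, decay, n_pulses=300, nfft=2**15):
    """Value of the real cepstrum at quefrency `period` (integer T only)."""
    x = damped_pulse_train(period, decay, n_pulses)
    return real_cepstrum(x, nfft)[period]


if __name__ == "__main__":
    # The peak tracks decay / 2, so it tends toward 1/2 as decay nears 1.
    for decay in (0.5, 0.8, 0.9):
        print(decay, cepstral_peak(period=100, decay=decay))
```

The damping is a numerical convenience rather than part of the reported method: an undamped, untapered pulse train has spectral zeros on the unit circle, where the logarithm diverges. Non-integer periods, which produce the ε/T term, are not covered by this sketch.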

CONCLUSIONS

The findings indicate that several variables unrelated to the signal itself affect CP values, and that some of them introduce large variability that would otherwise be attributed to the signal (e.g., voice quality). Variability may be reduced by using a period-synchronous analysis with Hann windows.
