Skowronski Mark D, Shrivastav Rahul, Hunter Eric J
Department of Communicative Sciences and Disorders, Michigan State University, East Lansing, Michigan.
J Voice. 2015 Nov;29(6):670-81. doi: 10.1016/j.jvoice.2014.11.005. Epub 2015 May 2.
The aim of this study was to develop a theoretic analysis of the cepstral peak (CP), to compare several CP software programs, and to propose methods for reducing variability in CP estimation.
Descriptive, experimental study.
The theoretic CP value of a pulse train was derived and compared with estimates computed for pulse train WAV files using available CP software programs: (1) Hillenbrand's CP prominence (CPP) software (Western Michigan University, Kalamazoo, MI), (2) KayPENTAX (Montvale, NJ) Multi-Speech implementation of CPP, and (3) a MATLAB (The Mathworks, Natick, MA, version R2014a) implementation using cepstral interpolation. The CP variation was also investigated for synthetic breathy vowels.
For pulse trains with a period of T samples, the theoretic CP is 1/2 + ε/T, where |ε| < 0.1 for all pulse trains (ε = 0 for integer T). For fundamental frequencies between 70 and 230 Hz, the CP mean ± standard deviation was 0.496 ± 0.002 using cepstral interpolation and 0.29 ± 0.03 using Hillenbrand's software, whereas CPP was 35.0 ± 3.8 dB using Hillenbrand's software and 20.5 ± 2.7 dB using KayPENTAX's software. The CP and CPP versus signal-to-noise ratio for synthetic breathy vowels were fit to a logistic model for the Hillenbrand (R² = 0.92) and KayPENTAX (R² = 0.82) estimators as well as for an ideal estimator (R² = 0.98), which used a period-synchronous analysis.
The findings indicate that several variables unrelated to the signal itself affect CP values, and some of these factors introduce large variability in CP values that would otherwise be attributed to the signal (e.g., voice quality). Variability may be reduced by using a period-synchronous analysis with Hann windows.
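As a rough numerical check on the theoretic value, the sketch below computes the real cepstrum (inverse DFT of the natural-log magnitude spectrum) of a decaying pulse train. It is not the authors' MATLAB implementation, nor a reproduction of any of the compared programs. The decay factor `decay`, the helper names, and the naive standard-library DFT are assumptions made for illustration. For a pulse train with z-transform 1/(1 - a z^(-T)), the usual power-series expansion of the logarithm places a cepstral peak of a/2 at quefrency T, which tends to 1/2 as a approaches 1, in line with the theoretic CP reported for integer T.

```python
import cmath
import math


def dft(x, sign=-1):
    """Naive O(N^2) DFT (sign=-1) or unnormalized inverse DFT (sign=+1)."""
    n = len(x)
    # Precompute the n-th roots of unity once and index them modulo n.
    w = [cmath.exp(sign * 2j * math.pi * m / n) for m in range(n)]
    return [sum(x[t] * w[(k * t) % n] for t in range(n)) for k in range(n)]


def real_cepstrum(x):
    """Real cepstrum: inverse DFT of the natural-log magnitude spectrum."""
    n = len(x)
    log_mag = [math.log(abs(v)) for v in dft(x)]
    return [v.real / n for v in dft(log_mag, sign=+1)]


def decaying_pulse_train(period, n_periods, decay):
    """Pulses of height decay**m at samples m*period (hypothetical test signal).

    A decay below 1 keeps the magnitude spectrum strictly positive, so the
    logarithm is defined at every DFT bin.
    """
    x = [0.0] * (period * n_periods)
    for m in range(n_periods):
        x[m * period] = decay ** m
    return x


def cepstral_peak(x):
    """Return (quefrency, value) of the largest cepstral sample for q >= 1."""
    c = real_cepstrum(x)
    q = max(range(1, len(x) // 2 + 1), key=lambda i: c[i])
    return q, c[q]


if __name__ == "__main__":
    for period, n_periods, decay in [(8, 64, 0.5), (4, 128, 0.8)]:
        x = decaying_pulse_train(period, n_periods, decay)
        q, peak = cepstral_peak(x)
        # The peak is expected at q == period with a value close to decay / 2.
        print(period, decay, q, peak)
```

The signals are kept short (512 samples) so the naive DFT stays fast; an FFT routine would be the usual choice for real recordings. No window is applied here, so this sketch says nothing about the windowing and interpolation effects that the study compares.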