Titze I R, Liang H
Department of Speech Pathology and Audiology, University of Iowa, Iowa City.
J Speech Hear Res. 1993 Dec;36(6):1120-33. doi: 10.1044/jshr.3606.1120.
Voice perturbation measures, such as jitter and shimmer, depend on accurate extraction of fundamental frequency (Fo) and amplitude from various waveform types. The extraction method directly affects the accuracy of the measures, particularly if several waveform types (with or without formant structure) are under consideration and if noise and modulation are present in the signal. For frequency perturbation, high precision is defined here as the ability to extract Fo to within ±0.01% under conditions of noise and modulation. Three Fo-extraction methods and their software implementations are discussed and compared. The methods are cycle-to-cycle waveform matching, zero-crossing, and peak-picking. Interpolation between samples is added to make the extractions more accurate and reliable. The sensitivity of the methods to different parameters, such as sampling frequency, mean Fo, signal-to-noise ratio, frequency modulation, and amplitude modulation, is explored.
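To illustrate why interpolation between samples matters for the zero-crossing method, the following is a minimal Python sketch, not the authors' implementation. It finds upward zero crossings, places each crossing between two samples by linear interpolation, and derives periods, mean Fo, and a jitter figure from the crossing times. All function names are hypothetical, the linear interpolation is an assumption (the abstract does not say which interpolation scheme was used), and the jitter formula (mean absolute difference between consecutive periods, relative to the mean period) is one common definition, assumed here for illustration.

```python
import math


def sine_wave(f0, fs, n, phase=0.3):
    """Synthetic test tone; the phase offset keeps samples off exact zeros."""
    return [math.sin(2 * math.pi * f0 * i / fs + phase) for i in range(n)]


def upward_zero_crossings(x, fs):
    """Times (in seconds) of negative-to-positive crossings.

    Each crossing is located between samples i and i+1 by linear
    interpolation (an assumed scheme, chosen for simplicity).
    """
    times = []
    for i in range(len(x) - 1):
        a, b = x[i], x[i + 1]
        if a < 0.0 <= b:
            frac = -a / (b - a)  # fraction of a sample step past index i
            times.append((i + frac) / fs)
    return times


def periods(times):
    """Cycle durations between consecutive crossing times."""
    return [t1 - t0 for t0, t1 in zip(times, times[1:])]


def mean_f0(p):
    """Mean fundamental frequency as the reciprocal of the mean period."""
    return len(p) / sum(p)


def jitter_percent(p):
    """Mean absolute consecutive-period difference, as % of the mean period."""
    mean_p = sum(p) / len(p)
    diffs = [abs(b - a) for a, b in zip(p, p[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / mean_p


# Usage: a clean 100 Hz tone sampled at 20 kHz (hypothetical test values).
signal = sine_wave(f0=100.0, fs=20000.0, n=4000)
p = periods(upward_zero_crossings(signal, fs=20000.0))
print(mean_f0(p), jitter_percent(p))
```

Without the `frac` term, each crossing time would be quantized to a whole sample, so period estimates could be off by up to one sampling interval; that quantization error is what interpolation is meant to reduce. A noise-free sinusoid is the easiest case, so this sketch says nothing about behavior under the noise and modulation conditions the study examines.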