Espy-Wilson C Y, Boyce S E, Jackson M, Narayanan S, Alwan A
Electrical and Computer Engineering Department, Boston University, Massachusetts 02215, USA.
J Acoust Soc Am. 2000 Jul;108(1):343-56. doi: 10.1121/1.429469.
Recent advances in physiological data collection methods have made it possible to test the accuracy of model predictions against speaker-specific vocal tracts and acoustic patterns. Vocal tract dimensions for /r/ derived via magnetic-resonance imaging (MRI) for two speakers of American English [Alwan, Narayanan, and Haker, J. Acoust. Soc. Am. 101, 1078-1089 (1997)] were used to construct models of the acoustics of /r/. Because previous models have not sufficiently accounted for the very low F3 characteristic of /r/, the aim was to match formant frequencies predicted by the models to the full range of formant frequency values produced by the speakers in recordings of real words containing /r/. In one set of experiments, area functions derived from MRI data were used to argue that the Perturbation Theory of tube acoustics cannot adequately account for /r/, primarily because the constriction locations it predicts did not match the speakers' actual constriction locations. Different models of the acoustics of /r/ were tested using the Maeda computer simulation program [Maeda, Speech Commun. 1, 199-229 (1982)]; the supralingual vocal-tract dimensions reported in Alwan et al. were found to be adequate for predicting only the highest of the attested F3 values. The mid-to-low values of the speakers' F3 range were matched by (1) using a recently developed adaptation of the Maeda model that incorporates the sublingual space as a side branch from the front cavity, and (2) including the sublingual space as an increment to the dimensions of the front cavity. Finally, a simple tube model with dimensions derived from MRI data was developed to account for cavity affiliations. This model confirmed F3 as a front-cavity resonance, and variations in F1, F2, and F4 as arising from mid- and back-cavity geometries. Possible trading relations for F3 lowering, based on different acoustic mechanisms for extending the front cavity, are also proposed.
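The link between front-cavity size and F3 can be illustrated with a minimal sketch. It assumes the front cavity behaves as a uniform tube closed at one end and open at the other (a quarter-wavelength resonator), which is a textbook simplification and not the Maeda simulation or the tube model used in the study. The speed of sound, the cavity lengths, the cross-sectional area, and the function names are all illustrative assumptions, not values from the MRI data.

```python
# Assumed speed of sound in warm, moist air (cm/s); not a value from the study.
SPEED_OF_SOUND_CM_PER_S = 35000.0


def quarter_wave_resonance(length_cm, c=SPEED_OF_SOUND_CM_PER_S):
    """Lowest resonance (Hz) of a uniform tube closed at one end and open at the other."""
    if length_cm <= 0:
        raise ValueError("length_cm must be positive")
    return c / (4.0 * length_cm)


def effective_length(length_cm, added_volume_cm3, area_cm2):
    """Treat extra volume (e.g. a sublingual space) as added length of a uniform tube.

    This is the crudest form of the "increment to the front cavity" idea;
    a side-branch treatment would need a full transmission-line model.
    """
    if area_cm2 <= 0:
        raise ValueError("area_cm2 must be positive")
    return length_cm + added_volume_cm3 / area_cm2


if __name__ == "__main__":
    # Hypothetical front-cavity length, added volume, and area.
    base_length = 4.0
    longer = effective_length(base_length, added_volume_cm3=2.0, area_cm2=2.0)
    print(f"front cavity {base_length:.1f} cm: {quarter_wave_resonance(base_length):.0f} Hz")
    print(f"front cavity {longer:.1f} cm: {quarter_wave_resonance(longer):.0f} Hz")
```

Under these assumptions, any mechanism that lengthens the front cavity lowers its resonance, which is the qualitative basis for the trading relations mentioned in the abstract.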