Department of Electrical and Computer Engineering, University of Maryland College Park, College Park, Maryland 20742, USA.
Department of Communication Sciences and Disorders, University of Cincinnati, Cincinnati, Ohio 45267-0379, USA.
J Acoust Soc Am. 2024 Aug 1;156(2):1380-1390. doi: 10.1121/10.0028124.
For most of his illustrious career, Ken Stevens focused on examining and documenting the rich detail, available to listeners, about the vocal tract changes underlying the acoustic signal of speech. Current approaches to speech inversion take advantage of this rich detail to recover information about articulatory movement. Our previous speech inversion work focused on movements of the tongue and lips, for which "ground truth" is readily available. In this study, we describe the acquisition and validation of ground-truth articulatory data about velopharyngeal port constriction, using both the well-established measure of nasometry and a novel technique, high-speed nasopharyngoscopy. Nasometry measures the acoustic output of the nasal and oral cavities to derive the measure nasalance. High-speed nasopharyngoscopy captures images of the nasopharyngeal region and can resolve velar motion during speech. By comparing simultaneously collected data from both acquisition modalities, we show that nasalance is a sufficiently sensitive measure to use as ground truth for our speech inversion system. Further, a speech inversion system trained on nasalance can recover known patterns of velopharyngeal port constriction shown by American English speakers. Our findings match well with Stevens' own studies of the acoustics of nasal consonants.
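As an illustration of the nasalance measure mentioned above, the following is a minimal sketch that assumes the commonly used definition: the nasal signal level divided by the sum of the nasal and oral signal levels, expressed as a percentage. The abstract does not state the exact computation used in the study, so the function names, the use of RMS amplitude, and the frame length are hypothetical choices for illustration only.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of a non-empty list of samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def nasalance(nasal, oral, frame_len=4):
    """Frame-wise nasalance in percent.

    Assumed definition (not taken from the study):
        nasalance = 100 * nasal_rms / (nasal_rms + oral_rms)
    computed over consecutive, non-overlapping frames. `nasal` and `oral`
    are equal-length sample sequences from the two microphone channels.
    """
    if len(nasal) != len(oral):
        raise ValueError("nasal and oral channels must have equal length")
    scores = []
    for start in range(0, len(nasal) - frame_len + 1, frame_len):
        n = rms(nasal[start:start + frame_len])
        o = rms(oral[start:start + frame_len])
        total = n + o
        # A silent frame has no defined ratio; report 0.0 by convention here.
        scores.append(100.0 * n / total if total > 0 else 0.0)
    return scores
```

Under this assumed definition, a frame with equal nasal and oral levels scores 50, while a frame with no nasal output scores 0, which is the direction expected when the velopharyngeal port is closed.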