Wave collation visual speech display: design and evaluation.

Author information

Mitchell P A, Easton R D

Affiliation

Psychology Department, Boston College, Chestnut Hill, Massachusetts 02167.

Publication information

J Acoust Soc Am. 1995 Feb;97(2):1297-306. doi: 10.1121/1.412171.

DOI:10.1121/1.412171

PMID:7876449

Abstract

The wave collation display is a pitch-synchronous, time-domain visual speech display. Collation processing maps the speech waveform into a planar array, condensing the waveform and making speech information, including pitch contour and formant transitions, salient. Evaluation included both analytic evaluation and training. Analytic evaluation was based on a perceptual sorting task using untrained subjects. Subjects sorted printed speech display tokens by visual similarity in a match-to-exemplar design. Stimuli included vowels, with single speaker, multiple speakers, and multiple phonemic contexts, and voiceless consonants. Results for untrained subjects ranged from 73% correct (consonants) and 71% correct (vowels) for single speaker tokens to 46% correct (multiple speaker vowels). For comparison, analytic evaluation using spectrograms was also performed for vowels with single and multiple speakers. Overall results were statistically equivalent to the collation display, with 76% correct (single speaker vowels) and 44% correct (multiple speakers). In the training component, four subjects were trained on collation display sorting tasks as above; after mastering these tasks, generalization to novel stimuli was tested. The tasks were mastered in a few hours, and generalization to novel tokens from a familiar speaker was nearly perfect; generalization to unfamiliar speakers was imperfect.
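
The abstract gives only a high-level description of collation processing (a pitch-synchronous mapping of the waveform into a planar array). The following Python sketch shows one plausible reading of that idea, not the authors' published algorithm: it assumes pitch marks are already available, cuts the waveform at those marks, and stacks successive pitch periods as rows. The names `collate`, `period_lengths`, and `synthetic_voiced`, the zero padding, and the synthetic test signal are all illustrative assumptions.

```python
import math


def collate(samples, pitch_marks, fill=0.0):
    """Stack successive pitch periods of a waveform as rows of a 2-D list.

    Assumption: `pitch_marks` are strictly increasing sample indices that
    delimit pitch periods. Rows shorter than the longest period are
    right-padded with `fill` so the result is rectangular.
    """
    if any(a >= b for a, b in zip(pitch_marks, pitch_marks[1:])):
        raise ValueError("pitch_marks must be strictly increasing")
    periods = [samples[a:b] for a, b in zip(pitch_marks, pitch_marks[1:])]
    if not periods:
        return []
    width = max(len(p) for p in periods)
    return [list(p) + [fill] * (width - len(p)) for p in periods]


def period_lengths(pitch_marks):
    """Length of each pitch period in samples (a crude pitch contour)."""
    return [b - a for a, b in zip(pitch_marks, pitch_marks[1:])]


def synthetic_voiced(lengths):
    """Hypothetical test signal: one sine cycle per pitch period."""
    samples, marks = [], [0]
    for n in lengths:
        samples.extend(math.sin(2 * math.pi * k / n) for k in range(n))
        marks.append(len(samples))
    return samples, marks


# Four pitch periods whose length grows, i.e. a falling pitch.
samples, marks = synthetic_voiced([8, 8, 10, 12])
rows = collate(samples, marks)
lengths = period_lengths(marks)
```

In this arrangement each row holds one pitch period, so the unpadded row length tracks the pitch contour, and changes in waveform shape from row to row would reflect changes in the vocal-tract resonances. Whether the original display pads, scales, or aligns periods this way is not stated in the abstract.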
