Nuance Communications, 1198 East Arques Avenue, Sunnyvale, California 94085, USA.
J Acoust Soc Am. 2013 Jul;134(1):572-85. doi: 10.1121/1.4809540.
This study reports a detailed analysis of incorrect responses from an open-set spoken word recognition experiment of 1428 words designed to be a random sample of the entire American English lexicon. The stimuli were presented in six-talker babble to 192 young, normal-hearing listeners at three signal-to-noise ratios (0, +5, and +10 dB). The results revealed several patterns: (1) errors tended to have a higher frequency of occurrence than did the corresponding target word, and frequency of occurrence of error responses was significantly correlated with target frequency of occurrence; (2) incorrect responses were close to the target words in terms of number of phonemes and syllables but had a mean edit distance of 3; (3) for syllables, substitutions were much more frequent than either deletions or additions; for phonemes, deletions were slightly more frequent than substitutions; both were more frequent than additions; and (4) for errors involving just a single segment, substitutions were more frequent than either deletions or additions. The raw data are being made available to other researchers as supplementary material to form the beginnings of a database of speech errors collected under controlled laboratory conditions.
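The edit-distance measure and the substitution/deletion/addition breakdown described above can be illustrated with a minimal sketch. It assumes a standard unit-cost Levenshtein alignment over phoneme sequences; the abstract does not state how the authors computed their distances, so the function name `edit_ops`, the ARPAbet-like phoneme labels, and the tie-breaking order in the backtrace are illustrative assumptions, not the study's method.

```python
def edit_ops(target, response):
    """Return (distance, substitutions, deletions, additions) for two
    phoneme sequences, using unit-cost Levenshtein alignment.

    Assumption: this is a generic textbook alignment, not necessarily
    the procedure used in the study.
    """
    n, m = len(target), len(response)
    # dist[i][j] = edit distance between target[:i] and response[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete every target phoneme
    for j in range(1, m + 1):
        dist[0][j] = j  # add every response phoneme
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if target[i - 1] == response[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # deletion from target
                dist[i][j - 1] + 1,         # addition in response
            )

    # Walk back through the table to count each operation type.
    # When several alignments are optimal, the diagonal move is preferred.
    subs = dels = adds = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if target[i - 1] == response[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                subs += cost
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels += 1
            i -= 1
        else:
            adds += 1
            j -= 1
    return dist[n][m], subs, dels, adds


# Hypothetical target/response pairs (not taken from the study's data).
single_substitution = edit_ops(["k", "ae", "t"], ["k", "ae", "p"])
single_deletion = edit_ops(["k", "ae", "t"], ["ae", "t"])
single_addition = edit_ops(["ae", "t"], ["k", "ae", "t"])
```

Under this sketch, an error "involving just a single segment" corresponds to a distance of 1, and the three counts indicate whether that segment was substituted, deleted, or added. The same function could be applied to syllable sequences by passing lists of syllables instead of phonemes.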