Detmer WM, Shiffman S, Wyatt JC, Friedman CP, Lane CD, Fagan LM
Section on Medical Informatics, Stanford University School of Medicine, CA 94305-5479.
J Am Med Inform Assoc. 1995 Jan-Feb;2(1):46-57. doi: 10.1136/jamia.1995.95202548.
To evaluate the performance of a continuous-speech interface to a decision support system.
The authors performed a prospective evaluation of a speech interface that matches unconstrained utterances of physicians with controlled-vocabulary terms from Quick Medical Reference (QMR). The performance of the speech interface was assessed in two stages. In the first stage, a real-time experiment, physician subjects viewed audiovisual stimuli intended to evoke clinical findings, spoke a description of each finding into the speech interface, and then chose, from a list generated by the interface, the QMR term that most closely matched the finding. Subjects believed that the speech recognizer decoded their utterances; in reality, a hidden experimenter typed the utterances into the interface (a Wizard-of-Oz experimental design). In the second stage, the authors replayed the same utterances through the speech recognizer and measured how accurately the utterances were matched with appropriate QMR terms, using the results of the real-time experiment as the "gold standard."
The authors measured how accurately the speech-recognition system converted input utterances to text strings (recognition accuracy) and how accurately the speech interface matched input utterances to appropriate QMR terms (semantic accuracy).
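The two measures can be illustrated with a minimal sketch. The abstract does not say how either measure was computed, so everything here is an assumption: recognition accuracy is taken to be word-level (one minus the word error rate), semantic accuracy is taken to be the fraction of utterances whose gold-standard term appears in the interface's candidate list, and all function names are hypothetical.

```python
def word_errors(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # word dropped by the recognizer
                            curr[j - 1] + 1,          # word inserted by the recognizer
                            prev[j - 1] + (r != h)))  # match or substitution
        prev = curr
    return prev[-1]


def recognition_accuracy(pairs):
    """Assumed definition: 1 - (word errors / reference words), floored at 0.

    `pairs` is a list of (spoken_text, recognized_text) tuples.
    """
    errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    words = sum(len(ref.split()) for ref, _ in pairs)
    return max(0.0, 1.0 - errors / words)


def semantic_accuracy(candidate_lists, gold_terms):
    """Assumed definition: share of utterances whose gold term is offered."""
    hits = sum(gold in cands for cands, gold in zip(candidate_lists, gold_terms))
    return hits / len(gold_terms)
```

Under these assumptions, an utterance can be poorly recognized word for word and still count as a semantic success, as long as the appropriate term is among the candidates offered.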
Overall recognition accuracy was less than 50%. However, using language-processing techniques that match keywords in recognized utterances to keywords in QMR terms, the semantic accuracy of the system was 81%.
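The gap between the two figures is easier to see with a small sketch of keyword matching. This is not the authors' implementation, which the abstract does not describe. It assumes a simple keyword-overlap score, a hypothetical stopword list, and made-up vocabulary entries that are not actual QMR terms.

```python
# Hypothetical stopword list; the abstract does not specify one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "with", "and", "is", "has"}


def keywords(text):
    """Lowercase the text and keep the words that are not stopwords."""
    words = text.lower().replace(",", " ").split()
    return {w for w in words if w not in STOPWORDS}


def rank_terms(utterance, vocabulary, top_n=5):
    """Rank vocabulary terms by how many keywords they share with the utterance."""
    spoken = keywords(utterance)
    scored = []
    for term in vocabulary:
        overlap = len(spoken & keywords(term))
        if overlap:
            scored.append((overlap, term))
    # Highest overlap first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:top_n]]


# Made-up controlled-vocabulary entries, for illustration only.
vocabulary = ["chest pain substernal", "abdomen pain acute", "cough dry"]

# "chest" was misrecognized as "chess", yet the intended term still ranks first.
candidates = rank_terms("the patient has chess pain that is substernal", vocabulary)
```

In this sketch the misrecognized word simply fails to match, while the remaining correctly recognized keywords still single out the intended term. That is one way a system could reach useful semantic accuracy despite low recognition accuracy.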
Reasonable semantic accuracy was attained when language-processing techniques were used to compensate for speech misrecognition. In addition, the Wizard-of-Oz experimental design offered many advantages for this evaluation. The authors believe that this technique may be useful to future evaluators of speech-input systems.