Johannes R S, Carr-Locke D L
Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts.
Endoscopy. 1992 Jul;24 Suppl 2:493-8. doi: 10.1055/s-2007-1010528.
Speech recognition technology has developed substantially in the past half decade. Currently, large-vocabulary, speaker-independent, discrete recognizers are the state of the art. This will change: moderate-sized continuous recognition systems now exist in research settings, although such systems are unlikely to be widely available until the mid to late 1990s. The accuracy rates of current speech recognition systems are high. Consequently, recognition accuracy is not the current limiting factor in using automatic speech recognition (ASR); the limiting factor is the approach taken to integrating speech functionality into applications. One approach is to use augmented transition networks (ATNs) as models of natural language that support both an input strategy and a text generation system. ATNs provide a means of achieving both syntactic correctness and semantic richness. This approach plays to the strengths of the discrete nature of current speech technology and also provides a methodology for capturing and archiving highly detailed information. It also avoids the natural language parsing problem created by a fully free-form dictation interface. Evolving along with the underlying speech technology are standards for the definitions and criteria used in endoscopic practice. There are clear benefits from standards in this area. However, standardization is also likely to take several years and may never yield a universally accepted lexicon. Furthermore, any system that attempts to use speech as an input modality will have user interface barriers to surmount. Because of the relatively large vocabularies used in medical discourse, the user interface will need to be carefully crafted. (ABSTRACT TRUNCATED AT 250 WORDS)
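The ATN-based input strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the lexicon, the state names, the register names, and the sentence template below are all hypothetical and were chosen only to show the idea. Each state of the network exposes a small active vocabulary, which suits a discrete (word-at-a-time) recognizer, and the registers filled along the way serve both as the structured record to archive and as the input to text generation. A full ATN would also allow recursive sub-networks and arbitrary tests on arcs, which this sketch omits.

```python
from dataclasses import dataclass, field

# Hypothetical word categories; a real endoscopy lexicon would be far larger.
LEXICON = {
    "FINDING": {"ulcer", "polyp", "erosion"},
    "SIZE": {"small", "medium", "large"},
    "SITE": {"esophagus", "antrum", "duodenum"},
}

# Each state maps to its outgoing arcs: (word category, register to fill, next state).
# The states and arcs are illustrative assumptions, not taken from the paper.
NETWORK = {
    "START": [("FINDING", "finding", "HAVE_FINDING")],
    "HAVE_FINDING": [("SIZE", "size", "HAVE_SIZE"), ("SITE", "site", "DONE")],
    "HAVE_SIZE": [("SITE", "site", "DONE")],
    "DONE": [],
}
FINAL_STATES = {"DONE"}


@dataclass
class Session:
    state: str = "START"
    registers: dict = field(default_factory=dict)

    def active_vocabulary(self):
        """Words the recognizer has to listen for in the current state."""
        words = set()
        for category, _register, _next_state in NETWORK[self.state]:
            words |= LEXICON[category]
        return words

    def accept(self, word):
        """Consume one discretely spoken word; return False if no arc matches."""
        for category, register, next_state in NETWORK[self.state]:
            if word in LEXICON[category]:
                self.registers[register] = word  # the "augmented" part: store structure
                self.state = next_state
                return True
        return False

    def is_complete(self):
        return self.state in FINAL_STATES

    def generate_text(self):
        """Render the filled registers as a report sentence."""
        if not self.is_complete():
            raise ValueError("utterance is incomplete")
        finding = self.registers["finding"]
        size = self.registers.get("size")
        description = f"{size} {finding}" if size else finding
        article = "An" if description[0] in "aeiou" else "A"
        return f"{article} {description} was seen in the {self.registers['site']}."


if __name__ == "__main__":
    session = Session()
    for spoken in ["ulcer", "small", "antrum"]:
        if not session.accept(spoken):
            print(f"rejected: {spoken}")
    print(session.registers)
    print(session.generate_text())  # A small ulcer was seen in the antrum.
```

Because only the words on the outgoing arcs are valid at each step, syntactically ill-formed input is rejected as it is spoken, and no free-form parse of the finished sentence is needed.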