Toward the ultimate synthesis/recognition system.

Author information

Furui S

Affiliation

Nippon Telegraph and Telephone (NTT) Human Interface Laboratories, Tokyo, Japan.

Publication information

Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):10040-5. doi: 10.1073/pnas.92.22.10040.

Abstract

This paper predicts speech synthesis, speech recognition, and speaker recognition technology for the year 2001, and it describes the most important research problems to be solved in order to arrive at these ultimate synthesis and recognition systems. The problems for speech synthesis include natural and intelligible voice production, prosody control based on meaning, capability of controlling synthesized voice quality and choosing individual speaking style, multilingual and multidialectal synthesis, choice of application-oriented speaking styles, capability of adding emotion, and synthesis from concepts. The problems for speech recognition include robust recognition against speech variations, adaptation/normalization to variations due to environmental conditions and speakers, automatic knowledge acquisition for acoustic and linguistic modeling, spontaneous speech recognition, naturalness and ease of human-machine interaction, and recognition of emotion. The problems for speaker recognition are similar to those for speech recognition. The research topics related to all these techniques include the use of articulatory and perceptual constraints and evaluation methods for measuring the quality of technology and systems.
