Speech production knowledge in automatic speech recognition.

Authors

King Simon, Frankel Joe, Livescu Karen, McDermott Erik, Richmond Korin, Wester Mirjam

Affiliation

Centre for Speech Technology Research, University of Edinburgh, 2 Buccleuch Place, Edinburgh EH8 9LW, United Kingdom.

Publication information

J Acoust Soc Am. 2007 Feb;121(2):723-42. doi: 10.1121/1.2404622.

DOI: 10.1121/1.2404622
PMID: 17348495

Abstract

Although much is known about how speech is produced, and research into speech production has resulted in measured articulatory data, feature systems of different kinds, and numerous models, speech production knowledge is almost totally ignored in current mainstream approaches to automatic speech recognition. Representations of speech production allow simple explanations for many phenomena observed in speech which cannot be easily analyzed from either acoustic signal or phonetic transcription alone. In this article, a survey of a growing body of work in which such representations are used to improve automatic speech recognition is provided.

Similar articles

1
Speech production knowledge in automatic speech recognition.
J Acoust Soc Am. 2007 Feb;121(2):723-42. doi: 10.1121/1.2404622.
2
Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion.
J Acoust Soc Am. 2011 Oct;130(4):EL251-7. doi: 10.1121/1.3634122.
3
A neural network model of the articulatory-acoustic forward mapping trained on recordings of articulatory parameters.
J Acoust Soc Am. 2004 Oct;116(4 Pt 1):2354-64. doi: 10.1121/1.1715112.
4
Vocal tract representation in the recognition of cerebral palsied speech.
J Speech Lang Hear Res. 2012 Aug;55(4):1190-207. doi: 10.1044/1092-4388(2011/11-0223). Epub 2012 Jan 23.
5
The contribution of phonation type to the perception of vocal emotions in German: an articulatory synthesis study.
J Acoust Soc Am. 2015 Mar;137(3):1503-12. doi: 10.1121/1.4906836.
6
Articulatory limit and extreme segmental reduction in Taiwan Mandarin.
J Acoust Soc Am. 2013 Dec;134(6):4481. doi: 10.1121/1.4824930.
7
A study of regressive place assimilation in spontaneous speech and its implications for spoken word recognition.
J Acoust Soc Am. 2007 Oct;122(4):2340-53. doi: 10.1121/1.2772226.
8
Recognizing articulatory gestures from speech for robust speech recognition.
J Acoust Soc Am. 2012 Mar;131(3):2270-87. doi: 10.1121/1.3682038.
9
Imprecise vowel articulation as a potential early marker of Parkinson's disease: effect of speaking task.
J Acoust Soc Am. 2013 Sep;134(3):2171-81. doi: 10.1121/1.4816541.
10
A comparative study of human and parrot phonation: acoustic and articulatory correlates of vowels.
J Acoust Soc Am. 1994 Aug;96(2 Pt 1):634-48. doi: 10.1121/1.410303.

Cited by

1
Estimation of vocal fold physiology from voice acoustics using machine learning.
J Acoust Soc Am. 2020 Mar;147(3):EL264. doi: 10.1121/10.0000927.
2
Recognizing Whispered Speech Produced by an Individual with Surgically Reconstructed Larynx Using Articulatory Movement Data.
Workshop Speech Lang Process Assist Technol. 2016 Sep;2016:80-86. doi: 10.21437/SLPAT.2016-14.
3
Speaker verification based on the fusion of speech acoustics and inverted articulatory signals.
Comput Speech Lang. 2016 Mar;36:196-211. doi: 10.1016/j.csl.2015.05.003. Epub 2015 May 22.
4
Advances in real-time magnetic resonance imaging of the vocal tract for speech science and technology research.
APSIPA Trans Signal Inf Process. 2016;5. doi: 10.1017/ATSIP.2016.5. Epub 2016 Mar 31.
5
Directly data-derived articulatory gesture-like representations retain discriminatory information about phone categories.
Comput Speech Lang. 2016 Mar 1;36:330-346. doi: 10.1016/j.csl.2015.03.004. Epub 2015 Mar 21.
6
An Optimal Set of Flesh Points on Tongue and Lips for Speech-Movement Classification.
J Speech Lang Hear Res. 2016 Feb;59(1):15-26. doi: 10.1044/2015_JSLHR-S-14-0112.
7
Co-registration of speech production datasets from electromagnetic articulography and real-time magnetic resonance imaging.
J Acoust Soc Am. 2014 Feb;135(2):EL115-21. doi: 10.1121/1.4862880.
8
Articulatory distinctiveness of vowels and consonants: a data-driven approach.
J Speech Lang Hear Res. 2013 Oct;56(5):1539-51. doi: 10.1044/1092-4388(2013/12-0030). Epub 2013 Jul 9.
9
Modeling speech imitation and ecological learning of auditory-motor maps.
Front Psychol. 2013 Jun 27;4:364. doi: 10.3389/fpsyg.2013.00364. Print 2013.
10
The use of phonetic motor invariants can improve automatic phoneme discrimination.
PLoS One. 2011;6(9):e24055. doi: 10.1371/journal.pone.0024055. Epub 2011 Sep 1.