Classification of stop place in consonant-vowel contexts using feature extrapolation of acoustic-phonetic features in telephone speech.

Affiliation

Department of Electrical and Electronic Engineering, Yonsei University, 134 Shinchon-dong, Seodaemun-gu, Seoul, Korea 120-749.

Publication information

J Acoust Soc Am. 2012 Feb;131(2):1536-46. doi: 10.1121/1.3672706.

Abstract

Knowledge-based speech recognition systems extract acoustic cues from the signal to identify speech characteristics. For channel-deteriorated telephone speech, acoustic cues, especially those for stop consonant place, are expected to be degraded or absent. To investigate the use of knowledge-based methods in degraded environments, feature extrapolation of acoustic-phonetic features based on Gaussian mixture models is examined. This process is applied to a stop place detection module that uses burst release and vowel onset cues for consonant-vowel tokens of English. Results show that classification performance is enhanced in telephone channel-degraded speech, with extrapolated acoustic-phonetic features reaching or exceeding performance using estimated Mel-frequency cepstral coefficients (MFCCs). Results also show acoustic-phonetic features may be combined with MFCCs for best performance, suggesting these features provide information complementary to MFCCs.
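
The abstract does not say how the GMM-based extrapolation is computed. As an illustration only, one common way to estimate a clean feature from a degraded one is the minimum mean-square-error (MMSE) estimate under a joint Gaussian mixture over (degraded, clean) feature pairs. The sketch below assumes scalar features and hand-set mixture parameters; the function names, the tuple layout, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def extrapolate(x, components):
    """MMSE estimate of a clean feature y given a degraded feature x.

    `components` is a list of (weight, mean_x, mean_y, var_x, cov_xy) tuples
    describing a joint GMM over (x, y). This layout is an assumption made
    for the sketch, not the paper's parameterisation.
    """
    # Weighted likelihood of x under each component's marginal over x.
    likes = [w * gaussian_pdf(x, mx, vx) for (w, mx, _my, vx, _cxy) in components]
    total = sum(likes)
    if total == 0.0:
        raise ValueError("x has zero likelihood under every component")

    estimate = 0.0
    for like, (_w, mx, my, vx, cxy) in zip(likes, components):
        posterior = like / total                 # p(component | x)
        cond_mean = my + (cxy / vx) * (x - mx)   # E[y | x, component]
        estimate += posterior * cond_mean
    return estimate


# Hypothetical two-component model, for illustration only.
toy_model = [
    (0.5, -1.0, -2.0, 1.0, 0.0),
    (0.5, 1.0, 2.0, 1.0, 0.0),
]
clean_estimate = extrapolate(0.0, toy_model)
```

In a real system the mixture would be trained on paired clean and telephone-channel feature vectors, and the features would be multivariate, so the scalar ratio `cxy / vx` would become a matrix product with an inverse covariance.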
