A probabilistic framework for landmark detection based on phonetic features for automatic speech recognition.

Authors

Juneja Amit, Espy-Wilson Carol

Affiliation

Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, USA.

Publication

J Acoust Soc Am. 2008 Feb;123(2):1154-68. doi: 10.1121/1.2823754.

Abstract

A probabilistic framework for a landmark-based approach to speech recognition is presented for obtaining multiple landmark sequences in continuous speech. The landmark detection module takes as input acoustic parameters (APs) that capture the acoustic correlates of some of the manner-based phonetic features. The landmarks include stop bursts, vowel onsets, syllabic peaks and dips, fricative onsets and offsets, and sonorant consonant onsets and offsets. Binary classifiers of the manner phonetic features (syllabic, sonorant, and continuant) are used for probabilistic detection of these landmarks. The probabilistic framework exploits two properties of the acoustic cues of phonetic features: (1) sufficiency of the acoustic cues of a phonetic feature for a probabilistic decision on that feature and (2) invariance of the acoustic cues of a phonetic feature with respect to other phonetic features. Probabilistic landmark sequences are constrained using manner class pronunciation models for isolated word recognition with a known vocabulary. The performance of the system is compared with (1) the same probabilistic system but with mel-frequency cepstral coefficients (MFCCs), (2) a hidden Markov model (HMM) based system using APs, and (3) an HMM-based system using MFCCs.
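As a rough illustration of how the outputs of binary manner-feature classifiers might be combined, the sketch below multiplies per-frame probabilities for sonorant, syllabic, and continuant into probabilities for four broad manner classes, then reports frames where the most likely class changes. The abstract does not spell out the class hierarchy, the probability values, or either function; `manner_posteriors` and `manner_change_points` are hypothetical names, and the hard argmax is a simplification of the probabilistic, multiple-sequence landmark detection the paper describes.

```python
# Illustrative sketch only; the hierarchy and all names are assumptions,
# not the authors' implementation.

def manner_posteriors(p_sonorant, p_syllabic, p_continuant):
    """Combine binary feature probabilities into manner-class probabilities.

    Assumed hierarchy: sonorant splits into vowel (+syllabic) and sonorant
    consonant (-syllabic); non-sonorant splits into fricative (+continuant)
    and stop (-continuant). Each feature decision is treated as independent
    of the others, a simplified reading of the invariance property.
    """
    return {
        "vowel": p_sonorant * p_syllabic,
        "sonorant_consonant": p_sonorant * (1.0 - p_syllabic),
        "fricative": (1.0 - p_sonorant) * p_continuant,
        "stop": (1.0 - p_sonorant) * (1.0 - p_continuant),
    }


def manner_change_points(frames):
    """Return (frame_index, previous_class, new_class) at each class change.

    `frames` is a list of (p_sonorant, p_syllabic, p_continuant) tuples.
    Picking a single best class per frame is a simplification; a change
    point here is only a crude stand-in for a landmark candidate.
    """
    labels = []
    for p_son, p_syl, p_cont in frames:
        probs = manner_posteriors(p_son, p_syl, p_cont)
        labels.append(max(probs, key=probs.get))
    return [
        (i, labels[i - 1], labels[i])
        for i in range(1, len(labels))
        if labels[i] != labels[i - 1]
    ]


if __name__ == "__main__":
    # Made-up probabilities: two stop-like frames, then two vowel-like frames.
    frames = [(0.1, 0.5, 0.1), (0.1, 0.5, 0.2), (0.9, 0.9, 0.5), (0.9, 0.8, 0.5)]
    print(manner_change_points(frames))
```

With the made-up frames above, the only change point is the transition from a stop-like region to a vowel-like region, which loosely corresponds to a vowel-onset landmark. The four class probabilities sum to one for any input, so they can be compared directly across frames.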
