Automatic recognition of pathological phoneme production.

Authors

Wielgat Robert, Zieliński Tomasz P, Woźniak Tomasz, Grabias Stanisław, Król Daniel

Affiliation

Department of Technology, Higher State Vocational School in Tarnów, Tarnów, Poland.

Publication

Folia Phoniatr Logop. 2008;60(6):323-31. doi: 10.1159/000170083. Epub 2008 Nov 11.

DOI:10.1159/000170083
PMID:19011305
Abstract

OBJECTIVE

Proper diagnosis and therapy of pathological pronunciation of phonemes play an important role in modern logopedics. To enhance the efficiency of diagnosis and therapy, this paper addresses the automatic recognition of pathological phoneme pronunciation. The authors focus on the therapy of phoneme substitution disorders.

PATIENTS AND METHODS

The speech samples come from speech-impaired Polish children and, in part, from persons imitating speech disorders. The speech disorders to be recognized were substitutions in pairs (for the correct phonetic characters please see online article) embedded in Polish carrier words. To detect substitutions in the recognized words, the recently proposed human factor cepstral coefficients (HFCC) were implemented. The efficiency of the HFCC approach was compared with that of standard mel-frequency cepstral coefficients (MFCC) as a feature vector. Both dynamic time warping (DTW), working on whole words or embedded phoneme patterns, and hidden Markov models (HMM) were used as classifiers. The HMM classifier was based on whole-word models as well as phoneme models. The results present a comparative analysis of the DTW and HMM methods.
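As a rough illustration of the DTW classification idea mentioned in the methods, the following is a minimal sketch of textbook dynamic time warping with nearest-template labeling. It is not the authors' modified phoneme-based classifier. The function names (`dtw_distance`, `classify`), the labels, and the toy one-dimensional "feature vectors" are hypothetical; in practice each frame would be an HFCC or MFCC vector.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature vectors.

    a, b: lists of equal-dimension frames (each frame a list of floats).
    Local cost is the Euclidean distance between frames; allowed steps are
    match, insertion and deletion, with no slope constraint or windowing.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest DTW cost.

    templates: dict mapping a label to a list of template sequences.
    """
    scored = (
        (dtw_distance(sample, template), label)
        for label, sequences in templates.items()
        for template in sequences
    )
    return min(scored)[1]


# Toy data (assumption): 1-D frames standing in for cepstral vectors.
templates = {
    "correct": [[[0.0], [1.0], [2.0], [1.0]]],
    "substituted": [[[0.0], [3.0], [3.0], [0.0]]],
}
# A time-stretched version of the "correct" template.
sample = [[0.0], [1.0], [1.0], [2.0], [1.0]]
label = classify(sample, templates)
```

Because DTW lets a frame be repeated, the stretched sample aligns with the "correct" template at zero cost, which is why template matching of this kind tolerates differences in speaking rate.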

CONCLUSIONS

The superiority of HFCC features over MFCC features was demonstrated. Results obtained by the DTW methods, mainly by the modified phoneme-based DTW classifier, were slightly better than those of the HMM classifier. Results obtained for the detection of substitution in pairs (for the correct phonetic characters please see online article) are very promising. The methods developed for these cases can be integrated into computer systems for speech therapy. For substitutions in pairs (for the correct phonetic characters please see online article) further research is necessary.

Similar articles

1
Automatic recognition of pathological phoneme production.
Folia Phoniatr Logop. 2008;60(6):323-31. doi: 10.1159/000170083. Epub 2008 Nov 11.
2
Dynamic time warping in phoneme modeling for fast pronunciation error detection.
Comput Biol Med. 2016 Feb 1;69:277-85. doi: 10.1016/j.compbiomed.2015.12.004. Epub 2015 Dec 17.
3
A probabilistic framework for landmark detection based on phonetic features for automatic speech recognition.
J Acoust Soc Am. 2008 Feb;123(2):1154-68. doi: 10.1121/1.2823754.
4
Analysis and prediction of acoustic speech features from mel-frequency cepstral coefficients in distributed speech recognition architectures.
J Acoust Soc Am. 2008 Dec;124(6):3989-4000. doi: 10.1121/1.2997436.
5
Reliability and clinical relevance of segmental analysis based on intelligibility assessment.
Folia Phoniatr Logop. 2008;60(5):264-8. doi: 10.1159/000153433. Epub 2008 Sep 9.
6
Statistical modeling of speech Poincaré sections in combination of frequency analysis to improve speech recognition performance.
Chaos. 2010 Sep;20(3):033106. doi: 10.1063/1.3463722.
7
Exploiting independent filter bandwidth of human factor cepstral coefficients in automatic speech recognition.
J Acoust Soc Am. 2004 Sep;116(3):1774-80. doi: 10.1121/1.1777872.
8
[Phoneme analysis and phoneme discrimination of juvenile speech therapy school students].
Laryngorhinootologie. 2011 May;90(5):282-9. doi: 10.1055/s-0030-1265166. Epub 2010 Oct 12.
9
Sign language recognition by combining statistical DTW and independent classification.
IEEE Trans Pattern Anal Mach Intell. 2008 Nov;30(11):2040-6. doi: 10.1109/TPAMI.2008.123.
10
Improved phoneme-based myoelectric speech recognition.
IEEE Trans Biomed Eng. 2009 Aug;56(8):2016-23. doi: 10.1109/TBME.2009.2024079. Epub 2009 Jun 16.

Cited by

1
Speech phoneme and spectral smearing based non-invasive COVID-19 detection.
Front Artif Intell. 2023 Jan 4;5:1035805. doi: 10.3389/frai.2022.1035805. eCollection 2022.
2
Modulation Spectra Morphological Parameters: A New Method to Assess Voice Pathologies according to the GRBAS Scale.
Biomed Res Int. 2015;2015:259239. doi: 10.1155/2015/259239. Epub 2015 Oct 18.
3
Formant analysis in dysphonic patients and automatic Arabic digit speech recognition.
Biomed Eng Online. 2011 May 30;10:41. doi: 10.1186/1475-925X-10-41.