Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition.

Publication information

IEEE Trans Neural Syst Rehabil Eng. 2017 Sep;25(9):1581-1591. doi: 10.1109/TNSRE.2017.2681691. Epub 2017 Mar 13.

DOI:10.1109/TNSRE.2017.2681691
PMID:28320669
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5591083/
Abstract

This paper addresses the problem of recognizing speech uttered by patients with dysarthria, a motor speech disorder that impedes the physical production of speech. Patients with dysarthria have articulatory limitations and therefore often have trouble pronouncing certain sounds, resulting in undesirable phonetic variation. Modern automatic speech recognition systems designed for regular speakers are ineffective for dysarthric speakers because of this phonetic variation. To capture the phonetic variation, a Kullback-Leibler divergence-based hidden Markov model (KL-HMM) is adopted, in which the emission probability of each state is parameterized by a categorical distribution using phoneme posterior probabilities obtained from a deep neural network-based acoustic model. To further reflect speaker-specific phonetic variation patterns, a speaker adaptation method is proposed that combines L2 regularization with confusion-reducing regularization; it can enhance discriminability between the categorical distributions of the KL-HMM states while preserving speaker-specific information. The method was evaluated on a database of several hundred words from 30 speakers (12 mildly dysarthric, 8 moderately dysarthric, and 10 non-dysarthric control speakers). The proposed approach significantly outperformed the conventional deep neural network-based speaker-adapted system on dysarthric as well as non-dysarthric speech.
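As a rough illustration of the KL-HMM idea described in the abstract, the sketch below scores each HMM state against each frame by the Kullback-Leibler divergence between the state's categorical distribution and the frame's phoneme posterior vector. The abstract does not give the exact formulation, so the direction of the divergence (state distribution first), the function name `kl_local_scores`, and the toy three-phoneme numbers are all assumptions made for illustration; the regularized speaker adaptation step is not shown.

```python
import numpy as np

def kl_local_scores(state_dists, posteriors, eps=1e-10):
    """Return an (S, T) matrix of KL(y_s || z_t).

    state_dists: (S, K) categorical distribution per HMM state (hypothetical values).
    posteriors:  (T, K) phoneme posteriors per frame, e.g. from a DNN acoustic model.
    The KL direction used here is an assumption, not taken from the paper.
    """
    # Clip to avoid log(0); inputs are assumed to be valid probability vectors.
    y = np.clip(np.asarray(state_dists, dtype=float), eps, 1.0)  # (S, K)
    z = np.clip(np.asarray(posteriors, dtype=float), eps, 1.0)   # (T, K)
    # KL(y || z) = sum_k y_k * (log y_k - log z_k), broadcast over states and frames.
    log_ratio = np.log(y)[:, None, :] - np.log(z)[None, :, :]    # (S, T, K)
    return np.sum(y[:, None, :] * log_ratio, axis=2)

# Toy example: 2 states, 2 frames, 3 phoneme classes (made-up numbers).
state_dists = np.array([[0.8, 0.1, 0.1],
                        [0.1, 0.2, 0.7]])
posteriors = np.array([[0.8, 0.1, 0.1],
                       [0.2, 0.2, 0.6]])

scores = kl_local_scores(state_dists, posteriors)
# Lower divergence means a better match, so pick the minimum per frame.
best_state = scores.argmin(axis=0)
print(scores)
print(best_state)
```

In a full recognizer these per-frame scores would be used as local costs in Viterbi decoding. Because each state holds a distribution over all phoneme classes rather than a single label, a state can in principle represent a speaker's consistent substitution of one sound for another, which is the phonetic variation the abstract refers to.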
