Residual Neural Network precisely quantifies dysarthria severity-level based on short-duration speech segments.

Authors

Gupta Siddhant, Patil Ankur T, Purohit Mirali, Parmar Mihir, Patel Maitreya, Patil Hemant A, Guido Rodrigo Capobianco

Affiliations

Speech Research Lab, Dhirubhai Ambani Institute of Information and Communication Technology (DA-IICT), Gandhinagar 382007, India.

Arizona State University, Tempe, USA.

Publication information

Neural Netw. 2021 Jul;139:105-117. doi: 10.1016/j.neunet.2021.02.008. Epub 2021 Feb 24.

DOI: 10.1016/j.neunet.2021.02.008
PMID: 33684609

Abstract

Recently, Deep Learning methodologies have gained significant attention for severity-based classification of dysarthric speech. Detecting dysarthria and quantifying its severity are of paramount importance in various real-life applications, such as assessing patients' progress during treatment, which includes adequate planning of their therapy, and improving speech-based interactive systems so that they handle pathologically affected voices automatically. Notably, current speech-powered tools often deal with short-duration speech segments and, consequently, are less efficient in dealing with impaired speech, even when using Convolutional Neural Networks (CNNs). Thus, detecting dysarthria severity level from short speech segments might help improve the performance and applicability of those systems. To achieve this goal, we propose a novel Residual Network (ResNet)-based technique that receives short-duration speech segments as input. Statistically meaningful objective analysis of our experiments, reported on the standard Universal Access corpus, shows average improvements of 21.35% and 22.48% over the baseline CNN in classification accuracy and F1-score, respectively. For additional comparison, tests with Gaussian Mixture Models and Light CNNs were also performed. Overall, the proposed ResNet approach obtained 98.90% classification accuracy and a 98.00% F1-score, confirming its efficacy and supporting its practical applicability.
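The two metrics reported in the abstract, classification accuracy and F1-score, can be illustrated with a minimal sketch. The abstract does not say how F1 was averaged across severity levels, so macro averaging is an assumption here, and the class names and labels below are hypothetical placeholders rather than data from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted severity level matches the label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Macro averaging is an assumption; the abstract does not state
    which averaging scheme the authors used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the label was something else
            fn[t] += 1  # true class t was missed
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical per-segment severity labels (not taken from the paper).
y_true = ["very_low", "low", "medium", "high", "high", "low"]
y_pred = ["very_low", "low", "medium", "high", "medium", "low"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```

With macro averaging, every severity level contributes equally to the score regardless of how many segments it has, which matters when the classes are imbalanced.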

Similar articles

1. Residual Neural Network precisely quantifies dysarthria severity-level based on short-duration speech segments.
   Neural Netw. 2021 Jul;139:105-117. doi: 10.1016/j.neunet.2021.02.008. Epub 2021 Feb 24.
2. Dysarthric Speech Enhancement Based on Convolution Neural Network.
   Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:60-64. doi: 10.1109/EMBC48229.2022.9871531.
3. Estimation of phoneme-specific HMM topologies for the automatic recognition of dysarthric speech.
   Comput Math Methods Med. 2013;2013:297860. doi: 10.1155/2013/297860. Epub 2013 Oct 8.
4. Cascaded Convolutional Neural Network Architecture for Speech Emotion Recognition in Noisy Conditions.
   Sensors (Basel). 2021 Jun 27;21(13):4399. doi: 10.3390/s21134399.
5. Automated Dysarthria Severity Classification: A Study on Acoustic Features and Deep Learning Techniques.
   IEEE Trans Neural Syst Rehabil Eng. 2022;30:1147-1157. doi: 10.1109/TNSRE.2022.3169814. Epub 2022 May 4.
6. Multi-Stage Audio-Visual Fusion for Dysarthric Speech Recognition With Pre-Trained Models.
   IEEE Trans Neural Syst Rehabil Eng. 2023;31:1912-1921. doi: 10.1109/TNSRE.2023.3262001.
7. Evaluation of an Automatic Speech Recognition Platform for Dysarthric Speech.
   Folia Phoniatr Logop. 2021;73(5):432-441. doi: 10.1159/000511042. Epub 2020 Nov 13.
8. Investigation of an HMM/ANN hybrid structure in pattern recognition application using cepstral analysis of dysarthric (distorted) speech signals.
   Med Eng Phys. 2006 Oct;28(8):741-8. doi: 10.1016/j.medengphy.2005.11.002. Epub 2005 Dec 15.
9. Improving Acoustic Models in TORGO Dysarthric Speech Database.
   IEEE Trans Neural Syst Rehabil Eng. 2018 Mar;26(3):637-645. doi: 10.1109/TNSRE.2018.2802914.
10. Evaluation of a speech recognition prototype for speakers with moderate and severe dysarthria: a preliminary report.
    Augment Altern Commun. 2010 Dec;26(4):267-77. doi: 10.3109/07434618.2010.532508.

Cited by

1. Exploring the Role of Machine Learning in Diagnosing and Treating Speech Disorders: A Systematic Literature Review.
   Psychol Res Behav Manag. 2024 May 31;17:2205-2232. doi: 10.2147/PRBM.S460283. eCollection 2024.
2. SAMI: an M-Health application to telemonitor intelligibility and speech disorder severity in head and neck cancers.
   Front Artif Intell. 2024 May 9;7:1359094. doi: 10.3389/frai.2024.1359094. eCollection 2024.
3. Pareto-Optimized Non-Negative Matrix Factorization Approach to the Cleaning of Alaryngeal Speech Signals.
   Cancers (Basel). 2023 Jul 16;15(14):3644. doi: 10.3390/cancers15143644.
4. Dysarthria detection based on a deep learning model with a clinically-interpretable layer.
   JASA Express Lett. 2023 Jan;3(1):015201. doi: 10.1121/10.0016833.
5. Research on the Filtering and Classification Method of Interactive Music Education Resources Based on Neural Network.
   Comput Intell Neurosci. 2022 Aug 17;2022:5764148. doi: 10.1155/2022/5764148. eCollection 2022.
6. Detection and differentiation of ataxic and hypokinetic dysarthria in cerebellar ataxia and parkinsonian disorders via wave splitting and integrating neural networks.
   PLoS One. 2022 Jun 3;17(6):e0268337. doi: 10.1371/journal.pone.0268337. eCollection 2022.