A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion.

Authors

Lachhab Othman, Di Martino Joseph, Elhaj Elhassane Ibn, Hammouch Ahmed

Affiliations

LRGE Laboratory, ENSET, Mohammed 5 University, Madinat Al Irfane, Rabat, Morocco.

LORIA, B.P. 239, Vandœuvre-lès-Nancy, 54506 France.

Publication information

Springerplus. 2015 Oct 26;4:644. doi: 10.1186/s40064-015-1428-2. eCollection 2015.

DOI: 10.1186/s40064-015-1428-2
PMID: 26543778
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4627987/

Abstract

In this paper, we propose a hybrid system based on a modified statistical GMM voice conversion algorithm for improving the recognition of esophageal speech. This hybrid system aims to compensate for the distorted information present in the esophageal acoustic features by using a voice conversion method. The esophageal speech is converted into a "target" laryngeal speech using an iterative statistical estimation of a transformation function. We did not apply a speech synthesizer to reconstruct the converted speech signal, because the converted Mel cepstral vectors are used directly as input to our speech recognition system. Furthermore, the feature vectors are linearly transformed by the HLDA (heteroscedastic linear discriminant analysis) method, which reduces their dimensionality by projecting them into a smaller space with good discriminative properties. The experimental results demonstrate that our proposed system improves phone recognition accuracy, with an absolute increase of 3.40% over the phone recognition accuracy obtained with neither HLDA nor voice conversion.
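
The abstract does not spell out the modified conversion algorithm, so the following is only a minimal sketch of the classic joint-density GMM mapping that statistical voice conversion is commonly built on, not the authors' method. It assumes diagonal covariances and already-trained parameters; the function name `gmm_convert` and all parameter names are hypothetical. Each source (esophageal) feature vector is mapped to the target (laryngeal) space as a posterior-weighted sum of per-component linear regressions.

```python
import math

def gmm_convert(x, weights, mu_x, mu_y, var_x, cov_yx):
    """Convert one source vector x with a diagonal joint-density GMM.

    weights[i]   : prior of mixture component i
    mu_x[i][d]   : source mean; mu_y[i][d]: target mean
    var_x[i][d]  : source variance (diagonal of Sigma_xx)
    cov_yx[i][d] : target/source cross-covariance (diagonal of Sigma_yx)
    All parameters are assumed to be trained beforehand on aligned data.
    """
    dims = range(len(x))
    # Log of prior times Gaussian likelihood for each component.
    # The (2*pi) constant is identical across components and cancels.
    log_post = []
    for w, m, v in zip(weights, mu_x, var_x):
        ll = math.log(w)
        for d in dims:
            ll -= 0.5 * (math.log(v[d]) + (x[d] - m[d]) ** 2 / v[d])
        log_post.append(ll)
    # Normalise in a numerically stable way to get posteriors p_i(x).
    peak = max(log_post)
    post = [math.exp(lp - peak) for lp in log_post]
    total = sum(post)
    post = [p / total for p in post]
    # y = sum_i p_i(x) * (mu_y_i + Sigma_yx_i / Sigma_xx_i * (x - mu_x_i))
    y = [0.0] * len(x)
    for p, mx, my, v, c in zip(post, mu_x, mu_y, var_x, cov_yx):
        for d in dims:
            y[d] += p * (my[d] + c[d] / v[d] * (x[d] - mx[d]))
    return y
```

In a pipeline like the one described, the converted vectors would be passed straight to the recognizer's front end (optionally after a linear projection such as HLDA) rather than to a synthesizer.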

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c5e8/4627987/0fe90286c378/40064_2015_1428_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c5e8/4627987/ad632dcd6ef8/40064_2015_1428_Fig2_HTML.jpg

Similar articles

1. A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion.
   Springerplus. 2015 Oct 26;4:644. doi: 10.1186/s40064-015-1428-2. eCollection 2015.
2. E-DGAN: An Encoder-Decoder Generative Adversarial Network Based Method for Pathological to Normal Voice Conversion.
   IEEE J Biomed Health Inform. 2023 May;27(5):2489-2500. doi: 10.1109/JBHI.2023.3239551. Epub 2023 May 4.
3. A Hybrid Speech Enhancement Algorithm for Voice Assistance Application.
   Sensors (Basel). 2021 Oct 23;21(21):7025. doi: 10.3390/s21217025.
4. Multidirectional regression (MDR)-based features for automatic voice disorder detection.
   J Voice. 2012 Nov;26(6):817.e19-27. doi: 10.1016/j.jvoice.2012.05.002.
5. Discrimination between pathological and normal voices using GMM-SVM approach.
   J Voice. 2011 Jan;25(1):38-43. doi: 10.1016/j.jvoice.2009.08.002. Epub 2010 Feb 4.
6. Automatic Voice Pathology Detection With Running Speech by Using Estimation of Auditory Spectrum and Cepstral Coefficients Based on the All-Pole Model.
   J Voice. 2016 Nov;30(6):757.e7-757.e19. doi: 10.1016/j.jvoice.2015.08.010. Epub 2015 Oct 27.
7. Statistical modeling of speech Poincaré sections in combination of frequency analysis to improve speech recognition performance.
   Chaos. 2010 Sep;20(3):033106. doi: 10.1063/1.3463722.
8. Intra- and Inter-database Study for Arabic, English, and German Databases: Do Conventional Speech Features Detect Voice Pathology?
   J Voice. 2017 May;31(3):386.e1-386.e8. doi: 10.1016/j.jvoice.2016.09.009. Epub 2016 Oct 10.
9. Discrimination of "hot potato voice" caused by upper airway obstruction utilizing a support vector machine.
   Laryngoscope. 2019 Jun;129(6):1301-1307. doi: 10.1002/lary.27584. Epub 2018 Nov 28.
10. Joint Dictionary Learning-Based Non-Negative Matrix Factorization for Voice Conversion to Improve Speech Intelligibility After Oral Surgery.
    IEEE Trans Biomed Eng. 2017 Nov;64(11):2584-2594. doi: 10.1109/TBME.2016.2644258.

References cited by this article

1. Reconstruction of normal sounding speech for laryngectomy patients through a modified CELP codec.
   IEEE Trans Biomed Eng. 2010 Oct;57(10):2448-58. doi: 10.1109/TBME.2010.2053369. Epub 2010 Jun 21.
2. Pathological voice assessment.
   Conf Proc IEEE Eng Med Biol Soc. 2006;2006:1669-73. doi: 10.1109/IEMBS.2006.259835.
3. Enhancement of electrolarynx speech based on auditory masking.
   IEEE Trans Biomed Eng. 2006 May;53(5):865-74. doi: 10.1109/TBME.2006.872821.
4. Objective voice analysis for dysphonic patients: a multiparametric protocol including acoustic and aerodynamic measurements.
   J Voice. 2001 Dec;15(4):529-42. doi: 10.1016/S0892-1997(01)00053-4.
5. The dysphonia severity index: an objective measure of vocal quality based on a multiparameter approach.
   J Speech Lang Hear Res. 2000 Jun;43(3):796-809. doi: 10.1044/jslhr.4303.796.
6. Enhancement of female esophageal and tracheoesophageal speech.
   J Acoust Soc Am. 1995 Nov;98(5 Pt 1):2461-5. doi: 10.1121/1.413279.