A Novel Method for Intelligibility Assessment of Nonlinearly Processed Speech in Spaces Characterized by Long Reverberation Times.

Affiliations

Department of Multimedia Systems, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, 11/12 Narutowicza Street, 80-233 Gdansk, Poland.

Audio Acoustics Laboratory, Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, 11/12 Narutowicza Street, 80-233 Gdansk, Poland.

Publication information

Sensors (Basel). 2022 Feb 19;22(4):1641. doi: 10.3390/s22041641.

DOI: 10.3390/s22041641
PMID: 35214543
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8880044/
Abstract

Objective assessment of speech intelligibility is a complex task that must account for several factors, such as the different way human hearing perceives each speech sub-band and the different physical properties of each frequency band of a speech signal. Currently, the state-of-the-art method for assessing the quality of speech transmission is the speech transmission index (STI). It is a standardized way of objectively measuring the quality of, e.g., the acoustical adaptation of conference rooms or public address systems. Its wide use and its implementation on numerous measurement devices make the STI a popular choice when the speech-related quality of rooms has to be estimated. However, the STI measure has a significant drawback that excludes it from some use cases. For instance, if one would like to enhance speech intelligibility with a nonlinear digital processing algorithm, the STI method is not suitable for measuring the impact of such an algorithm, because it requires that the measurement signal not be altered in a nonlinear way. Consequently, if a nonlinear speech-enhancing algorithm has to be tested, the STI, the standard way of estimating speech transmission, cannot be used. In this work, we propose a method that is based on the STI but modified so that it can be used to estimate the performance of nonlinear speech intelligibility enhancement methods. The proposed approach is based on a broadband comparison of the cumulated energy of the transmitted envelope modulation and the received modulation, so we call it broadband STI (bSTI). Its credibility for signals altered by the environment or speech changed nonlinearly by a DSP algorithm is checked through a comparative analysis of ten selected impulse responses for which a baseline STI value was known.
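To make the link between an impulse response and an STI-style score more concrete, the sketch below estimates a simplified, single-band index from an impulse response using Schroeder's modulation transfer formula. It is not the bSTI method of the paper, whose details the abstract does not give, and it is not the full standardized STI either: octave-band filtering, band weighting, and masking corrections are deliberately left out. The function names, the synthetic exponential-decay impulse response, and the 1 kHz sampling rate are illustrative assumptions.

```python
import cmath
import math

# The 14 modulation frequencies used by the STI (one-third-octave steps, 0.63 to 12.5 Hz).
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def exponential_decay_ir(rt60_s, fs_hz=1000):
    """Hypothetical test input: an amplitude envelope whose energy drops 60 dB in rt60_s."""
    n = int(2.0 * rt60_s * fs_hz)  # keep twice the reverberation time
    # Energy h^2 = 10^(-6 t / rt60), so amplitude h = 10^(-3 t / rt60).
    return [10.0 ** (-3.0 * (i / fs_hz) / rt60_s) for i in range(n)]


def modulation_transfer(ir, fs_hz, mod_freq_hz):
    """Schroeder's formula: m(F) = |sum h^2 exp(-j 2 pi F t)| / sum h^2."""
    energy = [h * h for h in ir]
    weighted = sum(e * cmath.exp(-2j * math.pi * mod_freq_hz * i / fs_hz)
                   for i, e in enumerate(energy))
    return abs(weighted) / sum(energy)


def simplified_sti(ir, fs_hz):
    """Single-band STI-like index in [0, 1]; an assumption-laden sketch, not the standard."""
    indices = []
    for f in MOD_FREQS_HZ:
        m = modulation_transfer(ir, fs_hz, f)
        m = min(max(m, 1e-6), 1.0 - 1e-6)         # keep the logarithm finite
        snr_db = 10.0 * math.log10(m / (1.0 - m))  # apparent signal-to-noise ratio
        snr_db = max(-15.0, min(15.0, snr_db))     # clip to the +/-15 dB range
        indices.append((snr_db + 15.0) / 30.0)     # transmission index
    return sum(indices) / len(indices)


if __name__ == "__main__":
    for rt60 in (0.5, 1.5, 3.0):
        ir = exponential_decay_ir(rt60)
        print(f"RT60 = {rt60:.1f} s -> simplified index = {simplified_sti(ir, 1000):.2f}")
```

The index falls as the reverberation time grows, which is the behavior expected of the long-reverberation spaces the paper targets. Because the calculation assumes a linear, time-invariant transmission path, it also illustrates why such a measure cannot simply be applied to speech that has passed through a nonlinear enhancement algorithm.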
