
Structure in time-frequency binary masking errors and its impact on speech intelligibility.

Authors

Kressner Abigail A, Rozell Christopher J

Affiliation

School of Electrical and Computer Engineering, 777 Atlantic Drive Northwest, Georgia Institute of Technology, Atlanta, Georgia 30332.

Publication

J Acoust Soc Am. 2015 Apr;137(4):2025-35. doi: 10.1121/1.4916271.

DOI: 10.1121/1.4916271
PMID: 25920853

Abstract

Although requiring prior knowledge makes the ideal binary mask an impractical algorithm, substantial increases in measured intelligibility make it a desirable benchmark. While this benchmark has been studied extensively, many questions remain about the factors that influence the intelligibility of binary-masked speech with non-ideal masks. To date, researchers have used primarily uniformly random, uncorrelated mask errors and independently presented error types (i.e., false positives and negatives) to characterize the influence of estimation errors on intelligibility. However, practical estimation algorithms produce masks that contain errors of both types and with non-trivial amounts of structure. This paper introduces an investigation framework for binary masks and presents listener studies that use this framework to illustrate how interactions between error types and structure affect intelligibility. First, this study demonstrates that clustering (i.e., a form of structure) of mask errors reduces intelligibility. Furthermore, while previous research has suggested that false positives are more detrimental to intelligibility than false negatives, this study indicates that false negatives can be equally detrimental to intelligibility when they contain structure or when both error types are present. Finally, this study shows that listeners tolerate fewer mask errors when both types of errors are present, especially when the errors contain structure.
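
The baseline condition the abstract contrasts against (an ideal binary mask corrupted by uniformly random, uncorrelated false positives and false negatives) can be sketched as follows. This is only an illustrative sketch, not the authors' framework: the function names, the 0 dB local criterion, and the plain list-of-lists representation of the time-frequency grid are all assumptions, and the structured (clustered) errors studied in the paper are not modelled here.

```python
import math
import random


def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Label a time-frequency unit 1 when its local SNR exceeds the threshold.

    Assumes both inputs are equally shaped grids of strictly positive powers.
    The 0 dB default threshold is an illustrative choice.
    """
    mask = []
    for s_row, n_row in zip(speech_power, noise_power):
        row = []
        for s, n in zip(s_row, n_row):
            snr_db = 10.0 * math.log10(s / n)
            row.append(1 if snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def add_random_errors(mask, fp_rate, fn_rate, seed=0):
    """Flip units independently: 0 -> 1 with probability fp_rate (false
    positive), 1 -> 0 with probability fn_rate (false negative)."""
    rng = random.Random(seed)
    noisy = []
    for row in mask:
        new_row = []
        for m in row:
            if m == 0 and rng.random() < fp_rate:
                new_row.append(1)
            elif m == 1 and rng.random() < fn_rate:
                new_row.append(0)
            else:
                new_row.append(m)
        noisy.append(new_row)
    return noisy


def error_rates(ideal, estimate):
    """Return (false positive rate, false negative rate) of estimate vs ideal."""
    fp = fn = zeros = ones = 0
    for i_row, e_row in zip(ideal, estimate):
        for i, e in zip(i_row, e_row):
            if i == 0:
                zeros += 1
                fp += e == 1
            else:
                ones += 1
                fn += e == 0
    return (fp / zeros if zeros else 0.0, fn / ones if ones else 0.0)
```

Because each unit is flipped independently, the errors produced here carry no spatial structure; reproducing the clustered condition would need a correlated error model, which the abstract does not specify.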

Similar articles

1
Structure in time-frequency binary masking errors and its impact on speech intelligibility.
J Acoust Soc Am. 2015 Apr;137(4):2025-35. doi: 10.1121/1.4916271.
2
Intelligibility of reverberant noisy speech with ideal binary masking.
J Acoust Soc Am. 2011 Oct;130(4):2153-61. doi: 10.1121/1.3631668.
3
Outcome measures based on classification performance fail to predict the intelligibility of binary-masked speech.
J Acoust Soc Am. 2016 Jun;139(6):3033. doi: 10.1121/1.4952439.
4
Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise.
J Acoust Soc Am. 2011 Apr;129(4):2227-36. doi: 10.1121/1.3559707.
5
Cochlear implant speech intelligibility outcomes with structured and unstructured binary mask errors.
J Acoust Soc Am. 2016 Feb;139(2):800-10. doi: 10.1121/1.4941567.
6
Evaluation of the importance of time-frequency contributions to speech intelligibility in noise.
J Acoust Soc Am. 2014 May;135(5):3007-16. doi: 10.1121/1.4869088.
7
Perceptual effects of noise reduction by time-frequency masking of noisy speech.
J Acoust Soc Am. 2012 Oct;132(4):2690-9. doi: 10.1121/1.4747006.
8
Role of mask pattern in intelligibility of ideal binary-masked noisy speech.
J Acoust Soc Am. 2009 Sep;126(3):1415-26. doi: 10.1121/1.3179673.
9
Speech intelligibility in reverberation with ideal binary masking: effects of early reflections and signal-to-noise ratio threshold.
J Acoust Soc Am. 2013 Mar;133(3):1707-17. doi: 10.1121/1.4789895.
10
Speech intelligibility in background noise with ideal binary time-frequency masking.
J Acoust Soc Am. 2009 Apr;125(4):2336-47. doi: 10.1121/1.3083233.

Cited by

1
Objective intelligibility measurement of reverberant vocoded speech for normal-hearing listeners: Towards facilitating the development of speech enhancement algorithms for cochlear implants.
J Acoust Soc Am. 2024 Mar 1;155(3):2151-2168. doi: 10.1121/10.0025285.
2
Application of a Graphical Model to Investigate the Utility of Cross-channel Information for Mitigating Reverberation in Cochlear Implants.
Proc Int Conf Mach Learn Appl. 2018 Dec;2018:847-852. doi: 10.1109/ICMLA.2018.00136. Epub 2019 Jan 17.
3
A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments.
Trends Hear. 2016 Oct 3;20:2331216516669919. doi: 10.1177/2331216516669919.