Comparing the perceptual contributions of cochlear-scaled entropy and speech level.

Authors

Shu Yilai, Feng Xiao-Xing, Chen Fei

Affiliations

Department of Otolaryngology-Head and Neck Surgery, Eye and ENT Hospital, Shanghai Medical College, Fudan University, Shanghai 200031, China

Shenzhen Micro & Nano Research Institute of IC and System Application, Xili, Nanshan District, Shenzhen 518055, China

Publication information

J Acoust Soc Am. 2016 Dec;140(6):EL517. doi: 10.1121/1.4971879.

Abstract

Cochlear-scaled entropy (CSE) has been suggested to be a reliable predictor of speech intelligibility. Previous studies showed that speech segments with high root-mean-square (RMS) levels (H-levels) contained primarily vowels, which carry important information for speech recognition. The present work compared the contributions of high-CSE (H-entropy) and H-level segments to speech intelligibility. Natural speech was edited to generate two types of noise-replaced stimuli, one preserving the segments with the largest CSEs and the other preserving the same percentage of segments with the highest RMS levels, and the stimuli were played to normal-hearing listeners in a recognition experiment. The results showed that the type of noise-replaced stimulus (H-entropy or H-level) made only a small difference in intelligibility. CSEs and RMS levels showed a moderately high correlation (r = 0.79), suggesting that many speech segments may have both large CSEs and high RMS levels, which might partially account for the small intelligibility difference between the two types of stimuli. In addition, the proportion of vowel duration differed between H-entropy and H-level segments of the same length, suggesting that vowels play different roles in the intelligibility of H-entropy and H-level stimuli.
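The noise-replacement procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the segment length, the noise type, or the noise level, so the non-overlapping segmentation, the Gaussian noise, and the scaling of the noise to the overall RMS of the signal are all assumptions, and the function names are hypothetical. The `scores` argument can hold per-segment RMS levels (as computed here) or CSE values computed elsewhere.

```python
import numpy as np

def segment_rms(signal, seg_len):
    """RMS level of each non-overlapping segment of seg_len samples."""
    n_seg = len(signal) // seg_len
    frames = np.asarray(signal, dtype=float)[: n_seg * seg_len]
    frames = frames.reshape(n_seg, seg_len)
    return np.sqrt(np.mean(frames ** 2, axis=1))

def keep_top_segments(signal, scores, seg_len, keep_fraction, rng):
    """Keep the highest-scoring segments and replace the rest with noise.

    Returns the edited signal and a boolean mask of the kept segments.
    """
    n_seg = len(scores)
    n_keep = int(round(keep_fraction * n_seg))
    keep = np.zeros(n_seg, dtype=bool)
    if n_keep > 0:
        keep[np.argsort(scores)[-n_keep:]] = True

    out = np.asarray(signal, dtype=float)[: n_seg * seg_len].copy()
    # Assumption: the replacement noise is Gaussian, scaled to the overall RMS.
    noise_rms = np.sqrt(np.mean(out ** 2))
    for i in np.flatnonzero(~keep):
        start = i * seg_len
        out[start:start + seg_len] = noise_rms * rng.standard_normal(seg_len)
    return out, keep

# Toy stand-in for a speech waveform: a tone with a rising envelope.
rng = np.random.default_rng(0)
fs = 16000                      # assumed sampling rate in Hz
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 220 * t) * np.linspace(0.1, 1.0, fs)

seg_len = 256                   # assumed segment length (16 ms at 16 kHz)
rms = segment_rms(speech, seg_len)
h_level_stimulus, kept = keep_top_segments(speech, rms, seg_len, 0.5, rng)
```

Passing per-segment CSE values instead of `rms` would produce the corresponding H-entropy stimulus with the same proportion of preserved segments, and `np.corrcoef(cse, rms)[0, 1]` would give the Pearson correlation between the two sets of scores.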
