Gender and vocal production mode discrimination using the high frequencies for speech and singing.

Author information

Monson Brian B, Lotto Andrew J, Story Brad H

Affiliations

Department of Pediatric Newborn Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.

Speech, Language, and Hearing Sciences, University of Arizona, Tucson, AZ, USA.

Publication information

Front Psychol. 2014 Oct 30;5:1239. doi: 10.3389/fpsyg.2014.01239. eCollection 2014.

DOI:10.3389/fpsyg.2014.01239

PMID:25400613

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4214223/

Abstract

Humans routinely produce acoustical energy at frequencies above 6 kHz during vocalization, but this frequency range is often not represented in communication devices and speech perception research. Recent advancements toward high-definition (HD) voice and extended bandwidth hearing aids have increased the interest in the high frequencies. The potential perceptual information provided by high-frequency energy (HFE) is not well characterized. We found that humans can accomplish tasks of gender discrimination and vocal production mode discrimination (speech vs. singing) when presented with acoustic stimuli containing only HFE at both amplified and normal levels. Performance in these tasks was robust in the presence of low-frequency masking noise. No substantial learning effect was observed. Listeners also were able to identify the sung and spoken text (excerpts from "The Star-Spangled Banner") with very few exposures. These results add to the increasing evidence that the high frequencies provide at least redundant information about the vocal signal, suggesting that its representation in communication devices (e.g., cell phones, hearing aids, and cochlear implants) and speech/voice synthesizers could improve these devices and benefit normal-hearing and hearing-impaired listeners.
