Multidimensional Timbre Spaces of Cochlear Implant Vocoded and Non-vocoded Synthetic Female Singing Voices.

Authors

Erickson Molly L, Faulkner Katie, Johnstone Patti M, Hedrick Mark S, Stone Taylor

Affiliation

Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States.

Publication information

Front Neurosci. 2020 Apr 7;14:307. doi: 10.3389/fnins.2020.00307. eCollection 2020.

Abstract

Many post-lingually deafened cochlear implant (CI) users report that they no longer enjoy listening to music, which may contribute to a perceived reduction in quality of life. One aspect of music perception, vocal timbre perception, may be difficult for CI users because they may not be able to use the same timbral cues available to normal hearing listeners. Vocal tract resonance frequencies have been shown to provide perceptual cues to voice categories such as baritone, tenor, mezzo-soprano, and soprano, while changes in glottal source spectral slope are believed to be related to perception of vocal quality dimensions such as [term missing] vs. [term missing]. As a first step toward understanding vocal timbre perception in CI users, we employed an 8-channel noise-band vocoder to test how vocoding can alter the timbral perception of female synthetic sung vowels across pitches. Non-vocoded and vocoded stimuli were synthesized with vibrato using 3 excitation source spectral slopes and 3 vocal tract transfer functions (mezzo-soprano, intermediate, soprano) at the pitches C4, B4, and F5. Six multi-dimensional scaling experiments were conducted: C4 not vocoded, C4 vocoded, B4 not vocoded, B4 vocoded, F5 not vocoded, and F5 vocoded. At the pitch C4, for both non-vocoded and vocoded conditions, dimension 1 grouped stimuli according to voice category and was most strongly predicted by spectral centroid from 0 to 2 kHz. While dimension 2 grouped stimuli according to excitation source spectral slope, it was organized slightly differently and predicted by different acoustic parameters in the non-vocoded and vocoded conditions. For pitches B4 and F5, spectral centroid from 0 to 2 kHz most strongly predicted dimension 1. However, while dimension 1 separated all 3 voice categories in the vocoded condition, it separated only the soprano stimuli from the intermediate and mezzo-soprano stimuli in the non-vocoded condition. While it is unclear how these results predict timbre perception in CI listeners, in general, they suggest that some aspects of vocal timbre may remain.
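
The abstract names spectral centroid from 0 to 2 kHz as the strongest predictor of dimension 1 but does not say how it was computed. The sketch below shows one common definition, the magnitude-weighted mean frequency of the spectrum restricted to that band. The function names, the Hann window, the use of magnitude (rather than power) weights, and the simple harmonic test tone are all illustrative assumptions, not the authors' analysis procedure.

```python
import numpy as np


def band_limited_centroid(signal, sample_rate, f_lo=0.0, f_hi=2000.0):
    """Magnitude-weighted mean frequency (Hz) of the spectrum within [f_lo, f_hi].

    Assumption: magnitude weighting with a Hann window; the study may have
    used a different weighting or analysis window.
    """
    windowed = signal * np.hanning(len(signal))
    magnitudes = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    total = magnitudes[in_band].sum()
    if total == 0.0:
        raise ValueError("no spectral energy in the requested band")
    return float((freqs[in_band] * magnitudes[in_band]).sum() / total)


def harmonic_tone(f0, slope_db_per_octave, sample_rate=16000, duration=1.0):
    """Hypothetical test signal: harmonics of f0 whose level falls off at a
    fixed slope in dB per octave (a crude stand-in for a source spectral slope)."""
    t = np.arange(int(sample_rate * duration)) / sample_rate
    tone = np.zeros_like(t)
    k = 1
    while k * f0 < sample_rate / 2:  # stay below the Nyquist frequency
        gain_db = slope_db_per_octave * np.log2(k)
        tone += 10 ** (gain_db / 20) * np.sin(2 * np.pi * k * f0 * t)
        k += 1
    return tone


if __name__ == "__main__":
    # 262 Hz is roughly C4; the slopes are arbitrary illustration values.
    for slope in (-6.0, -12.0):
        tone = harmonic_tone(262.0, slope)
        print(slope, band_limited_centroid(tone, 16000))
```

Under these assumptions, a steeper spectral slope concentrates energy in the lower harmonics and so yields a lower 0 to 2 kHz centroid, which is the kind of acoustic difference such a measure is meant to capture.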

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f8bb/7179674/826f5e9eab3f/fnins-14-00307-g001.jpg
