A subjective evaluation of different music preprocessing approaches in cochlear implant listeners.

Affiliations

Institute of Communication Acoustics, Ruhr-Universität Bochum, Bochum, Germany.

Department of Otorhinolaryngology, Head and Neck Surgery, St. Elisabeth-Hospital, Ruhr-Universität Bochum, Bochum, Germany.

Publication information

J Acoust Soc Am. 2023 Feb;153(2):1307. doi: 10.1121/10.0017249.

DOI:10.1121/10.0017249

PMID:36859137

Abstract

Cochlear implants (CIs) can partially restore speech perception to relatively high levels in listeners with moderate to profound hearing loss. However, for most CI listeners, the perception and enjoyment of music remain notably poor. Because a number of technical and physiological restrictions of current implant designs cannot be easily overcome, several preprocessing methods for music signals have been proposed recently. They aim to emphasize the leading voice and rhythmic elements and to reduce the spectral complexity of the music signals. In this study, CI listeners evaluated five remixing approaches in comparison to unprocessed signals. To identify potential explanatory factors for CI preference ratings, different signal quality criteria of the processed signals were additionally assessed by normal-hearing listeners. Additional factors were investigated based on instrumental signal-level features. For three preprocessing methods, a significant improvement over the unprocessed reference was found. In particular, two deep neural network-based remix strategies proved to enhance music perception in CI listeners. These strategies provide remixes of the respective harmonic and percussive signal components of the four source stems "vocals," "bass," "drums," and "other accompaniment." Moreover, the results demonstrate that CI listeners prefer an attenuation of sustained components of drum source signals.
