Department of Electrical Engineering, Columbia University, New York, NY, United States of America. Mortimer B Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, United States of America.
J Neural Eng. 2017 Oct;14(5):056001. doi: 10.1088/1741-2552/aa7ab4. Epub 2017 Aug 4.
People who suffer from hearing impairments can find it difficult to follow a conversation in a multi-speaker environment. Current hearing aids can suppress background noise; however, there is little that can be done to help a user attend to a single conversation amongst many without knowing which speaker the user is attending to. Cognitively controlled hearing aids that use auditory attention decoding (AAD) methods are the next step in offering help. Translating the successes in AAD research to real-world applications poses a number of challenges, including the lack of access to the clean sound sources in the environment against which to compare the neural signals. We propose a novel framework that combines single-channel speech separation algorithms with AAD.
We present an end-to-end system that (1) receives a single audio channel containing a mixture of speakers that is heard by a listener along with the listener's neural signals, (2) automatically separates the individual speakers in the mixture, (3) determines the attended speaker, and (4) amplifies the attended speaker's voice to assist the listener.
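Steps (3) and (4) can be illustrated with a minimal sketch. The abstract does not specify the decoding rule, so this assumes a common correlation-based approach: an envelope estimate derived from the neural signals is compared with the envelope of each separated speaker, and the best-matching speaker is boosted before remixing. The function names `pick_attended` and `remix`, the 9 dB gain, and the toy signals are all hypothetical and not taken from the paper. Speech separation and the neural-to-envelope mapping (steps 1 and 2) are assumed to have been done upstream.

```python
import numpy as np


def pick_attended(neural_envelope, speaker_envelopes):
    """Return (index, correlations) for the best-matching speaker.

    Assumption: attention is decoded by Pearson correlation between a
    neural-derived envelope and each separated speaker's envelope.
    """
    corrs = [float(np.corrcoef(neural_envelope, env)[0, 1])
             for env in speaker_envelopes]
    return int(np.argmax(corrs)), corrs


def remix(separated, attended, gain_db=9.0):
    """Sum the separated streams, boosting the attended one by gain_db.

    The 9 dB default is an illustrative value, not one from the paper.
    """
    gain = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude
    out = np.zeros(len(separated[0]), dtype=float)
    for i, stream in enumerate(separated):
        out += gain * stream if i == attended else stream
    peak = np.max(np.abs(out))
    return out / peak if peak > 0 else out  # peak-normalize to avoid clipping


# Toy data standing in for two separated speakers and a noisy neural estimate.
rng = np.random.default_rng(0)
speaker_a = rng.standard_normal(1000)
speaker_b = rng.standard_normal(1000)
# Crude envelope proxy (absolute value); a real system would use a proper
# envelope or spectrogram representation.
envelopes = [np.abs(speaker_a), np.abs(speaker_b)]
# Hypothetical neural estimate: speaker A's envelope plus noise.
neural = np.abs(speaker_a) + 0.5 * rng.standard_normal(1000)

attended, corrs = pick_attended(neural, envelopes)
enhanced = remix([speaker_a, speaker_b], attended)
```

In this sketch the decision is made once over the whole signal; a practical system would presumably repeat it over short sliding windows so the output can follow shifts in attention.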
Using invasive electrophysiology recordings, we identified the regions of the auditory cortex that contribute to AAD. Given appropriate electrode locations, our system is able to decode the attention of subjects and amplify the attended speaker using only the mixed audio. Our quality assessment of the modified audio demonstrates a significant improvement in both subjective and objective speech quality measures.
Our novel framework for AAD bridges the gap between the most recent advancements in speech processing technologies and speech prosthesis research and moves us closer to the development of cognitively controlled hearable devices for the hearing impaired.