IEEE J Biomed Health Inform. 2021 Oct;25(10):3955-3966. doi: 10.1109/JBHI.2021.3075631. Epub 2021 Oct 5.
When multiple speakers talk simultaneously, a hearing device cannot identify which of these speakers the listener intends to attend to. Auditory attention decoding (AAD) algorithms can provide this information by, for example, reconstructing the attended speech envelope from electroencephalography (EEG) signals. However, these stimulus reconstruction decoders are traditionally trained in a supervised manner, requiring a dedicated training stage during which the attended speaker is known. Pre-trained subject-independent decoders remove the need for such a per-user training stage but perform substantially worse than supervised subject-specific decoders that are tailored to the user. This motivates the development of a new unsupervised self-adapting training/updating procedure for a subject-specific decoder, which iteratively improves itself on unlabeled EEG data using its own predicted labels. This iterative updating procedure enables a self-leveraging effect, for which we provide a mathematical analysis that reveals the underlying mechanics. The proposed unsupervised algorithm, starting from a random decoder, results in a decoder that outperforms a supervised subject-independent decoder. Starting from a subject-independent decoder, the unsupervised algorithm even closely approximates the performance of a supervised subject-specific decoder. The developed unsupervised AAD algorithm thus combines the advantages of a supervised subject-specific decoder and a subject-independent decoder: it approximates the performance of the former while retaining the 'plug-and-play' character of the latter. As the proposed algorithm can be used to adapt automatically to new users, as well as over time as new EEG data are recorded, it contributes to more practical neuro-steered hearing devices.
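The iterative updating procedure described in the abstract (predict attention labels with the current decoder, retrain on those predicted labels, repeat) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a plain ridge-regularised least-squares decoder without time lags, two competing speakers, and toy Gaussian "envelopes"; the function names (`train_decoder`, `predict_labels`, `self_adapt`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def train_decoder(eeg, envelopes, ridge=1e-2):
    """Ridge-regularised least-squares decoder mapping EEG (T x C) to an envelope (T,)."""
    n_channels = eeg[0].shape[1]
    rxx = sum(x.T @ x for x in eeg)                     # EEG autocorrelation matrix
    rxs = sum(x.T @ s for x, s in zip(eeg, envelopes))  # EEG-envelope cross-correlation
    reg = ridge * np.trace(rxx) / n_channels            # assumed regularisation scale
    return np.linalg.solve(rxx + reg * np.eye(n_channels), rxs)


def predict_labels(decoder, eeg, env_a, env_b):
    """Label each segment 0 (speaker A) or 1 (speaker B) by envelope correlation."""
    labels = []
    for x, a, b in zip(eeg, env_a, env_b):
        rec = x @ decoder                               # reconstructed envelope
        corr_a = np.corrcoef(rec, a)[0, 1]
        corr_b = np.corrcoef(rec, b)[0, 1]
        labels.append(0 if corr_a >= corr_b else 1)
    return np.array(labels)


def self_adapt(eeg, env_a, env_b, n_iter=10, seed=0):
    """Unsupervised loop: retrain the decoder on its own predicted labels."""
    rng = np.random.default_rng(seed)
    decoder = rng.standard_normal(eeg[0].shape[1])      # start from a random decoder
    for _ in range(n_iter):
        labels = predict_labels(decoder, eeg, env_a, env_b)
        chosen = [a if lab == 0 else b
                  for a, b, lab in zip(env_a, env_b, labels)]
        decoder = train_decoder(eeg, chosen)
    return decoder


def simulate(n_segments=100, n_samples=250, n_channels=16, seed=1):
    """Toy data (assumption): EEG = attended envelope times a mixing vector, plus noise."""
    rng = np.random.default_rng(seed)
    mixing = rng.standard_normal(n_channels)
    mixing *= 0.35 / np.linalg.norm(mixing)             # low signal-to-noise ratio
    truth = rng.integers(0, 2, n_segments)              # hidden attended speaker
    env_a = [rng.standard_normal(n_samples) for _ in range(n_segments)]
    env_b = [rng.standard_normal(n_samples) for _ in range(n_segments)]
    eeg = []
    for a, b, lab in zip(env_a, env_b, truth):
        attended = a if lab == 0 else b
        noise = rng.standard_normal((n_samples, n_channels))
        eeg.append(np.outer(attended, mixing) + noise)
    return eeg, env_a, env_b, truth


eeg, env_a, env_b, truth = simulate()
decoder = self_adapt(eeg, env_a, env_b)
accuracy = np.mean(predict_labels(decoder, eeg, env_a, env_b) == truth)
print(f"accuracy after self-adaptation: {accuracy:.2f}")
```

In this toy setting, correctly labelled segments add a consistent contribution to the cross-correlation vector while mislabelled segments contribute only noise, so even a partly wrong labelling tends to pull the decoder toward the attended envelope. That gives an intuition for the self-leveraging effect, not a substitute for the mathematical analysis the abstract refers to. Replacing the random initial decoder with a pre-trained subject-independent one would correspond to the second starting point mentioned in the abstract.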