Department of Biology, University of Washington, Seattle, WA 98195, United States of America.
eScience Institute, University of Washington, Seattle, WA 98195, United States of America.
J Neural Eng. 2022 Aug 10;19(4). doi: 10.1088/1741-2552/ac857c.
Recent advances in neural decoding have accelerated the development of brain-computer interfaces aimed at assisting users with everyday tasks such as speaking, walking, and manipulating objects. However, current approaches for training neural decoders commonly require large quantities of labeled data, which can be laborious or infeasible to obtain in real-world settings. Alternatively, self-supervised models that share self-generated pseudo-labels between two data streams have shown exceptional performance on unlabeled audio and video data, but it remains unclear how well they extend to neural decoding.

We learn neural decoders without labels by leveraging multiple simultaneously recorded data streams, including neural, kinematic, and physiological signals. Specifically, we apply cross-modal, self-supervised deep clustering to train decoders that can classify movements from brain recordings. After training, we isolate the decoders for each input data stream and compare the accuracy of decoders trained using cross-modal deep clustering against supervised and unimodal, self-supervised models.

We find that sharing pseudo-labels between two data streams during training substantially increases decoding performance compared to unimodal, self-supervised models, with accuracies approaching those of supervised decoders trained on labeled data. Next, we extend cross-modal decoder training to three or more modalities, achieving state-of-the-art neural decoding accuracy that matches or slightly exceeds the performance of supervised models.

We demonstrate that cross-modal, self-supervised decoding can train neural decoders when few or no labels are available, and we extend the cross-modal framework to share information among three or more data streams, further improving self-supervised training.
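To make the training scheme concrete, the following is a minimal sketch of the pseudo-label swapping at the core of cross-modal deep clustering, written in Python with PyTorch and scikit-learn. The encoder architectures, cluster count, feature dimensions, and synthetic two-modality data (neural and kinematic) are illustrative assumptions, not the authors' actual implementation.

# Minimal sketch of cross-modal deep clustering via pseudo-label swapping.
# All sizes and the synthetic data below are illustrative assumptions.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

N_CLUSTERS = 4             # assumed number of movement classes / pseudo-clusters
D_NEURAL, D_KIN = 64, 12   # assumed feature dimensions per modality

def make_decoder(d_in):
    # small MLP encoder with a linear head over the pseudo-label classes
    return nn.Sequential(nn.Linear(d_in, 32), nn.ReLU(), nn.Linear(32, N_CLUSTERS))

neural_net = make_decoder(D_NEURAL)
kin_net = make_decoder(D_KIN)
opt = torch.optim.Adam(
    list(neural_net.parameters()) + list(kin_net.parameters()), lr=1e-3
)

def pseudo_labels(features):
    # cluster one stream's current embeddings to generate pseudo-labels
    labels = KMeans(n_clusters=N_CLUSTERS, n_init=10).fit_predict(features)
    return torch.as_tensor(labels, dtype=torch.long)

# synthetic stand-ins for simultaneously recorded, unlabeled data streams
neural = torch.randn(256, D_NEURAL)
kinematic = torch.randn(256, D_KIN)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    # cluster each stream's embeddings...
    with torch.no_grad():
        y_from_neural = pseudo_labels(neural_net(neural).numpy())
        y_from_kin = pseudo_labels(kin_net(kinematic).numpy())
    # ...and swap: each decoder is trained on the *other* stream's pseudo-labels
    opt.zero_grad()
    loss = loss_fn(neural_net(neural), y_from_kin) \
         + loss_fn(kin_net(kinematic), y_from_neural)
    loss.backward()
    opt.step()

# after training, the neural decoder is isolated and used on its own
preds = neural_net(neural).argmax(dim=1)

The swap is the key design choice: because the pseudo-labels supervising each decoder come from a different modality, each stream is pushed toward structure that is shared across modalities (the movements themselves) rather than artifacts of its own signal. Extending to three or more modalities would combine or rotate pseudo-labels across the additional streams.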