He Tianyu, Wei Mingyi, Wang Ruicong, Wang Renzhi, Du Shiwei, Cai Siqi, Tao Wei, Li Haizhou
School of Data Science, Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, 518172, P. R. China.
Department of Neurosurgery, South China Hospital, Medical School, Shenzhen University, Shenzhen, 518116, P. R. China.
Sci Data. 2025 Apr 19;12(1):657. doi: 10.1038/s41597-025-04741-2.
Speech brain-computer interfaces (BCIs) based on implanted electrodes hold significant promise for enhancing spoken communication, owing to the high temporal resolution of invasive neural sensing. Despite this potential, such data are difficult to acquire because of the invasive recording procedure, and publicly available datasets, particularly for tonal languages, remain scarce. In this study, we introduce VocalMind, a stereotactic electroencephalography (sEEG) dataset focused on Mandarin Chinese, a tonal language. The dataset comprises sEEG-speech parallel recordings from three distinct speech modes, namely vocalized, mimed, and imagined speech, at both the word and sentence levels, totaling over one hour of intracranial neural recordings related to speech production. This paper also presents a baseline model that serves as a reference for future studies while verifying the integrity of the dataset. The diversity of tasks and the substantial data volume make VocalMind a valuable resource for developing advanced speech-decoding algorithms, thereby advancing BCI research for spoken communication.
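The abstract organizes the recordings along two axes: speech mode (vocalized, mimed, imagined) and linguistic level (word, sentence). A minimal sketch of how such parallel sEEG-speech trials might be represented and filtered is shown below; the `Trial` layout, field names, and `filter_trials` helper are hypothetical illustrations, not the actual VocalMind file format.

```python
from dataclasses import dataclass
from enum import Enum


class SpeechMode(Enum):
    """The three speech modes described in the abstract."""
    VOCALIZED = "vocalized"
    MIMED = "mimed"
    IMAGINED = "imagined"


@dataclass
class Trial:
    """One sEEG-speech parallel trial (hypothetical layout, not the released format)."""
    mode: SpeechMode
    level: str                # "word" or "sentence"
    seeg: list[list[float]]   # channels x samples of intracranial signal
    audio: list[float]        # time-aligned speech waveform (empty for imagined speech)
    transcript: str           # Mandarin text prompt


def filter_trials(trials: list[Trial], mode: SpeechMode, level: str) -> list[Trial]:
    """Select trials matching one speech mode and one linguistic level."""
    return [t for t in trials if t.mode == mode and t.level == level]


# Toy example with two trials, one per (mode, level) combination of interest.
trials = [
    Trial(SpeechMode.VOCALIZED, "word", [[0.1, 0.2]], [0.0, 0.1], "你好"),
    Trial(SpeechMode.IMAGINED, "sentence", [[0.3, 0.4]], [], "今天天气很好"),
]
print(len(filter_trials(trials, SpeechMode.IMAGINED, "sentence")))  # 1
```

Grouping trials this way mirrors the decoding setups the dataset targets: a baseline model can be trained per mode (e.g. vocalized only) or across modes to study transfer to imagined speech.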