Automatic Speech Recognition from Neural Signals: A Focused Review.

Authors

Christian Herff, Tanja Schultz

Affiliation

Cognitive Systems Lab, Department for Mathematics and Computer Science, University of Bremen, Bremen, Germany.

Publication

Front Neurosci. 2016 Sep 27;10:429. doi: 10.3389/fnins.2016.00429. eCollection 2016.

DOI:10.3389/fnins.2016.00429
PMID:27729844
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5037201/
Abstract

Speech interfaces have become widely accepted and are now integrated into a variety of real-life applications and devices; they have become part of our daily lives. However, speech interfaces presume the ability to produce intelligible speech, which may be impossible in loud environments, when speaking would disturb bystanders, or when a person is unable to produce speech (e.g., patients suffering from locked-in syndrome). For these reasons, it would be highly desirable not to speak but simply to imagine saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition (ASR) technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for ASR from neural signals because of their low temporal resolution, but are very useful for investigating the neural mechanisms underlying speech processes. In contrast, electrophysiological activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data, with a focus on invasively measured brain activity (electrocorticography). As a first example of ASR techniques applied to neural signals, we discuss the Brain-to-text system.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a1d/5037201/c11f0540dd57/fnins-10-00429-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a1d/5037201/665f4270b2df/fnins-10-00429-g0001.jpg
