Improved evaluation of waveform reconstruction in speech decoding based on invasive brain-computer interfaces.

Authors

Wu Xiaolong, Hu Kejia, Fu Zhichun, Zhang Dingguo

Affiliations

Department of Electronic and Electrical Engineering, University of Bath, Bath, United Kingdom.

Department of Neurosurgery, Center for Functional Neurosurgery, Ruijin Hospital Affiliated with Shanghai Jiao Tong University School of Medicine, Shanghai, China.

Publication information

Imaging Neurosci (Camb). 2025 Sep 10;3. doi: 10.1162/IMAG.a.146. eCollection 2025.

DOI:10.1162/IMAG.a.146
PMID:40959704
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12434379/
Abstract

Brain-computer interfaces (BCIs) that reconstruct speech waveforms from neural signals are a promising communication technology. However, the field lacks a standardized evaluation metric, which makes it difficult to compare results across studies. Existing objective metrics, such as the correlation coefficient (CC) and mel cepstral distortion (MCD), are often used inconsistently and have intrinsic limitations. This study addresses the need for a robust, validated method for evaluating the quality of reconstructed waveforms. We review the literature on waveform reconstruction from intracranial signals and describe the problems with current evaluation methods. We collated reconstructed audio from 10 published speech BCI studies and collected Mean Opinion Scores (MOS) from human raters to serve as a perceptual ground truth. We then systematically evaluated how well combinations of two existing objective metrics, short-time objective intelligibility (STOI) and MCD, predict these MOS ratings. To ensure robustness and generalizability, we used a leave-one-dataset-out cross-validation scheme and compared multiple models, including linear and non-linear regressors. This work identifies, for the first time, that the lack of a standard evaluation method prevents cross-study comparison. Using 10 public datasets, our analysis shows that a non-linear model, specifically a Random Forest regressor, provides the most accurate and reliable prediction of subjective MOS ratings (R² = 0.892). We propose this cross-validated Random Forest model, which maps STOI and MCD to a predicted MOS, as a standardized objective evaluation metric for the speech BCI field. In accuracy and robustness of validation, it outperforms the available methods. It can also give the community a reliable tool to benchmark performance, enable meaningful cross-study comparisons for the first time, and accelerate progress in speech neuroprosthetics.
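
The evaluation scheme described in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes scikit-learn and NumPy are available, uses synthetic STOI, MCD, and MOS values (not the paper's data), and the toy relation used to generate MOS labels, the clip counts, and the forest size are all placeholder assumptions. It shows only how a Random Forest mapping (STOI, MCD) to MOS might be scored with leave-one-dataset-out cross-validation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict

# Synthetic placeholder data: NOT the paper's measurements.
rng = np.random.default_rng(0)
n_datasets, clips_per_dataset = 10, 12   # assumed sizes, for illustration only
n = n_datasets * clips_per_dataset

stoi = rng.uniform(0.2, 0.9, size=n)     # STOI: higher means more intelligible
mcd = rng.uniform(3.0, 9.0, size=n)      # MCD in dB: lower means less distortion

# Hypothetical non-linear relation, used only to generate toy MOS labels in [1, 5].
noise = rng.normal(0.0, 0.2, size=n)
mos = np.clip(1.0 + 4.0 * stoi**2 - 0.15 * (mcd - 3.0) + noise, 1.0, 5.0)

X = np.column_stack([stoi, mcd])
# One group label per clip: the dataset (study) the clip came from.
groups = np.repeat(np.arange(n_datasets), clips_per_dataset)

model = RandomForestRegressor(n_estimators=200, random_state=0)
logo = LeaveOneGroupOut()

# Each clip is predicted by a model trained on the other nine datasets only.
pred = cross_val_predict(model, X, mos, groups=groups, cv=logo)

print(f"held-out R^2 on toy data: {r2_score(mos, pred):.3f}")
```

Grouping the folds by dataset, rather than shuffling clips, keeps clips from the same study out of both the training and test sides of a fold, so the score reflects how the mapping transfers to an unseen study.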

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d85/12434379/8cce0d3e12cc/IMAG.a.146_fig8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d85/12434379/64beab869111/IMAG.a.146_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d85/12434379/5e6e1474d617/IMAG.a.146_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8d85/12434379/0fb683965e14/IMAG.a.146_fig6.jpg
