
Data-driven machine learning models for decoding speech categorization from evoked brain responses.

Affiliations

Department of Electrical and Computer Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States of America.

Institute for Intelligent Systems, University of Memphis, Memphis, TN, United States of America.

Publication Information

J Neural Eng. 2021 Mar 23;18(4). doi: 10.1088/1741-2552/abecf0.

Abstract

Categorical perception (CP) of audio is critical to understanding how the human brain perceives speech sounds despite widespread variability in acoustic properties. Here, we investigated the spatiotemporal characteristics of auditory neural activity that reflects CP for speech (i.e. differentiates phonetic prototypes from ambiguous speech sounds). We recorded 64-channel electroencephalograms as listeners rapidly classified vowel sounds along an acoustic-phonetic continuum. We used support vector machine (SVM) classifiers and stability selection to determine when and where in the brain CP was best decoded across space and time via source-level analysis of the event-related potentials. We found that early (120 ms) whole-brain data decoded speech categories (i.e. prototypical vs. ambiguous tokens) with 95.16% accuracy (area under the curve 95.14%; F1-score 95.00%). Separate analyses of left hemisphere (LH) and right hemisphere (RH) responses showed that LH decoding was more accurate and earlier than RH (89.03% vs. 86.45% accuracy; 140 ms vs. 200 ms). Stability (feature) selection identified 13 regions of interest (ROIs) out of 68 brain regions [including auditory cortex, supramarginal gyrus, and inferior frontal gyrus (IFG)] that showed categorical representation during stimulus encoding (0-260 ms). In contrast, 15 ROIs (including fronto-parietal regions, IFG, and motor cortex) were necessary to describe later decision stages (300-800 ms) of categorization, and these areas were highly associated with the strength of listeners' categorical hearing (i.e. the slope of behavioral identification functions). Our data-driven multivariate models demonstrate that abstract categories emerge surprisingly early (∼120 ms) in the time course of speech processing and are dominated by engagement of a relatively compact fronto-temporal-parietal brain network.
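The decoding pipeline summarized above (time-resolved SVM classification of source-level ERPs, followed by stability selection to identify which ROIs carry categorical information) can be illustrated with a minimal Python sketch. This is not the authors' code: the data shapes, time windows, subsample count, and selection threshold below are hypothetical placeholders standing in for the study's 68-ROI source data.

```python
# Minimal sketch of time-resolved SVM decoding + stability selection.
# Assumed (hypothetical) data layout:
#   X_time: (n_trials, n_rois, n_times) source-level ERP amplitudes
#   y:      binary labels (1 = prototypical token, 0 = ambiguous token)
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_trials, n_rois, n_times = 200, 68, 100          # placeholder sizes
X_time = rng.standard_normal((n_trials, n_rois, n_times))
y = rng.integers(0, 2, n_trials)

# 1) Decode category at each time point to find when information emerges.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
acc = np.array([
    cross_val_score(clf, X_time[:, :, t], y, cv=cv).mean()
    for t in range(n_times)
])
t_peak = int(acc.argmax())
print(f"peak decoding accuracy {acc[t_peak]:.2%} at time index {t_peak}")

# 2) Stability selection over ROIs: refit a sparse (L1) model on random
#    half-subsamples and keep ROIs selected in most resamples.
X_enc = X_time[:, :, :26].mean(axis=2)            # e.g. an early encoding window
sel_counts = np.zeros(n_rois)
n_resamples = 100
for _ in range(n_resamples):
    idx = rng.choice(n_trials, size=n_trials // 2, replace=False)
    sparse = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    sparse.fit(X_enc[idx], y[idx])
    sel_counts += (np.abs(sparse.coef_[0]) > 1e-8)

stable_rois = np.where(sel_counts / n_resamples >= 0.75)[0]
print("stable ROIs:", stable_rois)
```

The design rationale for the second step: stability selection keeps only features that survive repeated subsampling, which controls false discoveries better than reading raw classifier weights and is how a compact set of ROIs (13 for encoding, 15 for decision stages in the paper) can be reported.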
