Exploring the relationship between features calculated from contextual embeddings and EEG band power during sentence reading in Chinese.

Author information

Wang Yao, Xue Tiantian, Yang Xingyu

Affiliations

Cognitive Science and Allied Health School, Beijing Language and Culture University, Beijing, China.

Institute of Life and Health Sciences, Beijing Language and Culture University, Beijing, China.

Publication information

Front Neurosci. 2025 Jul 30;19:1656519. doi: 10.3389/fnins.2025.1656519. eCollection 2025.

DOI:10.3389/fnins.2025.1656519

PMID:40809397

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12343585/

Abstract

INTRODUCTION

Contextual embeddings, a core component of large language models (LLMs) that generate dynamic vector representations capturing words' semantic properties, have demonstrated structural similarities to brain activity patterns at the single-word level. This alignment supports the theoretical framework proposing vector-based neural coding for natural language processing in the brain, in which linguistic units may be represented as context-sensitive vectors analogous to LLM-derived embeddings. Building on this framework, we hypothesize that cumulative distance metrics between contextual embeddings of adjacent linguistic units (words or Chinese characters) in sentence contexts may quantitatively reflect neural activation intensity during reading comprehension.
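To make the hypothesized feature concrete, the following minimal Python sketch sums a distance over each pair of neighbouring embeddings in a sentence. It is an illustration only: the abstract does not say which distance metrics were used, so cosine distance is an assumption here, and the function names (`cosine_distance`, `cumulative_adjacent_distance`) and the toy two-dimensional vectors are hypothetical rather than taken from the paper.

```python
import math


def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v) between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def cumulative_adjacent_distance(embeddings, distance=cosine_distance):
    """Sum the distance between each pair of neighbouring embeddings.

    `embeddings` holds one vector per linguistic unit (word or character),
    in reading order. A sentence with fewer than two units yields 0.
    """
    return sum(distance(a, b) for a, b in zip(embeddings, embeddings[1:]))


# Toy 2-D "embeddings" for a three-unit sentence (made-up values,
# not output from any real language model).
toy_sentence = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
total = cumulative_adjacent_distance(toy_sentence)
```

In this toy case the first pair is orthogonal and the second pair points in the same direction, so only the first transition contributes to the sum. Passing a different `distance` callable (for example a Euclidean distance) would give another member of the same family of features.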

METHODS

Using large-scale EEG datasets collected during reading tasks, we systematically investigated the relationship between these computationally derived distance features and frequency-specific band power measures associated with neural activity.
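As a rough illustration of what a frequency-specific band power measure is, the sketch below estimates the power of a single-channel signal inside a frequency band from a plain DFT periodogram. The abstract does not describe the spectral estimator or the band limits the authors used, so both are assumptions here: the function `band_power` is hypothetical, the 30-45 Hz "gamma" and 8-13 Hz "alpha" ranges are only common conventions, and the input is a synthetic sine wave rather than EEG.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Mean-square power of `signal` within [f_lo, f_hi] Hz.

    Uses a direct O(n^2) DFT, which is fine for short illustrative
    signals; real pipelines typically use an FFT-based estimator.
    """
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the DC offset
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n  # centre frequency of DFT bin k
        if f_lo <= freq <= f_hi:
            coeff = sum(
                x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
            )
            # Count the mirrored negative-frequency bin as well,
            # except at the Nyquist bin, which has no mirror.
            scale = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
            power += scale * abs(coeff) ** 2 / n ** 2
    return power


# Synthetic 1-second signal: a 40 Hz sine of amplitude 2 sampled at 128 Hz.
fs = 128
signal = [2.0 * math.sin(2 * math.pi * 40 * t / fs) for t in range(fs)]
gamma = band_power(signal, fs, 30.0, 45.0)  # assumed gamma range
alpha = band_power(signal, fs, 8.0, 13.0)   # assumed alpha range
```

Because the sine sits exactly on a DFT bin inside the assumed gamma range, essentially all of its mean-square power (amplitude squared over two) lands in `gamma`, while `alpha` stays near zero.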

RESULTS

Gamma-band power was associated with various NLP features in the ChineseEEG dataset, whereas no comparable gamma-specific effects were observed in the ZuCo1.0 dataset. Additionally, significant effects were found in other frequency bands for both datasets.

DISCUSSION

The mixed yet intriguing results invite a deeper discussion of the directional associations (positive/negative) observed in the gamma and other frequency bands, their cognitive implications, and the potential influence of textual characteristics on these findings. Although the observed effects may be somewhat text- or dataset-dependent, our analyses revealed associations between various distance metrics and neural responses, consistent with predictions derived from the vector-based neural coding framework.
