Jaffe Paul I, Santiago-Reyes Gustavo X, Schafer Robert J, Bissett Patrick G, Poldrack Russell A
Department of Psychology, Stanford University, Stanford, United States.
Department of Bioengineering, Stanford University, Stanford, United States.
Elife. 2025 Feb 28;13:RP98351. doi: 10.7554/eLife.98351.
Evidence accumulation models (EAMs) are the dominant framework for modeling response time (RT) data from speeded decision-making tasks. While providing a good quantitative description of RT data in terms of abstract perceptual representations, EAMs do not explain how the visual system extracts these representations in the first place. To address this limitation, we introduce the visual accumulator model (VAM), in which convolutional neural network models of visual processing and traditional EAMs are jointly fitted to trial-level RTs and raw (pixel-space) visual stimuli from individual subjects in a unified Bayesian framework. Models fitted to large-scale cognitive training data from a stylized flanker task captured individual differences in congruency effects, RTs, and accuracy. We find evidence that the selection of task-relevant information occurs through the orthogonalization of relevant and irrelevant representations, demonstrating how our framework can be used to relate visual representations to behavioral outputs. Together, our work provides a probabilistic framework for both constraining neural network models of vision with behavioral data and studying how the visual system extracts representations that guide decisions.
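The sketch below is a minimal, illustrative rendering of the VAM idea described above, not the authors' implementation: a small CNN maps raw (pixel-space) stimuli to per-choice drift rates, which parameterize a race of Wald (inverse-Gaussian) accumulators whose first-passage density yields a joint likelihood for the observed choice and RT. The layer sizes, parameter names, and the use of a Wald race in place of the paper's linear ballistic accumulator and hierarchical Bayesian fitting are simplifying assumptions.

```python
# Illustrative sketch only (assumptions noted above); not the published VAM code.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

STD_NORMAL = torch.distributions.Normal(0.0, 1.0)

def wald_logpdf(t, drift, thresh):
    """Log first-passage-time density of one accumulator (Wald/inverse Gaussian)."""
    return (torch.log(thresh) - 0.5 * math.log(2.0 * math.pi) - 1.5 * torch.log(t)
            - (thresh - drift * t) ** 2 / (2.0 * t))

def wald_logsurv(t, drift, thresh):
    """Log survival function: probability the accumulator has not yet hit threshold."""
    z1 = (thresh - drift * t) / torch.sqrt(t)
    z2 = -(thresh + drift * t) / torch.sqrt(t)
    surv = STD_NORMAL.cdf(z1) - torch.exp(2.0 * drift * thresh) * STD_NORMAL.cdf(z2)
    return torch.log(torch.clamp(surv, min=1e-12))

class VisualAccumulatorSketch(nn.Module):
    def __init__(self, n_choices=4):
        super().__init__()
        # Tiny CNN front end standing in for the visual-processing model.
        self.conv1 = nn.Conv2d(1, 16, 3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, stride=2, padding=1)
        self.head = nn.Linear(32, n_choices)              # one drift rate per response
        self.log_thresh = nn.Parameter(torch.zeros(1))    # shared decision threshold
        self.log_ndt = nn.Parameter(torch.tensor(-1.5))   # non-decision time (s)

    def forward(self, images):
        h = F.relu(self.conv1(images))
        h = F.relu(self.conv2(h))
        h = h.mean(dim=(2, 3))                 # global average pool
        return F.softplus(self.head(h))        # positive drift rates

    def neg_log_likelihood(self, images, choices, rts):
        drifts = self.forward(images)                               # (batch, n_choices)
        thresh = F.softplus(self.log_thresh)
        t = torch.clamp(rts - F.softplus(self.log_ndt), min=1e-3).unsqueeze(1)
        # Race model: the chosen accumulator hits threshold at t; the others survive past t.
        logpdf = wald_logpdf(t, drifts, thresh)
        logsurv = wald_logsurv(t, drifts, thresh)
        winner = F.one_hot(choices, drifts.shape[1]).bool()
        ll = torch.where(winner, logpdf, logsurv).sum(dim=1)
        return -ll.mean()

# Usage: one gradient step jointly updates the CNN and the accumulator parameters,
# so the visual front end is constrained by trial-level behavior (choice + RT).
model = VisualAccumulatorSketch()
images = torch.rand(8, 1, 32, 32)          # toy pixel-space stimuli
choices = torch.randint(0, 4, (8,))        # toy response choices
rts = 0.4 + 0.6 * torch.rand(8)            # toy RTs in seconds
loss = model.neg_log_likelihood(images, choices, rts)
loss.backward()
```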