Alipour Abolfazl, Beggs John M, Brown Joshua W, James Thomas W
Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN USA.
Program in Neuroscience, Indiana University, Bloomington, IN USA.
Cogn Neurodyn. 2022 Feb;16(1):149-165. doi: 10.1007/s11571-021-09703-z. Epub 2021 Aug 10.
The two visual streams hypothesis is a robust example of neural functional specialization that has inspired countless studies over the past four decades. According to one prominent version of the theory, the fundamental goal of the dorsal visual pathway is the transformation of retinal information for visually-guided motor behavior. To that end, the dorsal stream processes input using absolute (or veridical) metrics only when the movement is initiated, necessitating very little, or no, memory. Conversely, because the ventral visual pathway does not involve motor behavior (its output does not influence the real world), the ventral stream processes input using relative (or illusory) metrics and can accumulate or integrate sensory evidence over long time constants, which provides a substantial capacity for memory. In this study, we tested these relations between functional specialization, processing metrics, and memory by training identical recurrent neural networks to perform either a viewpoint-invariant object classification task or an orientation/size determination task. The former task relies on relative metrics, benefits from accumulating sensory evidence, and is usually attributed to the ventral stream. The latter task relies on absolute metrics, can be computed accurately in the moment, and is usually attributed to the dorsal stream. To quantify the amount of memory required for each task, we chose two types of neural network models. Using a long short-term memory (LSTM) recurrent network, we found that viewpoint-invariant object classification (object task) required a longer memory than orientation/size determination (orientation task). Additionally, to dissect this memory effect, we considered factors that contributed to longer memory in object tasks. First, we used two different sets of objects, one with self-occlusion of features and one without. Second, we defined object classes either strictly by visual feature similarity or (more liberally) by semantic label. The models required greater memory when features were self-occluded and when object classes were defined by visual feature similarity, showing that self-occlusion and visual similarity among object task samples both contribute to the need for longer memory. The same set of tasks modeled using modified leaky-integrator echo state recurrent networks (LiESN), however, did not replicate the results, except under some conditions. This may be because LiESNs cannot perform fine-grained memory adjustments due to their network-wide memory coefficient and fixed recurrent weights. In sum, the LSTM simulations suggest that longer memory is advantageous for performing viewpoint-invariant object classification (a putative ventral stream function) because it allows for interpolation of features across viewpoints. The results further suggest that orientation/size determination (a putative dorsal stream function) does not benefit from longer memory. These findings are consistent with the two visual streams theory of functional specialization.
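The point about a network-wide memory coefficient and fixed recurrent weights can be illustrated with a minimal sketch of a textbook leaky-integrator echo state update, x(t+1) = (1 - a) x(t) + a tanh(W_in u(t+1) + W x(t)). This is not the authors' modified LiESN, whose details are not given in the abstract. The function names, reservoir size, weight scaling, and leak values below are all assumptions chosen for illustration. The single scalar `leak` plays the role of the memory coefficient shared by every unit, and the recurrent matrix `W` is drawn once and never trained.

```python
import math
import random


def make_reservoir(n_units, seed=0):
    """Draw fixed (untrained) recurrent and input weights. Hypothetical helper."""
    rng = random.Random(seed)
    # Assumption: entries are scaled so each row's absolute sum is at most 0.5,
    # which keeps the recurrent map contractive.
    scale = 0.5 / n_units
    W = [[rng.uniform(-scale, scale) for _ in range(n_units)]
         for _ in range(n_units)]
    W_in = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
    return W, W_in


def liesn_step(x, u, W, W_in, leak):
    """One leaky-integrator update; `leak` is a single scalar for all units."""
    new_x = []
    for i in range(len(x)):
        pre = W_in[i] * u + sum(W[i][j] * x[j] for j in range(len(x)))
        new_x.append((1.0 - leak) * x[i] + leak * math.tanh(pre))
    return new_x


def retained_fraction(leak, n_units=20, steps=10, seed=0):
    """Fraction of the state's peak magnitude left `steps` after one input pulse."""
    W, W_in = make_reservoir(n_units, seed)
    x = liesn_step([0.0] * n_units, 1.0, W, W_in, leak)  # single input pulse
    start = max(abs(v) for v in x)
    for _ in range(steps):
        x = liesn_step(x, 0.0, W, W_in, leak)  # no further input
    return max(abs(v) for v in x) / start


if __name__ == "__main__":
    for a in (0.1, 0.5, 0.9):
        print(a, retained_fraction(a))
```

In this sketch a smaller `leak` makes every unit hold on to the input pulse for longer, and a larger one makes every unit forget faster. Because the coefficient is shared and `W` is fixed, the network cannot give different units different time constants, whereas an LSTM learns its gates per unit and per input.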
The online version contains supplementary material available at 10.1007/s11571-021-09703-z.