Rajalingham Rishi, Schmidt Kailyn, DiCarlo James J
Department of Brain and Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139.
J Neurosci. 2015 Sep 2;35(35):12127-36. doi: 10.1523/JNEUROSCI.0573-15.2015.
Although the rhesus monkey is used widely as an animal model of human visual processing, it is not known whether invariant visual object recognition behavior is quantitatively comparable across monkeys and humans. To address this question, we systematically compared the core object recognition behavior of two monkeys with that of human subjects. To test true object recognition behavior (rather than image matching), we generated several thousand naturalistic synthetic images of 24 basic-level objects with high variation in viewing parameters and image background. Monkeys were trained to perform binary object recognition tasks in a match-to-sample paradigm. Data from 605 human subjects performing the same tasks on Mechanical Turk were aggregated to characterize "pooled human" object recognition behavior, and data from 33 separate Mechanical Turk subjects were used to characterize individual human subject behavior. Our results show that monkeys learn each new object in a few days, after which they not only match mean human performance but also show a pattern of object confusion that is highly correlated with pooled human confusion patterns and is statistically indistinguishable from that of individual human subjects. Importantly, this shared human and monkey pattern of 3D object confusion is not shared with low-level visual representations (pixels, V1+; models of the retina and primary visual cortex) but is shared with a state-of-the-art computer vision feature representation. Together, these results are consistent with the hypothesis that rhesus monkeys and humans share a common neural shape representation that directly supports object perception.
To date, several mammalian species have shown promise as animal models for studying the neural mechanisms underlying high-level visual processing in humans. In light of this diversity, tight comparisons between nonhuman and human primates are particularly critical for determining how best to use nonhuman primates to further the field's goal of translating knowledge gained from animal models to humans. To the best of our knowledge, this study is the first systematic attempt to compare a high-level visual behavior of humans and macaque monkeys.