Hanson Catherine, Caglar Leyla Roskan, Hanson Stephen José
Rutgers Brain Imaging Center, Newark, NJ, United States.
RUBIC and Psychology Department and Center for Molecular and Behavioral Neuroscience, Rutgers University-Newark, Newark, NJ, United States.
Front Psychol. 2018 Apr 13;9:374. doi: 10.3389/fpsyg.2018.00374. eCollection 2018.
Category learning performance is influenced by both the nature of the category's structure and the way category features are processed during learning. Shepard (1964, 1987) showed that stimuli can have structures with features that are statistically uncorrelated (separable) or statistically correlated (integral) within categories. Humans find it much easier to learn categories having separable features, especially when attention to only a subset of relevant features is required, and harder to learn categories having integral features, which require considering all available features and integrating those relevant to the category rule (Garner, 1974). In contrast to humans, a single-hidden-layer backpropagation (BP) neural network has been shown to learn both separable and integral categories equally easily, independent of the category rule (Kruschke, 1993). This "failure" to replicate human category performance appeared to be strong evidence that connectionist networks were incapable of modeling human attentional bias. We tested the presumed limitations of attentional bias in networks in two ways: (1) by having networks learn categories with exemplars that have high feature complexity, in contrast to the low-dimensional stimuli previously used, and (2) by investigating whether a Deep Learning (DL) network, which has demonstrated human-like performance in many different kinds of tasks (language translation, autonomous driving, etc.), would display human-like attentional bias during category learning. We were able to show a number of interesting results. First, we replicated the failure of BP to differentially process integral and separable category structures when low-dimensional stimuli are used (Garner, 1974; Kruschke, 1993). Second, we show that, using the same low-dimensional stimuli, DL, unlike BP but similar to humans, learns separable category structures more quickly than integral category structures. Third, we show that even BP can exhibit human-like learning differences between integral and separable category structures when high-dimensional stimuli (face exemplars) are used. After visualizing the hidden-unit representations, we conclude that DL appears to extend the initial learning phase in order to develop features, reducing destructive feature competition by incrementally refining feature detectors throughout its later layers until a tipping point (in terms of error) is reached, after which learning proceeds rapidly to asymptote.
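To make the modeling setup concrete, the following is a minimal, hypothetical sketch of the kind of single-hidden-layer BP learner and separable/integral category structures discussed above. The 2-bit stimuli, the XOR-style stand-in for integral structure, and all names (`TinyBP`, `accuracy`, the learning rate, the hidden-layer size) are illustrative assumptions, not the paper's actual materials or architectures.

```python
import math
import random

# Four toy 2-feature stimuli (an assumed low-dimensional stimulus set).
STIMULI = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

def separable_label(x):
    # Separable structure: a single diagnostic feature decides the category.
    return x[0]

def integral_label(x):
    # XOR-like stand-in for integral structure: both features must be
    # combined to recover the category.
    return float(x[0] != x[1])

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyBP:
    """One hidden layer of sigmoid units, trained by plain gradient descent."""

    def __init__(self, n_in=2, n_hid=4, seed=0):
        rng = random.Random(seed)
        # +1 weight per unit for the bias input.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hid)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)]

    def forward(self, x):
        xb = x + [1.0]  # append bias
        h = [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in self.w1]
        o = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, o

    def train_step(self, x, t, lr=0.5):
        h, o = self.forward(x)
        d_out = (o - t) * o * (1.0 - o)  # squared-error output delta
        d_hid = [d_out * self.w2[j] * h[j] * (1.0 - h[j])
                 for j in range(len(h))]
        for j, hj in enumerate(h):       # hidden -> output weights
            self.w2[j] -= lr * d_out * hj
        self.w2[-1] -= lr * d_out        # output bias
        xb = x + [1.0]
        for j in range(len(h)):          # input -> hidden weights
            for i, xi in enumerate(xb):
                self.w1[j][i] -= lr * d_hid[j] * xi

def accuracy(labeler, epochs=4000, seed=0):
    """Train on all stimuli under the given category rule; return accuracy."""
    net = TinyBP(seed=seed)
    for _ in range(epochs):
        for x in STIMULI:
            net.train_step(x, labeler(x))
    hits = sum(round(net.forward(x)[1]) == labeler(x) for x in STIMULI)
    return hits / len(STIMULI)

if __name__ == "__main__":
    print("separable:", accuracy(separable_label))
    print("integral: ", accuracy(integral_label))
```

Under this toy setup the network can master the linearly separable rule; Kruschke's (1993) point, as summarized in the abstract, is that such BP learners show no systematic speed advantage for separable over integral structures on low-dimensional stimuli, unlike humans.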