Arleo Angelo, Smeraldi Fabrizio, Gerstner Wulfram
Neuroscience Group, SONY Computer Science Laboratory, 75005 Paris, France.
IEEE Trans Neural Netw. 2004 May;15(3):639-52. doi: 10.1109/TNN.2004.826221.
We study spatial learning and navigation for autonomous agents. A state space representation is constructed by unsupervised Hebbian learning during exploration. As a result of learning, a representation of the continuous two-dimensional (2-D) manifold in the high-dimensional input space is found. The representation consists of a population of localized overlapping place fields covering the 2-D space densely and uniformly. This space coding is comparable to the representation provided by hippocampal place cells in rats. Place fields are learned by extracting spatio-temporal properties of the environment from sensory inputs. The visual scene is modeled using the responses of modified Gabor filters placed at the nodes of a sparse Log-polar graph. Visual sensory aliasing is eliminated by taking into account self-motion signals via path integration. This solves the hidden state problem and provides a suitable representation for applying reinforcement learning in continuous space for action selection. A temporal-difference prediction scheme is used to learn sensorimotor mappings to perform goal-oriented navigation. Population vector coding is employed to interpret ensemble neural activity. The model is validated on a mobile Khepera miniature robot.
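As an illustration of the population vector coding mentioned above, the following minimal sketch decodes a 2-D position as the activity-weighted mean of place-field centers. It is not the authors' implementation: the Gaussian tuning curve, the field width of 0.2, the regular 11 x 11 grid of centers on the unit square, and the names `place_cell_activity` and `population_vector` are all assumptions made for the example. In the model itself the fields are learned from sensory input rather than placed on a grid.

```python
import math


def place_cell_activity(pos, center, width=0.2):
    """Response of one place cell at position `pos`.

    Assumption: an isotropic Gaussian tuning curve around the field center.
    """
    dx = pos[0] - center[0]
    dy = pos[1] - center[1]
    return math.exp(-(dx * dx + dy * dy) / (2.0 * width * width))


def population_vector(centers, activities):
    """Decode position as the activity-weighted mean of the field centers."""
    total = sum(activities)
    if total == 0.0:
        raise ValueError("no active cells: position cannot be decoded")
    x = sum(a * c[0] for a, c in zip(activities, centers)) / total
    y = sum(a * c[1] for a, c in zip(activities, centers)) / total
    return (x, y)


def decode(pos, centers):
    """Simulate the ensemble response at `pos` and decode it again."""
    activities = [place_cell_activity(pos, c) for c in centers]
    return population_vector(centers, activities)


# Hypothetical layout: overlapping fields on a regular grid over the unit square.
centers = [(i / 10.0, j / 10.0) for i in range(11) for j in range(11)]

estimate_center = decode((0.5, 0.5), centers)
estimate_offset = decode((0.4, 0.6), centers)
```

Because the fields overlap and cover the space densely, the decoded position stays close to the true one away from the borders. Near the borders the estimate is pulled slightly inward, since no fields lie outside the covered area.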