Meier Florian, Dang-Nhu Raphaël, Steger Angelika
Department of Computer Science, ETH Zürich, Zurich, Switzerland.
Front Comput Neurosci. 2020 Feb 18;14:12. doi: 10.3389/fncom.2020.00012. eCollection 2020.
Natural brains perform miraculously well in learning new tasks from a small number of samples, whereas sample-efficient learning is still a major open problem in machine learning. Here, we raise the question of how the neural coding scheme affects sample efficiency, and make first progress on this question by proposing and analyzing a learning algorithm that uses a simple REINFORCE-type plasticity mechanism and does not require any gradients to learn low-dimensional mappings. It harnesses three biologically plausible mechanisms, namely population codes with bell-shaped tuning curves, continuous attractor mechanisms, and probabilistic synapses, to achieve sample-efficient learning. We show both theoretically and by simulations that population codes with broadly tuned neurons lead to high sample efficiency, whereas codes with sharply tuned neurons account for high final precision. Moreover, a dynamic adaptation of the tuning width during learning gives rise to both high sample efficiency and high final precision. We prove a sample-efficiency guarantee for our algorithm that lies within a logarithmic factor of the information-theoretic optimum. Our simulations show that for low-dimensional mappings, our learning algorithm achieves sample efficiency comparable to multi-layer perceptrons trained by gradient descent, although it does not use any gradients. Furthermore, it achieves competitive sample efficiency in low-dimensional reinforcement learning tasks. From a machine learning perspective, these findings may inspire novel approaches to improving sample efficiency. From a neuroscience perspective, they suggest sample efficiency as a previously unstudied functional role of adaptive tuning-curve width.