
Enhancement of Hippocampal Spatial Decoding Using a Dynamic Q-Learning Method With a Relative Reward Using Theta Phase Precession.

Affiliations

Department of Biomedical Engineering, National Yang Ming University, No. 155, Section 2, Linong Street, Taipei 11221, Taiwan.

Department of Mechanical Engineering, National Cheng Kung University, No. 1 University Road, Tainan 70101, Taiwan.

Publication Information

Int J Neural Syst. 2020 Sep;30(9):2050048. doi: 10.1142/S0129065720500483. Epub 2020 Aug 12.

Abstract

Hippocampal place cells and interneurons in mammals have stable place fields and theta phase precession profiles that encode spatial environmental information. Hippocampal CA1 neurons can represent the animal's location and prospective information about the goal location. Reinforcement learning (RL) algorithms such as Q-learning have been used to build navigation models. However, the original Q-learning restricts the reward to the moment the animal arrives at the goal location, leading to unsatisfactory location accuracy and convergence rates. Therefore, we proposed a revised version of the Q-learning algorithm, a relative-reward Q-learning, which assigns the reward function adaptively to improve decoding performance. Firing rate was the input to the neural network of the relative-reward Q-learning and was used to predict the movement direction; phase precession was the input to the reward function used to update the weights of the relative-reward Q-learning. Trajectory predictions from the original and the relative-reward Q-learning were compared by the root mean squared error (RMSE) between the actual and predicted rat trajectories. The relative-reward Q-learning achieved significantly higher prediction accuracy and a faster convergence rate than the original Q-learning for all cell types. Moreover, combining place cells and interneurons with theta phase precession further improved the convergence rate and prediction accuracy. The proposed relative-reward Q-learning algorithm is a fast and more accurate method for trajectory reconstruction and prediction.
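As a rough illustration of the idea described in the abstract, the sketch below runs a tabular Q-learning loop in which the per-step reward is modulated by a surrogate phase-precession signal instead of being delivered only at the goal state. This is a minimal, hypothetical example: the grid size, reward scaling, learning rate, discount factor, and the random stand-in for the phase signal are assumptions made for illustration and are not taken from the paper, which drives a neural network with recorded firing rates rather than a lookup table.

```python
import numpy as np

# Minimal sketch (not the authors' implementation): tabular Q-learning on a
# 5x5 grid where each step's reward is graded by a phase-precession-like
# signal, rather than a fixed reward delivered only at the goal location.

rng = np.random.default_rng(0)

N_STATES = 25            # 5x5 grid of spatial bins (hypothetical)
N_ACTIONS = 4            # up, down, left, right
GOAL = 24                # goal bin (hypothetical)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # illustrative hyperparameters

Q = np.zeros((N_STATES, N_ACTIONS))

def step(state, action):
    """Move one cell on the 5x5 grid and return the next state index."""
    row, col = divmod(state, 5)
    if action == 0:
        row = max(row - 1, 0)
    elif action == 1:
        row = min(row + 1, 4)
    elif action == 2:
        col = max(col - 1, 0)
    else:
        col = min(col + 1, 4)
    return row * 5 + col

def relative_reward(next_state, phase_signal):
    """Hypothetical adaptive reward: a goal bonus plus a graded term driven
    by a phase-precession-like signal, assigned on every step."""
    base = 1.0 if next_state == GOAL else 0.0
    return base + phase_signal

for episode in range(500):
    state = 0
    for _ in range(100):
        # epsilon-greedy action selection
        if rng.random() < EPSILON:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[state]))
        next_state = step(state, action)
        # stand-in for a decoded theta-phase-precession feature (random here)
        phase_signal = 0.1 * rng.random()
        r = relative_reward(next_state, phase_signal)
        # standard Q-learning update, using the adaptive reward
        Q[state, action] += ALPHA * (r + GAMMA * Q[next_state].max() - Q[state, action])
        state = next_state
        if state == GOAL:
            break

def rmse(predicted, actual):
    """Root mean squared error between predicted and actual trajectories."""
    predicted, actual = np.asarray(predicted, float), np.asarray(actual, float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))
```

The rmse helper mirrors the evaluation criterion named in the abstract: decoding quality is summarized by the RMSE between the predicted and actual rat trajectories.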
