Reward-optimizing learning using stochastic release plasticity.

Authors

Sun Yuhao, Liao Wantong, Li Jinhao, Zhang Xinche, Wang Guan, Ma Zhiyuan, Song Sen

Affiliations

Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China.

School of Biomedical Engineering, Tsinghua University, Beijing, China.

Publication information

Front Neural Circuits. 2025 Aug 14;19:1618506. doi: 10.3389/fncir.2025.1618506. eCollection 2025.

Abstract

Synaptic plasticity underlies adaptive learning in neural systems, offering a biologically plausible framework for reward-driven learning. However, a question remains: how can plasticity rules achieve robustness and effectiveness comparable to error backpropagation? In this study, we introduce Reward-Optimized Stochastic Release Plasticity (RSRP), a learning framework where synaptic release is modeled as a parameterized distribution. Utilizing natural gradient estimation, we derive a synaptic plasticity learning rule that effectively adapts to maximize reward signals. Our approach achieves competitive performance and demonstrates stability in reinforcement learning, comparable to Proximal Policy Optimization (PPO), while attaining accuracy comparable with error backpropagation in digit classification. Additionally, we identify reward regularization as a key stabilizing mechanism and validate our method in biologically plausible networks. Our findings suggest that RSRP offers a robust and effective plasticity learning rule, especially in a discontinuous reinforcement learning paradigm, with potential implications for both artificial intelligence and experimental neuroscience.
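To make the core idea concrete, the following is a minimal illustrative sketch, not the authors' RSRP rule. It assumes each synaptic weight is drawn from an independent Gaussian with mean `mu` and standard deviation `sigma` (the abstract says only "a parameterized distribution"), uses the closed-form natural gradient for a diagonal Gaussian, and standardizes rewards within each batch as a simple baseline (an assumption of this sketch, not necessarily the paper's reward regularization). The function names, learning rates, and the quadratic toy reward are all hypothetical.

```python
import numpy as np

def natural_gradient_step(mu, sigma, reward_fn, rng,
                          n_samples=32, lr_mu=0.5, lr_sigma=0.1):
    """One reward-driven update of a diagonal Gaussian over weights."""
    # Sample stochastic "releases": w = mu + sigma * eps, eps ~ N(0, 1).
    eps = rng.standard_normal((n_samples, mu.size))
    weights = mu + sigma * eps
    rewards = np.array([reward_fn(w) for w in weights])

    # Assumed baseline: standardize rewards within the batch.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)

    # Natural gradient for a diagonal Gaussian: the score function
    # premultiplied by the inverse Fisher information
    # (sigma^2 for mu, sigma^2 / 2 for sigma).
    nat_mu = sigma * (adv[:, None] * eps).mean(axis=0)
    nat_sigma = sigma * (adv[:, None] * (eps ** 2 - 1.0)).mean(axis=0) / 2.0

    mu = mu + lr_mu * nat_mu
    # Keep sigma strictly positive with a small floor.
    sigma = np.clip(sigma + lr_sigma * nat_sigma, 1e-3, None)
    return mu, sigma

def train(target, steps=300, seed=0):
    """Toy task: reward is the negative squared distance to `target`."""
    rng = np.random.default_rng(seed)
    mu = np.zeros_like(target)
    sigma = np.ones_like(target)
    reward_fn = lambda w: -np.sum((w - target) ** 2)
    for _ in range(steps):
        mu, sigma = natural_gradient_step(mu, sigma, reward_fn, rng)
    return mu, sigma

if __name__ == "__main__":
    target = np.array([1.0, -2.0, 0.5])
    mu, sigma = train(target)
    print("learned mean:", mu)
    print("learned std: ", sigma)
```

The update uses only sampled weights and a scalar reward, so no error signal is backpropagated. That property is what makes rules of this family candidates for biologically plausible learning.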

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4fdf/12390965/045de863df4a/fncir-19-1618506-g0001.jpg
