Liu Liu, Xu Zhifei
College of Business Administration, Capital University of Economics and Business, Beijing, China.
School of Science and Engineering, Chinese University of Hong Kong - Shenzhen, Shenzhen, Guangdong, China.
PLoS One. 2025 May 15;20(5):e0320777. doi: 10.1371/journal.pone.0320777. eCollection 2025.
This research explores the potential of combining Meta Reinforcement Learning (MRL) with Spike-Timing-Dependent Plasticity (STDP) to enhance the performance and adaptability of AI agents in Atari game settings. Our methodology leverages MRL to swiftly adjust agent strategies across a range of games, while STDP fine-tunes synaptic weights based on the relative timing of neuronal spikes, improving learning efficiency and decision-making under changing conditions. A series of experiments was conducted on standard Atari games to compare the hybrid MRL-STDP model against baseline models using traditional reinforcement learning techniques such as Q-learning and Deep Q-Networks. Performance was evaluated on several metrics, including learning speed, adaptability, and cross-game generalization. The results show that the MRL-STDP approach significantly accelerates agents' progress toward competitive performance, yielding a 40% gain in learning efficiency and a 35% increase in adaptability over conventional models.
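The pair-based STDP mechanism referred to in the abstract can be sketched as follows. This is a minimal illustration of the standard exponential STDP rule, not the paper's actual implementation; all parameter values (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`) are illustrative assumptions.

```python
import math

def stdp_delta_w(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Synaptic weight change for one pre/post spike pair.

    dt = t_post - t_pre in milliseconds. When the presynaptic neuron
    fires before the postsynaptic one (dt >= 0), the synapse is
    potentiated; otherwise it is depressed. Parameter values here are
    illustrative assumptions, not taken from the paper.
    """
    if dt >= 0:
        return a_plus * math.exp(-dt / tau_plus)   # potentiation branch
    return -a_minus * math.exp(dt / tau_minus)     # depression branch

# Pre spike 5 ms before post -> positive (potentiating) weight change
print(stdp_delta_w(5.0) > 0)   # True
# Post spike 5 ms before pre -> negative (depressing) weight change
print(stdp_delta_w(-5.0) < 0)  # True
```

In a hybrid setup like the one described, such a local update would adjust the spiking network's synapses within each game, while the meta-learner adapts the agent's policy across games.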