

Toward a Brain-Inspired System: Deep Recurrent Reinforcement Learning for a Simulated Self-Driving Agent

Authors

Chen Jieneng, Chen Jingye, Zhang Ruiming, Hu Xiaobin

Affiliations

Department of Computer Science, College of Electronics and Information Engineering, Tongji University, Shanghai, China.

School of Computer Science, Fudan University, Shanghai, China.

Publication

Front Neurorobot. 2019 Jun 28;13:40. doi: 10.3389/fnbot.2019.00040. eCollection 2019.

DOI: 10.3389/fnbot.2019.00040
PMID: 31316366
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6611356/
Abstract

An effective way to achieve intelligence is to simulate various intelligent behaviors in the human brain. In recent years, bio-inspired learning methods have emerged, and they are different from the classical mathematical programming principle. From the perspective of brain inspiration, reinforcement learning has gained additional interest in solving decision-making tasks as increasing neuroscientific research demonstrates that significant links exist between reinforcement learning and specific neural substrates. Because of the tremendous research that focuses on human brains and reinforcement learning, scientists have investigated how robots can autonomously tackle complex tasks in the form of making a self-driving agent control in a human-like way. In this study, we propose an end-to-end architecture using novel deep-Q-network architecture in conjunction with a recurrence to resolve the problem in the field of simulated self-driving. The main contribution of this study is that we trained the driving agent using a brain-inspired trial-and-error technique, which was in line with the real world situation. Besides, there are three innovations in the proposed learning network: raw screen outputs are the only information which the driving agent can rely on, a weighted layer that enhances the differences of the lengthy episode, and a modified replay mechanism that overcomes the problem of sparsity and accelerates learning. The proposed network was trained and tested under a third-party OpenAI Gym environment. After training for several episodes, the resulting driving agent performed advanced behaviors in the given scene. We hope that in the future, the proposed brain-inspired learning system would inspire practicable self-driving control solutions.
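The abstract describes training the agent with a brain-inspired trial-and-error technique plus a modified experience-replay mechanism. As a minimal illustrative sketch of that underlying principle only (not the paper's actual deep recurrent Q-network operating on raw screen pixels), the following tabular Q-learning loop with a replay buffer shows how an agent can learn a sequential control task from trial and error; the toy corridor environment and every name in it are assumptions for illustration:

```python
import random
from collections import deque, defaultdict

# Illustrative sketch: Q-learning with an experience-replay buffer, the
# trial-and-error-plus-replay principle the paper builds on. The study itself
# trains a deep recurrent Q-network on raw screen frames; this tabular
# analogue on a toy 1-D corridor only demonstrates the replay idea.

GOAL = 3           # rightmost cell of a 4-cell corridor
ACTIONS = (-1, 1)  # move left or right

def corridor_step(state, action):
    """Toy environment: reach the rightmost cell for reward 1."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def train(episodes=300, gamma=0.9, alpha=0.5, epsilon=0.3, batch=8, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)         # Q[(state, action)]
    replay = deque(maxlen=500)     # experience-replay buffer
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < 200:
            steps += 1
            # epsilon-greedy action selection (trial and error)
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = corridor_step(state, action)
            replay.append((state, action, reward, next_state, done))
            # learn from a random minibatch of past transitions
            for s, a, r, s2, d in rng.sample(list(replay), min(batch, len(replay))):
                target = r if d else r + gamma * max(q[(s2, a2)] for a2 in ACTIONS)
                q[(s, a)] += alpha * (target - q[(s, a)])
            state = next_state
    return q

q = train()
# After training, the greedy policy should move right in every non-goal cell.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

In the paper, the recurrence (an LSTM-style memory) and the weighted layer replace the tabular lookup here, letting the network integrate raw screen frames over a long episode; the replay loop above corresponds to the modified replay mechanism credited with overcoming reward sparsity.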


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db67/6611356/0bbce8f32d1f/fnbot-13-00040-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db67/6611356/8becabbccaff/fnbot-13-00040-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db67/6611356/68c650c323e8/fnbot-13-00040-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db67/6611356/25c67e277810/fnbot-13-00040-g0005.jpg

Similar Articles

1. Toward a Brain-Inspired System: Deep Recurrent Reinforcement Learning for a Simulated Self-Driving Agent.
   Front Neurorobot. 2019 Jun 28;13:40. doi: 10.3389/fnbot.2019.00040. eCollection 2019.
2. Intelligent control of self-driving vehicles based on adaptive sampling supervised actor-critic and human driving experience.
   Math Biosci Eng. 2024 May 24;21(5):6077-6096. doi: 10.3934/mbe.2024267.
3. Fear-Neuro-Inspired Reinforcement Learning for Safe Autonomous Driving.
   IEEE Trans Pattern Anal Mach Intell. 2024 Jan;46(1):267-279. doi: 10.1109/TPAMI.2023.3322426. Epub 2023 Dec 5.
4. Human-level control through deep reinforcement learning.
   Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
5. Research on deep reinforcement learning basketball robot shooting skills improvement based on end to end architecture and multi-modal perception.
   Front Neurorobot. 2023 Oct 13;17:1274543. doi: 10.3389/fnbot.2023.1274543. eCollection 2023.
6. Intelligent inverse treatment planning via deep reinforcement learning, a proof-of-principle study in high dose-rate brachytherapy for cervical cancer.
   Phys Med Biol. 2019 May 29;64(11):115013. doi: 10.1088/1361-6560/ab18bf.
7. An End-to-End Deep Reinforcement Learning-Based Intelligent Agent Capable of Autonomous Exploration in Unknown Environments.
   Sensors (Basel). 2018 Oct 22;18(10):3575. doi: 10.3390/s18103575.
8. A Brain-Inspired Decision-Making Linear Neural Network and Its Application in Automatic Drive.
   Sensors (Basel). 2021 Jan 25;21(3):794. doi: 10.3390/s21030794.
9. Intelligent Decision-Making of Scheduling for Dynamic Permutation Flowshop via Deep Reinforcement Learning.
   Sensors (Basel). 2021 Feb 2;21(3):1019. doi: 10.3390/s21031019.
10. Intelligent Land-Vehicle Model Transfer Trajectory Planning Method Based on Deep Reinforcement Learning.
    Sensors (Basel). 2018 Sep 1;18(9):2905. doi: 10.3390/s18092905.

Cited By

1. Replay in Deep Learning: Current Approaches and Missing Biological Elements.
   Neural Comput. 2021 Oct 12;33(11):2908-2950. doi: 10.1162/neco_a_01433.
2. Decoding Multiple Sound-Categories in the Auditory Cortex by Neural Networks: An fNIRS Study.
   Front Hum Neurosci. 2021 Apr 28;15:636191. doi: 10.3389/fnhum.2021.636191. eCollection 2021.

References Cited in This Article

1. Reinforcement Learning, Fast and Slow.
   Trends Cogn Sci. 2019 May;23(5):408-422. doi: 10.1016/j.tics.2019.02.006. Epub 2019 Apr 16.
2. Neuroscience-Inspired Artificial Intelligence.
   Neuron. 2017 Jul 19;95(2):245-258. doi: 10.1016/j.neuron.2017.06.011.
3. Building machines that learn and think like people.
   Behav Brain Sci. 2017 Jan;40:e253. doi: 10.1017/S0140525X16001837. Epub 2016 Nov 24.
4. China Brain Project: Basic Neuroscience, Brain Diseases, and Brain-Inspired Computing.
   Neuron. 2016 Nov 2;92(3):591-596. doi: 10.1016/j.neuron.2016.10.050.
5. Toward an Integration of Deep Learning and Neuroscience.
   Front Comput Neurosci. 2016 Sep 14;10:94. doi: 10.3389/fncom.2016.00094. eCollection 2016.
6. Reinforcement Learning and Episodic Memory in Humans and Animals: An Integrative Framework.
   Annu Rev Psychol. 2017 Jan 3;68:101-128. doi: 10.1146/annurev-psych-122414-033625. Epub 2016 Sep 2.
7. Human-level control through deep reinforcement learning.
   Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
8. Goals and habits in the brain.
   Neuron. 2013 Oct 16;80(2):312-25. doi: 10.1016/j.neuron.2013.09.007.
9. Reinforcement learning of motor skills with policy gradients.
   Neural Netw. 2008 May;21(4):682-97. doi: 10.1016/j.neunet.2008.02.003. Epub 2008 Apr 26.
10. Long short-term memory.
    Neural Comput. 1997 Nov 15;9(8):1735-80. doi: 10.1162/neco.1997.9.8.1735.