Suppr 超能文献


Application of Deep Reinforcement Learning to NS-SHAFT Game Signal Control.

Affiliations

Department of Computer Science and Information Engineering, National Yunlin University of Science and Technology, Douliu 640301, Taiwan.

Intelligence Recognition Industry Service Research Center (IR-IS Research Center), National Yunlin University of Science and Technology, Douliu 640301, Taiwan.

Publication information

Sensors (Basel). 2022 Jul 14;22(14):5265. doi: 10.3390/s22145265.

DOI: 10.3390/s22145265
PMID: 35890943
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9317465/
Abstract

Reinforcement learning (RL), which balances exploration and exploitation, has been applied to games to demonstrate that it can surpass human performance. This paper applies a Deep Q-Network (DQN), which combines reinforcement learning with deep learning, to the real-time action response of the NS-SHAFT game, using Cheat Engine as an API for game information. On a personal computer, we build an experimental learning environment that automatically captures NS-SHAFT frames and provides them to the DQN, which decides whether to move left, move right, or stay in place; we also survey different parameters such as the sampling frequency, the reward function, and the batch size. The experiments show that these parameter settings have a measurable influence on the DQN's learning performance. Moreover, we use Cheat Engine as the API to game information to locate the relevant values in the NS-SHAFT game, and then read those values to drive the overall experimental platform and compute the reward. Accordingly, we successfully establish a real-time learning environment and real-time game training for the NS-SHAFT game.
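The pipeline the abstract describes (captured frame → Q-network → one of three actions, trained by epsilon-greedy exploration and minibatch replay) can be sketched as follows. This is only an illustrative toy, not the authors' implementation: a linear Q-function stands in for their deep network, the frame is a flattened feature vector rather than a screen capture, and every class name and hyperparameter value here is an assumption.

```python
import random
from collections import deque

import numpy as np

ACTIONS = ["left", "stay", "right"]  # the three actions the agent chooses among


class TinyDQNAgent:
    """Minimal epsilon-greedy Q-learning agent with a replay buffer.

    Hypothetical sketch: a linear Q-function replaces the paper's deep
    network, and states are plain vectors instead of game frames.
    """

    def __init__(self, state_dim, n_actions=3, epsilon=0.1, lr=0.01,
                 gamma=0.99, batch_size=32, buffer_size=10_000, seed=0):
        rng = np.random.default_rng(seed)
        # One row of weights per action: Q(s, a) = W[a] . s
        self.W = rng.normal(scale=0.01, size=(n_actions, state_dim))
        self.epsilon = epsilon          # exploration probability
        self.lr = lr                    # learning rate
        self.gamma = gamma              # discount factor
        self.batch_size = batch_size    # minibatch size surveyed in the paper
        self.buffer = deque(maxlen=buffer_size)
        self.rng = random.Random(seed)

    def q_values(self, state):
        return self.W @ state

    def act(self, state):
        # Exploration vs. exploitation: random action with probability epsilon,
        # otherwise the action with the highest predicted Q-value.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(ACTIONS))
        return int(np.argmax(self.q_values(state)))

    def remember(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def replay(self):
        # One temporal-difference update on a random minibatch.
        if len(self.buffer) < self.batch_size:
            return
        for s, a, r, s_next, done in self.rng.sample(list(self.buffer),
                                                     self.batch_size):
            target = r if done else r + self.gamma * np.max(self.q_values(s_next))
            td_error = target - self.q_values(s)[a]
            self.W[a] += self.lr * td_error * s
```

In the paper's setup the reward at each step is computed from game-state values read out of process memory via Cheat Engine; in this sketch the caller would simply pass that reward to `remember` before each `replay` step.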


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/3744383ef554/sensors-22-05265-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/5affd50dcee6/sensors-22-05265-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/29223d9e2d4e/sensors-22-05265-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/f5132aef3ec1/sensors-22-05265-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/a88e3f9184e0/sensors-22-05265-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/e5ef1beba5d4/sensors-22-05265-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/272fd0fb9bb4/sensors-22-05265-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/2e96d3f81740/sensors-22-05265-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/25275e8caf12/sensors-22-05265-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/b546009c75fc/sensors-22-05265-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/2308d1149f60/sensors-22-05265-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/e161f681fd4b/sensors-22-05265-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/558a5203bf63/sensors-22-05265-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/f0b4cffb9eaa/sensors-22-05265-g014.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/867c7ff071e0/sensors-22-05265-g015.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/78241d2cefdd/sensors-22-05265-g016.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/e5d90edda249/sensors-22-05265-g017.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/4e118b8d9939/sensors-22-05265-g018.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/a1d40955d573/sensors-22-05265-g019.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/5379db398bfc/sensors-22-05265-g020.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/71c642832408/sensors-22-05265-g021.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dcf1/9317465/045eeda60793/sensors-22-05265-g022a.jpg

Similar articles

1. Application of Deep Reinforcement Learning to NS-SHAFT Game Signal Control. Sensors (Basel). 2022 Jul 14;22(14):5265. doi: 10.3390/s22145265.
2. Multisource Transfer Double DQN Based on Actor Learning. IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2227-2238. doi: 10.1109/TNNLS.2018.2806087.
3. Deep reinforcement learning for automated radiation adaptation in lung cancer. Med Phys. 2017 Dec;44(12):6690-6705. doi: 10.1002/mp.12625. Epub 2017 Nov 14.
4. Pursuit and Evasion Strategy of a Differential Game Based on Deep Reinforcement Learning. Front Bioeng Biotechnol. 2022 Mar 22;10:827408. doi: 10.3389/fbioe.2022.827408. eCollection 2022.
5. Combining STDP and binary networks for reinforcement learning from images and sparse rewards. Neural Netw. 2021 Dec;144:496-506. doi: 10.1016/j.neunet.2021.09.010. Epub 2021 Sep 17.
6. Deep Reinforcement Learning With Modulated Hebbian Plus Q-Network Architecture. IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):2045-2056. doi: 10.1109/TNNLS.2021.3110281. Epub 2022 May 2.
7. Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning. Front Neurorobot. 2019 Dec 10;13:103. doi: 10.3389/fnbot.2019.00103. eCollection 2019.
8. MonkeyKing: Adaptive Parameter Tuning on Big Data Platforms with Deep Reinforcement Learning. Big Data. 2020 Aug;8(4):270-290. doi: 10.1089/big.2019.0123. Epub 2020 Jul 10.
9. Generalized Single-Vehicle-Based Graph Reinforcement Learning for Decision-Making in Autonomous Driving. Sensors (Basel). 2022 Jun 29;22(13):4935. doi: 10.3390/s22134935.
10. Human-level control through deep reinforcement learning. Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.

References cited in this article

1. Intelligent Healthcare System Using Patients Confidential Data Communication in Electrocardiogram Signals. Front Aging Neurosci. 2022 Apr 20;14:870844. doi: 10.3389/fnagi.2022.870844. eCollection 2022.
2. Application of Machine Learning in Air Hockey Interactive Control System. Sensors (Basel). 2020 Dec 17;20(24):7233. doi: 10.3390/s20247233.
3. Mastering the game of Go with deep neural networks and tree search. Nature. 2016 Jan 28;529(7587):484-9. doi: 10.1038/nature16961.
4. Human-level control through deep reinforcement learning. Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
5. Deep learning in neural networks: an overview. Neural Netw. 2015 Jan;61:85-117. doi: 10.1016/j.neunet.2014.09.003. Epub 2014 Oct 13.