Projective simulation with generalization

Authors

Melnikov Alexey A, Makmal Adi, Dunjko Vedran, Briegel Hans J

Affiliations

Institute for Theoretical Physics, University of Innsbruck, Technikerstraße 21a, 6020, Innsbruck, Austria.

Institute for Quantum Optics and Quantum Information, Austrian Academy of Sciences, Technikerstraße 21a, 6020, Innsbruck, Austria.

Publication information

Sci Rep. 2017 Oct 31;7(1):14430. doi: 10.1038/s41598-017-14740-y.

DOI: 10.1038/s41598-017-14740-y
PMID: 29089575
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5663920/
Abstract

The ability to generalize is an important feature of any intelligent agent, not only because it may allow the agent to cope with large amounts of data, but also because in some environments an agent with no generalization capabilities cannot learn. In this work we outline several criteria for generalization and present a dynamic and autonomous machinery that enables projective simulation agents to generalize meaningfully. Projective simulation, a novel, physical approach to artificial intelligence, was recently shown to perform well in standard reinforcement learning problems, with applications in advanced robotics as well as quantum experiments. Both the basic projective simulation model and the presented generalization machinery are based on very simple principles. This allows us to provide a full analytical treatment of the agent's performance and to illustrate the benefit the agent gains by generalizing. Specifically, we show that already in basic (but extreme) environments, learning without generalization may be impossible, and we demonstrate how the presented generalization machinery enables the projective simulation agent to learn.
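
The claim that learning can be impossible without generalization can be illustrated with a small simulation. The sketch below is not the authors' implementation. It assumes a simplified projective-simulation-style agent: a network of clips connected by weighted edges (h-values, all starting at 1), a random walk from the percept clip to an action clip with hopping probabilities proportional to the h-values, and a fixed increment on every traversed edge after a reward, with no forgetting. The environment, the names `PSAgent`, `run`, `ACTIONS`, `WILDCARD`, and the eager creation of wildcard clips are illustrative assumptions. Each percept is a (direction, color) pair whose color never repeats, and only the direction determines the rewarded action.

```python
import random

ACTIONS = ("go_left", "go_right")  # hypothetical action names
WILDCARD = "#"                     # stands for "any value" in a percept category


class PSAgent:
    """Simplified projective-simulation-style agent (illustrative sketch only)."""

    def __init__(self, generalize, reward_increment=1.0, rng=None):
        self.generalize = generalize
        self.reward_increment = reward_increment
        self.rng = rng or random.Random()
        self.h = {}            # h[clip][target] -> edge weight, initialised to 1
        self._last_path = []   # edges traversed in the most recent walk

    def _add_clip(self, clip):
        """Create a clip with edges to all actions and, if enabled, to wildcard clips."""
        if clip in self.h:
            return
        targets = list(ACTIONS)
        if self.generalize:
            direction, color = clip
            # Strictly more general clips that this clip matches.
            parents = {(direction, WILDCARD), (WILDCARD, color), (WILDCARD, WILDCARD)}
            parents.discard(clip)
            for parent in sorted(parents):  # sorted for reproducible edge order
                self._add_clip(parent)
                targets.append(parent)
        self.h[clip] = {target: 1.0 for target in targets}

    def act(self, percept):
        """Random walk from the percept clip until an action clip is reached."""
        self._add_clip(percept)
        path, clip = [], percept
        while clip not in ACTIONS:
            targets = list(self.h[clip])
            weights = [self.h[clip][t] for t in targets]
            nxt = self.rng.choices(targets, weights=weights)[0]
            path.append((clip, nxt))
            clip = nxt
        self._last_path = path
        return clip

    def learn(self, reward):
        """Strengthen every edge used in the last walk if it led to a reward."""
        if reward:
            for src, dst in self._last_path:
                self.h[src][dst] += self.reward_increment


def run(generalize, steps=20000, seed=0):
    """Return the success rate over the second half of the run."""
    rng = random.Random(seed)
    agent = PSAgent(generalize, rng=rng)
    successes = 0
    for step in range(steps):
        direction = rng.choice(("left", "right"))
        percept = (direction, f"color{step}")  # the color is never seen twice
        action = agent.act(percept)
        reward = action == "go_" + direction
        agent.learn(reward)
        if step >= steps // 2:
            successes += reward
    return successes / (steps - steps // 2)


if __name__ == "__main__":
    print("without generalization:", run(generalize=False))
    print("with generalization:   ", run(generalize=True))
```

Without wildcard clips every percept is new, so the agent always acts from untrained edges and stays at chance level. With wildcard clips, the clip that ignores color accumulates reward across percepts, so the success rate rises above chance. In this simplified sketch it does not approach 1, because part of each walk still passes through untrained edges.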

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/d26660e1952d/41598_2017_14740_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/b547952e45df/41598_2017_14740_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/07e5fba1d8a9/41598_2017_14740_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/2c210dbae0fe/41598_2017_14740_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/67ce0a38ab7a/41598_2017_14740_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/fb2a700e6a21/41598_2017_14740_Fig6_HTML.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/5ed930fe2cf7/41598_2017_14740_Fig7_HTML.jpg
Figure 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/154d271930be/41598_2017_14740_Fig8_HTML.jpg
Figure 9: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df16/5663920/ad85e76d3900/41598_2017_14740_Fig9_HTML.jpg

Similar articles

1
Projective simulation with generalization.
Sci Rep. 2017 Oct 31;7(1):14430. doi: 10.1038/s41598-017-14740-y.
2
Projective simulation for artificial intelligence.
Sci Rep. 2012;2:400. doi: 10.1038/srep00400. Epub 2012 May 15.
3
Generalization Enhancement of Visual Reinforcement Learning through Internal States.
Sensors (Basel). 2024 Jul 12;24(14):4513. doi: 10.3390/s24144513.
4
On the convergence of projective-simulation-based reinforcement learning in Markov decision processes.
Quantum Mach Intell. 2020;2(2):13. doi: 10.1007/s42484-020-00023-9. Epub 2020 Nov 5.
5
Stochastic abstract policies: generalizing knowledge to improve reinforcement learning.
IEEE Trans Cybern. 2015 Jan;45(1):77-88. doi: 10.1109/TCYB.2014.2319733. Epub 2014 May 13.
6
Adaptive quantum computation in changing environments using projective simulation.
Sci Rep. 2015 Aug 11;5:12874. doi: 10.1038/srep12874.
7
Experimental quantum speed-up in reinforcement learning agents.
Nature. 2021 Mar;591(7849):229-233. doi: 10.1038/s41586-021-03242-7. Epub 2021 Mar 10.
8
Reinforcement Learning for Improving Agent Design.
Artif Life. 2019 Fall;25(4):352-365. doi: 10.1162/artl_a_00301. Epub 2019 Nov 7.
9
MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning.
IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3461-3475. doi: 10.1109/TPAMI.2022.3190471. Epub 2023 Feb 3.
10
Compensating for Sensing Failures via Delegation in Human-AI Hybrid Systems.
Sensors (Basel). 2023 Mar 24;23(7):3409. doi: 10.3390/s23073409.

Cited by

1
Free Energy Projective Simulation (FEPS): Active inference with interpretability.
PLoS One. 2025 Sep 4;20(9):e0331047. doi: 10.1371/journal.pone.0331047. eCollection 2025.
2
Elucidating simulated equivalence responding through dynamic visualization of structural connectivity and relational density.
Front Artif Intell. 2025 Aug 5;8:1618678. doi: 10.3389/frai.2025.1618678. eCollection 2025.
3
How a Minimal Learning Agent can Infer the Existence of Unobserved Variables in a Complex Environment.
Minds Mach (Dordr). 2023;33(1):185-219. doi: 10.1007/s11023-022-09619-5. Epub 2022 Dec 29.
4
Skill Learning by Autonomous Robotic Playing Using Active Learning and Exploratory Behavior Composition.
Front Robot AI. 2020 Apr 3;7:42. doi: 10.3389/frobt.2020.00042. eCollection 2020.
5
Development of swarm behavior in artificial learning agents that adapt to different foraging environments.
PLoS One. 2020 Dec 18;15(12):e0243628. doi: 10.1371/journal.pone.0243628. eCollection 2020.
6
On the convergence of projective-simulation-based reinforcement learning in Markov decision processes.
Quantum Mach Intell. 2020;2(2):13. doi: 10.1007/s42484-020-00023-9. Epub 2020 Nov 5.
7
A Stochastic Process Model for Free Agency under Indeterminism.
Dialectica (Bern). 2018 Jun;72(2):219-252. doi: 10.1111/1746-8361.12222. Epub 2018 Aug 24.
8
Active learning machine learns to create new quantum experiments.
Proc Natl Acad Sci U S A. 2018 Feb 6;115(6):1221-1226. doi: 10.1073/pnas.1714936115. Epub 2018 Jan 18.
9
Coherent controlization using superconducting qubits.
Sci Rep. 2015 Dec 15;5:18036. doi: 10.1038/srep18036.
10
Adaptive quantum computation in changing environments using projective simulation.
Sci Rep. 2015 Aug 11;5:12874. doi: 10.1038/srep12874.

References

1
Active learning machine learns to create new quantum experiments.
Proc Natl Acad Sci U S A. 2018 Feb 6;115(6):1221-1226. doi: 10.1073/pnas.1714936115. Epub 2018 Jan 18.
2
Quantum-Enhanced Machine Learning.
Phys Rev Lett. 2016 Sep 23;117(13):130501. doi: 10.1103/PhysRevLett.117.130501. Epub 2016 Sep 20.
3
Coherent controlization using superconducting qubits.
Sci Rep. 2015 Dec 15;5:18036. doi: 10.1038/srep18036.
4
Adaptive quantum computation in changing environments using projective simulation.
Sci Rep. 2015 Aug 11;5:12874. doi: 10.1038/srep12874.
5
Human-level control through deep reinforcement learning.
Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
6
Hierarchical reinforcement learning and decision making.
Curr Opin Neurobiol. 2012 Dec;22(6):956-62. doi: 10.1016/j.conb.2012.05.008. Epub 2012 Jun 11.
7
Projective simulation for artificial intelligence.
Sci Rep. 2012;2:400. doi: 10.1038/srep00400. Epub 2012 May 15.