Memory-Limited Partially Observable Stochastic Control and Its Mean-Field Control Approach.

Authors

Tottori Takehiro, Kobayashi Tetsuya J

Affiliations

Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-8654, Japan.

Institute of Industrial Science, The University of Tokyo, Tokyo 153-8505, Japan.

Publication information

Entropy (Basel). 2022 Nov 3;24(11):1599. doi: 10.3390/e24111599.

DOI:10.3390/e24111599
PMID:36359688
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9689590/
Abstract

Control problems with incomplete information and memory limitation appear in many practical situations. Although partially observable stochastic control (POSC) is a conventional theoretical framework that considers the optimal control problem with incomplete information, it cannot consider memory limitation. Furthermore, POSC cannot be solved in practice except in special cases. In order to address these issues, we propose an alternative theoretical framework, memory-limited POSC (ML-POSC). ML-POSC directly considers memory limitation as well as incomplete information, and it can be solved in practice by employing the technique of mean-field control theory. ML-POSC can generalize the linear-quadratic-Gaussian (LQG) problem to include memory limitation. Because estimation and control are not clearly separated in the LQG problem with memory limitation, the Riccati equation is modified to the partially observable Riccati equation, which improves estimation as well as control. Furthermore, we demonstrate the effectiveness of ML-POSC for a non-LQG problem by comparing it with the local LQG approximation.
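For orientation, the classical Riccati equation that the abstract says is modified can be sketched numerically. The block below is a minimal illustration of the standard, fully observed linear-quadratic case only; it is not the partially observable Riccati equation of ML-POSC, whose form is not given in this abstract. The function names, the explicit Euler scheme, and the scalar test problem are all assumptions made for illustration.

```python
import numpy as np

def solve_riccati(A, B, Q, R, P, T=1.0, steps=10_000):
    """Integrate the classical Riccati ODE backward in time.

    Assumed setting (standard LQ control, not the paper's ML-POSC):
      dynamics  dx = (A x + B u) dt + noise
      cost      integral of (x'Qx + u'Ru) dt, plus x'Px at time T
      equation  -dPsi/dt = Q + A'Psi + Psi A - Psi B R^{-1} B' Psi,  Psi(T) = P
    """
    dt = T / steps
    R_inv = np.linalg.inv(R)
    Psi = np.array(P, dtype=float)  # start from the terminal condition Psi(T) = P
    for _ in range(steps):
        rhs = Q + A.T @ Psi + Psi @ A - Psi @ B @ R_inv @ B.T @ Psi
        Psi = Psi + dt * rhs  # explicit Euler step from t to t - dt
    return Psi  # approximation of Psi(0)

def feedback_gain(Psi, B, R):
    """Gain K of the optimal linear feedback u = -K x, with K = R^{-1} B' Psi."""
    return np.linalg.inv(R) @ B.T @ Psi

# Hypothetical scalar example: A = 0, B = Q = R = 1, P = 0.
# Here the exact solution is Psi(t) = tanh(T - t), which gives a simple check.
A = np.array([[0.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
P = np.array([[0.0]])

Psi0 = solve_riccati(A, B, Q, R, P, T=1.0)
K0 = feedback_gain(Psi0, B, R)
```

In this classical setting the control gain depends only on the Riccati solution, and estimation is handled separately by a filter. According to the abstract, that separation does not hold clearly once memory is limited, which is why the equation has to be modified there.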

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6df0/9689590/875c5c2bf2bd/entropy-24-01599-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6df0/9689590/56676716e034/entropy-24-01599-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6df0/9689590/722cffa78f95/entropy-24-01599-g002.jpg
