• 文献检索
  • 文档翻译
  • 深度研究
  • 学术资讯
  • Suppr Zotero 插件Zotero 插件
  • 邀请有礼
  • 套餐&价格
  • 历史记录
应用&插件
Suppr Zotero 插件Zotero 插件浏览器插件Mac 客户端Windows 客户端微信小程序
定价
高级版会员购买积分包购买API积分包
服务
文献检索文档翻译深度研究API 文档MCP 服务
关于我们
关于 Suppr公司介绍联系我们用户协议隐私条款
关注我们

Suppr 超能文献

核心技术专利:CN118964589B侵权必究
粤ICP备2023148730 号-1Suppr @ 2026

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验

优先考虑基于点的部分可观测马尔可夫决策过程(POMDP)求解器。

Prioritizing point-based POMDP solvers.

作者信息

Shani Guy, Brafman Ronen I, Shimony Solomon Eyal

机构信息

Microsoft Research, Redmond, WA 98052 USA.

出版信息

IEEE Trans Syst Man Cybern B Cybern. 2008 Dec;38(6):1592-605. doi: 10.1109/TSMCB.2008.928222.

DOI:10.1109/TSMCB.2008.928222
PMID:19022729
Abstract

Recent scaling up of partially observable Markov decision process (POMDP) solvers toward realistic applications is largely due to point-based methods that quickly converge to an approximate solution for medium-sized domains. These algorithms compute a value function for a finite reachable set of belief points, using backup operations. Point-based algorithms differ on the selection of the set of belief points and on the order by which backup operations are executed on the selected belief points. We first show how current algorithms execute a large number of backups that can be removed without reducing the quality of the value function. We demonstrate that the ordering of backup operations on a predefined set of belief points is important. In the simpler domain of MDP solvers, prioritizing the order of equivalent backup operations on states is known to speed up convergence. We generalize the notion of prioritized backups to the POMDP framework, showing how existing algorithms can be improved by prioritizing backups. We also present a new algorithm, which is the prioritized value iteration, and show empirically that it outperforms current point-based algorithms. Finally, a new empirical evaluation measure (in addition to the standard runtime comparison), which is based on the number of atomic operations and the number of belief points, is proposed in order to provide more accurate benchmark comparisons.

摘要

近期,部分可观测马尔可夫决策过程(POMDP)求解器在实际应用中的扩展,很大程度上归功于基于点的方法,这些方法能快速收敛到中等规模领域的近似解。这些算法使用备份操作,为有限的可达信念点集计算价值函数。基于点的算法在信念点集的选择以及对所选信念点执行备份操作的顺序上存在差异。我们首先展示当前算法如何执行大量可在不降低价值函数质量的情况下移除的备份操作。我们证明在预定义的信念点集上备份操作的顺序很重要。在更简单的MDP求解器领域,已知对状态上等效备份操作的顺序进行优先级排序可加快收敛速度。我们将优先级备份的概念推广到POMDP框架,展示如何通过对备份操作进行优先级排序来改进现有算法。我们还提出了一种新算法,即优先级值迭代,并通过实验表明它优于当前基于点的算法。最后,为了提供更准确的基准比较,提出了一种新的实证评估指标(除了标准的运行时比较),该指标基于原子操作的数量和信念点的数量。

相似文献

1
Prioritizing point-based POMDP solvers.优先考虑基于点的部分可观测马尔可夫决策过程(POMDP)求解器。
IEEE Trans Syst Man Cybern B Cybern. 2008 Dec;38(6):1592-605. doi: 10.1109/TSMCB.2008.928222.
2
Evaluating point-based POMDP solvers on multicore machines.在多核机器上评估基于点的部分可观测马尔可夫决策过程(POMDP)求解器。
IEEE Trans Syst Man Cybern B Cybern. 2010 Aug;40(4):1062-74. doi: 10.1109/TSMCB.2009.2034015. Epub 2009 Nov 13.
3
Partially observable Markov decision processes and performance sensitivity analysis.部分可观测马尔可夫决策过程与性能灵敏度分析。
IEEE Trans Syst Man Cybern B Cybern. 2008 Dec;38(6):1645-51. doi: 10.1109/TSMCB.2008.927711.
4
Task-based decomposition of factored POMDPs.基于任务的因子化 POMDP 分解。
IEEE Trans Cybern. 2014 Feb;44(2):208-16. doi: 10.1109/TCYB.2013.2252009.
5
Value-directed human behavior analysis from video using partially observable Markov decision processes.使用部分可观测马尔可夫决策过程从视频中进行价值导向的人类行为分析。
IEEE Trans Pattern Anal Mach Intell. 2007 Jul;29(7):1118-32. doi: 10.1109/TPAMI.2007.1145.
6
An approach to solve group-decision-making problems with ordinal interval numbers.一种解决带有序数区间数的群体决策问题的方法。
IEEE Trans Syst Man Cybern B Cybern. 2010 Oct;40(5):1413-23. doi: 10.1109/TSMCB.2009.2039477. Epub 2010 Feb 17.
7
Approximate matching of digital point sets using a novel angular tree.使用新型角树对数字点集进行近似匹配。
IEEE Trans Pattern Anal Mach Intell. 2009 May;31(5):769-82. doi: 10.1109/TPAMI.2007.70812.
8
Optimal combination of nested clusters by a greedy approximation algorithm.通过贪婪近似算法实现嵌套聚类的最优组合。
IEEE Trans Pattern Anal Mach Intell. 2009 Nov;31(11):2083-7. doi: 10.1109/TPAMI.2009.75.
9
Optimal decision rule with class-selective rejection and performance constraints.具有类别选择性拒绝和性能约束的最优决策规则。
IEEE Trans Pattern Anal Mach Intell. 2009 Nov;31(11):2073-82. doi: 10.1109/TPAMI.2008.239.
10
Statistical instance-based pruning in ensembles of independent classifiers.独立分类器集成中的基于统计实例的剪枝
IEEE Trans Pattern Anal Mach Intell. 2009 Feb;31(2):364-9. doi: 10.1109/TPAMI.2008.204.