

Deep Reinforcement Learning Data Collection for Bayesian Inference of Hidden Markov Models.

Authors

Alali Mohammad, Imani Mahdi

Affiliation

Department of Electrical and Computer Engineering, Northeastern University.

Publication

IEEE Trans Artif Intell. 2025 May;6(5):1217-1232. doi: 10.1109/tai.2024.3515939. Epub 2024 Dec 12.

DOI: 10.1109/tai.2024.3515939
PMID: 40313356
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12045110/
Abstract

Hidden Markov Models (HMMs) are a powerful class of dynamical models for representing complex systems that are partially observed through sensory data. Existing data collection methods for HMMs, typically based on active learning or heuristic approaches, face challenges in terms of efficiency in stochastic domains with costly data. This paper introduces a Bayesian lookahead data collection method for inferring HMMs with finite state and parameter spaces. The method optimizes data collection under uncertainty using a belief state that captures the joint distribution of system states and models. Unlike traditional approaches that prioritize short-term gains, this policy accounts for the long-term impact of data collection decisions to improve inference performance over time. We develop a deep reinforcement learning policy that approximates the optimal Bayesian solution by simulating system trajectories offline. This pre-trained policy can be executed in real-time, dynamically adapting to new conditions as data is collected. The proposed framework supports a wide range of inference objectives, including point-based, distribution-based, and causal inference. Experimental results across three distinct systems demonstrate significant improvements in inference accuracy and robustness, showcasing the effectiveness of the approach in uncertain and data-limited environments.
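The belief state described above can be made concrete with a small sketch. This is not the paper's implementation: it is a minimal illustration, under assumed shapes and hypothetical function names, of Bayesian filtering over a joint distribution of candidate models and hidden states, plus a myopic one-step baseline that picks the data-collection action minimizing expected posterior entropy of the model marginal (the paper's deep RL policy instead optimizes this objective over a long horizon).

```python
import numpy as np

def belief_update(belief, transitions, emissions, action, obs):
    """One Bayesian filtering step of the joint (model, state) belief.

    belief      : (M, S) joint probability over M candidate models, S states
    transitions : (M, A, S, S) per-model transition matrices T[m, a, s, s']
    emissions   : (M, S, O) per-model emission matrices E[m, s', o]
    action      : index of the data-collection action taken
    obs         : index of the observation received
    Returns the normalized posterior belief and the evidence p(obs | action).
    """
    M, S = belief.shape
    new_belief = np.zeros_like(belief)
    for m in range(M):
        # Predict: propagate model m's state marginal through its dynamics.
        predicted = belief[m] @ transitions[m, action]        # shape (S,)
        # Correct: reweight by the likelihood of the observation.
        new_belief[m] = predicted * emissions[m, :, obs]
    evidence = new_belief.sum()
    return new_belief / evidence, evidence

def greedy_action(belief, transitions, emissions, n_actions):
    """Myopic baseline: choose the action with the lowest expected
    posterior entropy of the model marginal after one observation."""
    best_a, best_h = 0, np.inf
    for a in range(n_actions):
        expected_entropy = 0.0
        for o in range(emissions.shape[-1]):
            b, p_o = belief_update(belief, transitions, emissions, a, o)
            model_marginal = b.sum(axis=1)
            ent = -np.sum(model_marginal * np.log(model_marginal + 1e-12))
            expected_entropy += p_o * ent
        if expected_entropy < best_h:
            best_a, best_h = a, expected_entropy
    return best_a
```

Because the belief is a finite matrix, each update is exact; the hard part the paper addresses is the sequential decision of *which* action to take, which the myopic rule above only approximates one step ahead.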


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/29e5691c1a38/nihms-2050407-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/858351cb6f01/nihms-2050407-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/e08c9f1ef496/nihms-2050407-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/34f14bb66690/nihms-2050407-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/b7d604c41f06/nihms-2050407-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/d47ce28be140/nihms-2050407-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/793862e54360/nihms-2050407-f0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/19cd630ce49a/nihms-2050407-f0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/7bd3ddb4ed7e/nihms-2050407-f0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/813b8e615726/nihms-2050407-f0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/1b830c752243/nihms-2050407-f0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/80c0b49352d3/nihms-2050407-f0014.jpg

