

Deep Reinforcement Learning Data Collection for Bayesian Inference of Hidden Markov Models.

Authors

Alali Mohammad, Imani Mahdi

Affiliation

Department of Electrical and Computer Engineering, Northeastern University.

Publication

IEEE Trans Artif Intell. 2025 May;6(5):1217-1232. doi: 10.1109/tai.2024.3515939. Epub 2024 Dec 12.

DOI: 10.1109/tai.2024.3515939
PMID: 40313356
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12045110/
Abstract

Hidden Markov Models (HMMs) are a powerful class of dynamical models for representing complex systems that are partially observed through sensory data. Existing data collection methods for HMMs, typically based on active learning or heuristic approaches, face challenges in terms of efficiency in stochastic domains with costly data. This paper introduces a Bayesian lookahead data collection method for inferring HMMs with finite state and parameter spaces. The method optimizes data collection under uncertainty using a belief state that captures the joint distribution of system states and models. Unlike traditional approaches that prioritize short-term gains, this policy accounts for the long-term impact of data collection decisions to improve inference performance over time. We develop a deep reinforcement learning policy that approximates the optimal Bayesian solution by simulating system trajectories offline. This pre-trained policy can be executed in real-time, dynamically adapting to new conditions as data is collected. The proposed framework supports a wide range of inference objectives, including point-based, distribution-based, and causal inference. Experimental results across three distinct systems demonstrate significant improvements in inference accuracy and robustness, showcasing the effectiveness of the approach in uncertain and data-limited environments.
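The belief state described above can be made concrete with a small sketch. This is not the paper's implementation: it is a minimal illustration, under assumed shapes and hypothetical function names, of Bayesian filtering over a joint distribution of candidate models and hidden states, plus a myopic one-step baseline that picks the data-collection action minimizing expected posterior entropy of the model marginal (the paper's deep RL policy instead optimizes this objective over a long horizon).

```python
import numpy as np

def belief_update(belief, transitions, emissions, action, obs):
    """One Bayesian filtering step of the joint (model, state) belief.

    belief      : (M, S) joint probability over M candidate models, S states
    transitions : (M, A, S, S) per-model transition matrices T[m, a, s, s']
    emissions   : (M, S, O) per-model emission matrices E[m, s', o]
    action      : index of the data-collection action taken
    obs         : index of the observation received
    Returns the normalized posterior belief and the evidence p(obs | action).
    """
    M, S = belief.shape
    new_belief = np.zeros_like(belief)
    for m in range(M):
        # Predict: propagate model m's state marginal through its dynamics.
        predicted = belief[m] @ transitions[m, action]        # shape (S,)
        # Correct: reweight by the likelihood of the observation.
        new_belief[m] = predicted * emissions[m, :, obs]
    evidence = new_belief.sum()
    return new_belief / evidence, evidence

def greedy_action(belief, transitions, emissions, n_actions):
    """Myopic baseline: choose the action with the lowest expected
    posterior entropy of the model marginal after one observation."""
    best_a, best_h = 0, np.inf
    for a in range(n_actions):
        expected_entropy = 0.0
        for o in range(emissions.shape[-1]):
            b, p_o = belief_update(belief, transitions, emissions, a, o)
            model_marginal = b.sum(axis=1)
            ent = -np.sum(model_marginal * np.log(model_marginal + 1e-12))
            expected_entropy += p_o * ent
        if expected_entropy < best_h:
            best_a, best_h = a, expected_entropy
    return best_a
```

Because the belief is a finite matrix, each update is exact; the hard part the paper addresses is the sequential decision of *which* action to take, which the myopic rule above only approximates one step ahead.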


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/29e5691c1a38/nihms-2050407-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/858351cb6f01/nihms-2050407-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/e08c9f1ef496/nihms-2050407-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/34f14bb66690/nihms-2050407-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/b7d604c41f06/nihms-2050407-f0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/d47ce28be140/nihms-2050407-f0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/793862e54360/nihms-2050407-f0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/19cd630ce49a/nihms-2050407-f0010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/7bd3ddb4ed7e/nihms-2050407-f0011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/813b8e615726/nihms-2050407-f0012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/1b830c752243/nihms-2050407-f0013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6ba0/12045110/80c0b49352d3/nihms-2050407-f0014.jpg

