Mohammad Alali, Mahdi Imani
Department of Electrical and Computer Engineering, Northeastern University.
IEEE Trans Artif Intell. 2025 May;6(5):1217-1232. doi: 10.1109/tai.2024.3515939. Epub 2024 Dec 12.
Hidden Markov Models (HMMs) are a powerful class of dynamical models for representing complex systems that are only partially observed through sensory data. Existing data collection methods for HMMs, typically based on active learning or heuristic approaches, are often inefficient in stochastic domains where data are costly to acquire. This paper introduces a Bayesian lookahead data collection method for inferring HMMs with finite state and parameter spaces. The method optimizes data collection under uncertainty using a belief state that captures the joint distribution of system states and candidate models. Unlike traditional approaches that prioritize short-term gains, the resulting policy accounts for the long-term impact of data collection decisions to improve inference performance over time. We develop a deep reinforcement learning policy that approximates the optimal Bayesian solution by simulating system trajectories offline. This pre-trained policy can be executed in real time, adapting dynamically to new conditions as data are collected. The proposed framework supports a wide range of inference objectives, including point-based, distribution-based, and causal inference. Experimental results across three distinct systems demonstrate significant improvements in inference accuracy and robustness, showing the effectiveness of the approach in uncertain, data-limited environments.
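To make the belief-state construction concrete, the sketch below maintains a joint posterior over (candidate model, hidden state) for an HMM with finite state and parameter spaces, updating it with each observation via standard Bayesian filtering. This is a minimal illustration under assumed toy transition and emission matrices; the function name `belief_update` and all parameter values are hypothetical and do not reproduce the authors' implementation or their lookahead policy.

```python
# A minimal sketch of the joint (model, state) belief the abstract alludes to.
# All names and the toy HMM parameters are illustrative assumptions,
# not the paper's code.
import numpy as np

def belief_update(belief, obs, models):
    """One Bayesian filtering step over the joint (model, state) space.

    belief : (M, S) array, belief[m, s] = P(model m, state s | history)
    obs    : index of the observed symbol
    models : list of (T, E) pairs, where T[s, s'] is the transition
             matrix and E[s, o] the emission matrix of candidate model m
    """
    new_belief = np.empty_like(belief)
    for m, (T, E) in enumerate(models):
        predicted = belief[m] @ T              # predict: propagate state belief
        new_belief[m] = predicted * E[:, obs]  # correct: weight by likelihood
    return new_belief / new_belief.sum()       # normalize over (model, state)

# Toy example: two candidate 2-state models with binary observations.
T1 = np.array([[0.9, 0.1], [0.2, 0.8]]); E1 = np.array([[0.8, 0.2], [0.3, 0.7]])
T2 = np.array([[0.5, 0.5], [0.5, 0.5]]); E2 = np.array([[0.6, 0.4], [0.4, 0.6]])
models = [(T1, E1), (T2, E2)]

belief = np.full((2, 2), 0.25)                 # uniform joint prior
for obs in [0, 0, 1, 0]:                       # a short observation sequence
    belief = belief_update(belief, obs, models)

print("P(model):", belief.sum(axis=1))         # marginal posterior over models
print("P(state):", belief.sum(axis=0))         # marginal posterior over states
```

Marginalizing the joint belief over states yields the posterior over candidate models, while marginalizing over models yields the state estimate; the lookahead policy described in the abstract selects data collection actions to steer this joint belief over a long horizon rather than greedily.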