Zhu George, Lizotte Dan, Hoey Jesse
School of Computer Science, University of Waterloo, 200 University Avenue W., Waterloo, Ontario, Canada N2L 1Z2.
Artif Intell Med. 2014 May;61(1):21-34. doi: 10.1016/j.artmed.2014.04.001. Epub 2014 Apr 13.
To demonstrate the feasibility of using stochastic simulation methods for the solution of a large-scale Markov decision process model of on-line patient admissions scheduling.
The problem of admissions scheduling is modeled as a Markov decision process in which the states represent the numbers of patients using each of a number of resources. We investigate current state-of-the-art real-time planning methods to compute solutions to this Markov decision process. Due to the complexity of the model, traditional model-based planners are limited in scalability, since they require an explicit enumeration of the model dynamics. To overcome this challenge, we apply sample-based planners along with efficient simulation techniques that, given a start state, generate an action on demand while avoiding portions of the model that are irrelevant to that start state. We also propose a novel variant of a popular sample-based planner that is particularly well suited to the elective admissions problem.
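The idea of planning from a simulator rather than from an enumerated transition model can be illustrated with a minimal sketch. It is not the planner or the model used in the study: it assumes a toy ward with a single resource, and every name and number in it (`CAPACITY`, `ACTIONS`, `DISCHARGE_PROB`, `simulate`, `rollout`, `plan`, the reward weights) is hypothetical. The planner shown is plain Monte Carlo rollout evaluation, one of the simplest sample-based schemes, chosen only to show that an action can be produced on demand for a given start state using sampled transitions alone.

```python
import random

# All constants below are illustrative assumptions, not values from the study.
CAPACITY = 10            # hypothetical number of beds in a single resource
ACTIONS = [0, 1, 2, 3]   # hypothetical action: number of patients to admit
DISCHARGE_PROB = 0.3     # hypothetical per-patient discharge probability


def simulate(state, action, rng):
    """Generative model: sample one (next_state, reward) pair.

    The state is the number of occupied beds. No transition matrix is
    built; the planner only ever sees sampled outcomes.
    """
    # Each current patient independently stays with probability 1 - DISCHARGE_PROB.
    staying = sum(1 for _ in range(state) if rng.random() > DISCHARGE_PROB)
    demand = staying + action
    overflow = max(0, demand - CAPACITY)   # admissions that do not fit
    admitted = action - overflow
    next_state = min(CAPACITY, demand)
    reward = float(admitted) - 5.0 * overflow  # assumed reward shape
    return next_state, reward


def rollout(state, depth, rng, gamma=0.95):
    """Discounted return of a uniformly random policy for `depth` steps."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        action = rng.choice(ACTIONS)
        state, reward = simulate(state, action, rng)
        total += discount * reward
        discount *= gamma
    return total


def plan(state, rng, n_samples=200, depth=10, gamma=0.95):
    """Return the action with the best average sampled return from `state`."""
    best_action, best_value = None, float("-inf")
    for action in ACTIONS:
        value = 0.0
        for _ in range(n_samples):
            next_state, reward = simulate(state, action, rng)
            value += reward + gamma * rollout(next_state, depth - 1, rng, gamma)
        value /= n_samples
        if value > best_value:
            best_action, best_value = action, value
    return best_action


if __name__ == "__main__":
    rng = random.Random(0)
    print("suggested admissions with 4 beds occupied:", plan(4, rng))
```

Only states reachable from the queried start state are ever sampled, which is the property the abstract attributes to sample-based planners. In this sketch the cost per decision grows with the number of actions, samples, and rollout depth rather than with the size of the state space.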
Results show that the stochastic simulation methods allow the problem size to be scaled by a factor of almost 10 in the action space, and exponentially in the state space. We have demonstrated our approach on a problem with 81 actions, four specialities and four treatment patterns, and shown that we can generate near-optimal solutions in about 100 s.
Sample-based planners are a viable alternative to state-based planners for large Markov decision process models of elective admissions scheduling.