Lee Elliot, Lavieri Mariel S, Volk Michael L, Xu Yongcai
The University of Michigan, 1205 Beal Ave., Ann Arbor, MI 48109, USA.
Health Care Manag Sci. 2015 Sep;18(3):363-75. doi: 10.1007/s10729-014-9304-0. Epub 2014 Oct 12.
We investigate the problem faced by a healthcare system that must allocate constrained screening resources across a population at risk of developing a disease. A patient's risk of developing the disease depends on the patient's biomedical dynamics, which the system must learn over time. We design three classes of reinforcement learning policies to address this problem of simultaneously gathering and using information across multiple patients. In a case study based on screening for hepatocellular carcinoma (HCC), we optimize each of the three classes of policies using the indifference-zone method. We build a simulation to gauge the performance of these policies and compare it with current practice. We then demonstrate how the benefits of learning-based screening policies differ across levels of resource scarcity and provide metrics of policy performance.
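The abstract does not specify the three policy classes, so the following Python sketch is not the authors' method. It only illustrates the general tension between gathering and using information when screening capacity is limited. It assumes a simple epsilon-greedy rule, a fixed (static) per-period risk for each patient, and a Beta(1, 1) prior on that risk; the paper itself models risk through biomedical dynamics. All names (`posterior_mean`, `choose_patients`, `simulate`) and parameters are hypothetical.

```python
import random


def posterior_mean(hits, misses):
    """Mean of a Beta(1 + hits, 1 + misses) posterior over a patient's risk."""
    return (1 + hits) / (2 + hits + misses)


def choose_patients(stats, capacity, epsilon, rng):
    """Pick up to `capacity` distinct patients to screen this period.

    Each slot goes to a random not-yet-chosen patient with probability
    `epsilon` (gathering information); otherwise it goes to the patient
    with the highest estimated risk (using information).
    """
    remaining = list(range(len(stats)))
    chosen = []
    for _ in range(min(capacity, len(remaining))):
        if rng.random() < epsilon:
            pick = rng.choice(remaining)
        else:
            pick = max(remaining, key=lambda i: posterior_mean(*stats[i]))
        remaining.remove(pick)
        chosen.append(pick)
    return chosen


def simulate(true_risks, capacity, epsilon, periods, seed=0):
    """Run the toy policy; return total detections and per-patient counts."""
    rng = random.Random(seed)
    stats = [(0, 0) for _ in true_risks]  # (positive, negative) screens
    detections = 0
    for _ in range(periods):
        for i in choose_patients(stats, capacity, epsilon, rng):
            # Assumption: a screen is positive with the patient's true risk.
            positive = int(rng.random() < true_risks[i])
            hits, misses = stats[i]
            stats[i] = (hits + positive, misses + 1 - positive)
            detections += positive
    return detections, stats
```

Calling, for example, `simulate([0.05, 0.3, 0.1, 0.6], capacity=2, epsilon=0.1, periods=50)` returns the number of positive screens and the per-patient screening counts. Varying `capacity` in such a toy gives a rough feel for how the value of learning could change with resource scarcity, which is the kind of comparison the abstract describes.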