1 Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford , Oxford OX3 7LF , UK.
2 Department of Biostatistics, Vanderbilt University , Nashville, TN 37203 , USA.
Philos Trans R Soc Lond B Biol Sci. 2019 Jul 8;374(1776):20180277. doi: 10.1098/rstb.2018.0277.
The number of all possible epidemics of a given infectious disease that could occur on a given landscape is large for systems of real-world complexity. Furthermore, there is no guarantee that the control actions that are optimal, on average, over all possible epidemics are also best for each possible epidemic. Reinforcement learning (RL) and Monte Carlo control have been used to develop machine-readable context-dependent solutions for complex problems with many possible realizations, ranging from video games to the game of Go. RL could be a valuable tool for generating context-dependent policies for outbreak response, though translating the resulting policies into simple rules that can be read and interpreted by human decision-makers remains a challenge. Here we illustrate the application of RL to the development of context-dependent outbreak response policies to minimize outbreaks of foot-and-mouth disease. We show that control based on the resulting context-dependent policies, which adapt interventions to the specific outbreak, results in smaller outbreaks than static policies. We further illustrate two approaches for translating the complex machine-readable policies into simple heuristics that can be evaluated by human decision-makers. This article is part of the theme issue 'Modelling infectious disease outbreaks in humans, animals and plants: epidemic forecasting and control'. This theme issue is linked with the earlier issue 'Modelling infectious disease outbreaks in humans, animals and plants: approaches and important themes'.
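To make the idea of a context-dependent RL policy concrete, the following is a minimal hypothetical sketch (not the paper's actual model or code): tabular Q-learning on a toy stochastic outbreak where the "context" is the current number of infected farms and the actions are intervention intensities. All names, rates, and costs here are illustrative assumptions; the learned policy maps outbreak size to an action, so the chosen intervention can differ from one epidemic state to another, unlike a static policy.

```python
import random
from collections import defaultdict

# Illustrative toy model only: the state is the number of infected farms
# (capped at N_FARMS) and each action is a control intensity. All
# parameters below are assumptions for demonstration, not from the paper.
N_FARMS = 20                               # assumed landscape size
ACTIONS = [0, 1, 2]                        # 0 = none, 1 = targeted cull, 2 = ring cull
CONTROL_COST = {0: 0.0, 1: 1.0, 2: 3.0}    # assumed per-step cost of each action
BETA, RECOVERY = 0.6, 0.2                  # assumed transmission / removal rates
EFFECT = {0: 1.0, 1: 0.6, 2: 0.3}          # assumed transmission reduction per action

def step(infected, action, rng):
    """One stochastic step: draw new infections and removals under control."""
    susceptible = N_FARMS - infected
    p_inf = min(1.0, BETA * EFFECT[action] * infected / N_FARMS)
    new_inf = sum(rng.random() < p_inf for _ in range(susceptible))
    removed = sum(rng.random() < RECOVERY for _ in range(infected))
    nxt = max(0, min(N_FARMS, infected + new_inf - removed))
    reward = -new_inf - CONTROL_COST[action]  # penalize both spread and control cost
    return nxt, reward

def train(episodes=3000, eps=0.1, alpha=0.1, gamma=0.95, seed=0):
    """Epsilon-greedy tabular Q-learning over many simulated outbreaks."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(ACTIONS))
    for _ in range(episodes):
        s = rng.randint(1, 5)              # each episode starts from a small seeded outbreak
        for _ in range(30):                # finite horizon per episode
            if rng.random() < eps:
                a = rng.choice(ACTIONS)    # explore
            else:
                a = max(ACTIONS, key=lambda x: q[s][x])  # exploit
            s2, r = step(s, a, rng)
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
            if s == 0:                     # outbreak extinguished
                break
    # Greedy policy: action as a function of current outbreak size (the "context").
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in q}

policy = train()
```

The resulting `policy` dictionary is the machine-readable object the abstract refers to: it prescribes a possibly different action for each outbreak state, and reading simple rules off it (e.g. thresholds on outbreak size above which ring culling is preferred) corresponds to the translation-to-heuristics problem the paper discusses.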