School of Data Science, City University of Hong Kong, Hong Kong, China.
Institute of Automation, Chinese Academy of Sciences, Beijing, China.
J Am Med Inform Assoc. 2023 Aug 18;30(9):1543-1551. doi: 10.1093/jamia/ocad116.
Long-lasting nonpharmaceutical interventions (NPIs) suppressed COVID-19 transmission but came at a substantial economic cost and elevated the risk of outbreaks of other respiratory infectious diseases (RIDs) following the pandemic. Policymakers need data-driven evidence to guide the relaxation of restrictions through adaptive NPIs that account for the risk of both COVID-19 and other RID outbreaks, as well as the available healthcare resources.
We combined COVID-19 data from the sixth wave in Hong Kong (May 31 to August 28, 2022), 6 years of epidemic data on other RIDs (2014-2019), and healthcare resource data to construct compartmental models that predict the epidemic curves of RIDs after COVID-19-targeted NPIs. We developed a deep reinforcement learning (DRL) model to learn the optimal adaptive NPI strategies that mitigate RID outbreaks after COVID-19-targeted NPIs are lifted, with minimal health and economic cost. Performance was validated by 1000-day simulations starting August 29, 2022. We also extended the model to the Beijing context.
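As a rough illustration of how a compartmental model couples with an adaptive NPI, the sketch below runs a discrete-time SIR model in which a policy function scales down transmission. It is not the authors' model: the SIR structure, every parameter value, the function names, and the threshold rule standing in for the learned DRL policy are all assumptions made for illustration, and the infected count is used as a crude proxy for hospital bed demand.

```python
def simulate_sir(days, beta, gamma, population, initial_infected, policy):
    """Discrete-time SIR model; policy(day, infected) returns NPI stringency in [0, 1]."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    infected_curve = []
    for day in range(days):
        u = policy(day, i)  # 0 = no restrictions, 1 = transmission fully blocked
        new_infections = min(s, beta * (1.0 - u) * s * i / population)
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        infected_curve.append(i)
    return infected_curve


def no_npi(day, infected):
    """Baseline policy: never apply any restriction."""
    return 0.0


def make_threshold_policy(capacity, stringency=0.7, trigger=0.5):
    """Hypothetical rule-based stand-in for a learned policy:
    tighten NPIs whenever infections exceed a fraction of capacity."""
    def policy(day, infected):
        return stringency if infected > trigger * capacity else 0.0
    return policy


# Illustrative values only (assumed, not taken from the study).
POPULATION = 1_000_000
CAPACITY = 50_000  # infected count used as a proxy for bed capacity
DAYS = 1000        # matches the 1000-day simulation horizon in the text

free_curve = simulate_sir(DAYS, 0.3, 0.1, POPULATION, 100, no_npi)
adaptive_curve = simulate_sir(
    DAYS, 0.3, 0.1, POPULATION, 100, make_threshold_policy(CAPACITY)
)

print("peak without NPIs exceeds capacity:", max(free_curve) > CAPACITY)
print("peak with adaptive NPIs exceeds capacity:", max(adaptive_curve) > CAPACITY)
```

A DRL agent would replace the fixed threshold rule with a policy learned from a reward that penalizes both health and economic cost; the simulator loop itself would play the role of the environment.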
In simulations without any NPIs, Hong Kong would experience a major COVID-19 resurgence far exceeding hospital bed capacity. Simulation results showed that the proposed DRL-based adaptive NPIs kept outbreaks of COVID-19 and other RIDs below capacity. The DRL model held the epidemic curve close to full capacity so that herd immunity could be reached in a relatively short period with minimal cost. For Beijing, DRL derived more stringent adaptive NPIs.
DRL is a feasible method for identifying the optimal adaptive NPIs that minimize health and economic cost by facilitating gradual herd immunity to COVID-19 and mitigating other RID outbreaks without overwhelming hospitals. These insights can be extended to other countries and regions.