Machi Dustin, Bhattacharya Parantapa, Hoops Stefan, Chen Jiangzhuo, Mortveit Henning, Venkatramanan Srinivasan, Lewis Bryan, Wilson Mandy, Fadikar Arindam, Maiden Tom, Barrett Christopher L, Marathe Madhav V
medRxiv. 2021 Feb 26:2021.02.23.21252325. doi: 10.1101/2021.02.23.21252325.
The COVID-19 global outbreak represents the most significant epidemic event since the 1918 influenza pandemic. Simulations have played a crucial role in supporting COVID-19 planning and response efforts. Developing scalable workflows that give policymakers quick responses to important questions about logistics, resource allocation, epidemic forecasts and intervention analysis remains a challenging computational problem. In this work, we present scalable, high performance computing-enabled workflows for COVID-19 pandemic planning and response. The scalability of our methodology allows us to run fine-grained simulations daily, and to generate county-level forecasts and other counterfactual analyses for each of the 50 states (and DC), covering 3140 counties across the USA. To meet this objective, our workflows run on a hybrid cloud/cluster system that combines local and remote cluster computing facilities and uses over 20,000 CPU cores for 6-9 hours every day. Our state (Virginia), our state hospital network, our university, the DOD and the CDC use our models to guide their COVID-19 planning and response efforts. We began executing these pipelines on March 25, 2020, and have delivered and briefed weekly updates to these stakeholders for over 30 weeks without interruption.