Ramchandani Ankit, Fan Chao, Mostafavi Ali
Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77840, USA.
Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX 77840, USA.
IEEE Access. 2020 Aug 28;8:159915-159930. doi: 10.1109/ACCESS.2020.3019989. eCollection 2020.
In this paper, we propose a deep learning model to forecast the range of increase in COVID-19 infected cases in future days, and we present a novel method to compute equidimensional representations of multivariate time series and multivariate spatial time series data. Using this method, the proposed model can both take in a large number of heterogeneous features, such as census data, intra-county mobility, inter-county mobility, social distancing data, and past growth of infection, and learn complex interactions between these features. Using data collected from various sources, we estimate the range of increase in infected cases seven days into the future for all U.S. counties. In addition, we use the model to identify the most influential features for predicting the growth of infection. We also analyze pairs of features and estimate the amount of observed second-order interaction between them. Experiments show that the proposed model obtains satisfactory predictive performance and fairly interpretable feature analysis results; hence, the proposed model could complement the standard epidemiological models for national-level surveillance of pandemics such as COVID-19. The results and findings obtained from the deep learning model could potentially inform policymakers and researchers in devising effective mitigation and response strategies. To fast-track further development and experimentation, the code used to implement the proposed model has been made fully open source.
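The abstract mentions estimating second-order interaction between pairs of features but does not say how this is computed. As a rough illustration only, the sketch below uses a generic mixed finite difference against a baseline input, which is one common way to probe pairwise interaction in a black-box predictor. It is not claimed to be the authors' procedure. The names `pairwise_interaction`, `toy_model`, and `baseline` are hypothetical, and `toy_model` is a stand-in function, not the proposed deep learning model.

```python
from typing import Callable, List, Sequence


def pairwise_interaction(
    f: Callable[[List[float]], float],
    x: Sequence[float],
    i: int,
    j: int,
    baseline: Sequence[float],
) -> float:
    """Mixed finite difference of f at x for features i and j.

    Computes f(x) - f(x, i reset) - f(x, j reset) + f(x, i and j reset),
    where "reset" means replacing a feature with its baseline value.
    The result is zero whenever f is additive in features i and j.
    """

    def with_reset(indices: Sequence[int]) -> List[float]:
        z = list(x)
        for k in indices:
            z[k] = baseline[k]
        return z

    return (
        f(list(x))
        - f(with_reset([i]))
        - f(with_reset([j]))
        + f(with_reset([i, j]))
    )


def toy_model(z: List[float]) -> float:
    # Hypothetical stand-in predictor: features 0 and 1 interact
    # through a product term; feature 2 enters additively.
    return 2.0 * z[0] + 3.0 * z[0] * z[1] + z[2]


x = [1.0, 2.0, 5.0]
baseline = [0.0, 0.0, 0.0]

# Features 0 and 1 share a product term, so this is nonzero.
inter_01 = pairwise_interaction(toy_model, x, 0, 1, baseline)
# Features 0 and 2 contribute additively, so this is zero.
inter_02 = pairwise_interaction(toy_model, x, 0, 2, baseline)
```

With a real model, `f` would wrap the trained predictor's output for one class or range bin, and the choice of baseline (for example, feature means) is an assumption that affects the estimate.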