Yudistira Novanto, Sumitro Sutiman Bambang, Nahas Alberth, Riama Nelly Florida
Faculty of Computer Science, University of Brawijaya, Indonesia.
Faculty of Mathematics and Natural Science, University of Brawijaya, Indonesia.
Appl Soft Comput. 2021 Sep;109:107469. doi: 10.1016/j.asoc.2021.107469. Epub 2021 May 7.
Identifying the determinant factors that contribute to the prediction calls for multivariate analysis that captures coarse-to-fine contextual information. A preliminary descriptive analysis shows that an environmental factor such as ultraviolet (UV) radiation is one of the essential factors to consider when examining the drivers of the COVID-19 epidemic. Education, government, morphological, health, economic, and behavioral factors also contribute to the growth of COVID-19. Besides descriptive analysis, this research uses multivariate analysis to provide comprehensive explanations of the factors contributing to pandemic dynamics. To obtain rich explanations, visual attribution of an explainable Convolution-LSTM is used to identify the factors that contribute most to the growth of daily COVID-19 cases. Our model consists of a 1D CNN in the first layer, which captures local relationships among variables, followed by LSTM layers, which capture local dependencies over time. It produces the lowest prediction errors among the existing models it was compared with. This permits us to employ gradient-based visual attribution to generate saliency maps over each time dimension and variable. The maps are then used to explain which variables, during which periods of the interval, contribute to a given time-series prediction, and during which time intervals the most vital variables contribute jointly to that prediction. The explanations are useful for stakeholders making decisions during and after pandemics. The explainable Convolution-LSTM code is available here: https://github.com/cbasemaster/time-series-attribution.
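The pipeline described in the abstract (a 1D convolution across variables, an LSTM over time, then a gradient-based saliency map per time step and variable) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation, which is in the repository linked above. Everything here is an assumption made for illustration: the function names, the kernel size, the hidden size, the tanh activation, the random untrained weights, and the toy input. The gradient is approximated with central finite differences rather than backpropagation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_params(n_vars, kernel=2, hidden=3, seed=0):
    """Random, untrained weights (hypothetical sizes, for illustration only)."""
    rng = random.Random(seed)
    n_feat = n_vars - kernel + 1          # conv output width per time step

    def w(n):
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]

    return {
        "kernel": w(kernel),
        # Four gates (input, forget, candidate, output); each hidden unit has
        # weights over [conv features, previous hidden state, bias].
        "gates": [[w(n_feat + hidden + 1) for _ in range(hidden)]
                  for _ in range(4)],
        "out": w(hidden),
    }


def predict(x, p):
    """x[t][v] is variable v at time step t; returns one scalar prediction."""
    k = p["kernel"]
    hidden = len(p["out"])
    h = [0.0] * hidden
    c = [0.0] * hidden
    for row in x:
        # 1D convolution along the variable axis (local relations among variables).
        feat = [math.tanh(sum(k[j] * row[i + j] for j in range(len(k))))
                for i in range(len(row) - len(k) + 1)]
        z = feat + h + [1.0]              # trailing 1.0 acts as the bias input
        pre = [[sum(wi * zi for wi, zi in zip(unit, z)) for unit in gate]
               for gate in p["gates"]]
        i_g = [sigmoid(v) for v in pre[0]]
        f_g = [sigmoid(v) for v in pre[1]]
        g = [math.tanh(v) for v in pre[2]]
        o_g = [sigmoid(v) for v in pre[3]]
        # Standard LSTM cell update (dependencies over time).
        c = [f * cv + i * gv for f, cv, i, gv in zip(f_g, c, i_g, g)]
        h = [o * math.tanh(cv) for o, cv in zip(o_g, c)]
    return sum(wo * hv for wo, hv in zip(p["out"], h))


def saliency(x, p, eps=1e-5):
    """|d prediction / d x[t][v]| for every time step and variable,
    approximated by central finite differences."""
    sal = []
    for t in range(len(x)):
        row_sal = []
        for v in range(len(x[t])):
            up = [r[:] for r in x]
            dn = [r[:] for r in x]
            up[t][v] += eps
            dn[t][v] -= eps
            row_sal.append(abs(predict(up, p) - predict(dn, p)) / (2 * eps))
        sal.append(row_sal)
    return sal


# Toy input: 4 time steps x 3 variables (made-up numbers, not COVID-19 data).
x = [[0.2, 0.5, 0.1],
     [0.3, 0.4, 0.2],
     [0.6, 0.1, 0.3],
     [0.8, 0.2, 0.5]]
params = make_params(n_vars=3)
sal = saliency(x, params)
for t, row in enumerate(sal):
    print(t, [round(s, 4) for s in row])
```

Each entry of `sal` plays the role of one cell of a saliency map: a larger value means the prediction is more sensitive to that variable at that time step. With a trained model and an automatic-differentiation framework, the same map would normally come from a single backward pass instead of repeated forward passes.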